CI/CD for IaC Guide (V2 - Enterprise)
A Reference Architecture by Ditah Kumbong, ZSoftly Technologies Inc.
December 2025
This document is provided for informational purposes only. It represents ZSoftly Technologies Inc.'s current product offerings and practices as of the date of publication, which are subject to change without notice.
Reference Implementation
All patterns, scripts, and configurations referenced in this whitepaper are available in our open-source reference repository:
github.com/zsoftly/iac-cicd-reference
This repository is designed to be AI-friendly: provide it to your preferred AI assistant, along with your organization's context, to generate customized CI/CD pipelines.
Infrastructure as Code (IaC) enables organizations to manage cloud resources with the same rigor as application code. However, implementing production-grade CI/CD pipelines for IaC presents unique challenges: credential management, multi-account deployments, state consistency, and approval workflows.
This whitepaper presents a comprehensive reference architecture that addresses these challenges while delivering measurable business outcomes.
| Outcome | How We Deliver It |
|---|---|
| Scale Revenue | Ship infrastructure changes faster with automated pipelines |
| Reduce Costs | Eliminate wasted pipeline runs, optimize credential management |
| Minimize Risk | Zero stored secrets, mandatory approvals, state locking |
| Pattern | Benefit |
|---|---|
| OU-Based Account Model | PLT OU (runner) + WKL OU (environments) |
| Two-StackSet Deployment | Automatic role provisioning based on OU membership |
| Role Chaining | Runner assumes deploy roles cross-account |
| Skip Feature Branches | Eliminate 60-80% of wasteful pipeline runs |
| Shared Modules | No boilerplate, faster reviews, consistency |
| Terraform State Backend | Versioned, locked, encrypted state management |
All implementation details are available at:
github.com/zsoftly/iac-cicd-reference
| Directory | Contents |
|---|---|
| docs/ | Authentication, pipeline rules, OU conventions |
| scripts/ | OIDC setup, role assumption, rollback utilities |
| cloudformation/stacksets/plt/ | StackSet templates for PLT OU (runner role) |
| cloudformation/stacksets/wkl/ | StackSet templates for WKL OU (deploy roles) |
| terraform/ | File naming conventions and patterns |
| ansible/ | Roles, playbooks, inventory structure |
| .github/, gitlab-ci/, jenkins/ | Platform-specific guidance |
Manual infrastructure changes create bottlenecks. Teams wait for approvals, operators make mistakes, and deployments queue up.
Before: 2-week infrastructure change cycle
After: Same-day deployments with automated validation

Impact: More features reach customers faster, driving revenue growth.
Traditional CI/CD pipelines run on every commit, including work-in-progress feature branches that will never be deployed.
The Problem:
| Event | Traditional | Our Approach |
|---|---|---|
| Feature branch push | Runs pipeline | Skipped |
| PR/MR opened | Runs pipeline | Runs pipeline |
| Push to main | Runs pipeline | Runs pipeline |
Savings: Skip 60-80% of pipeline runs by only building when code is ready for review.
Reference: See docs/pipeline-rules.md for trigger configuration.
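The trigger table above can be read as a small decision function. The sketch below is illustrative only: the `Event` enum, the function name, and the assumption that the default branch is called `main` are not taken from the reference repository.

```python
from enum import Enum


class Event(Enum):
    PUSH = "push"
    PULL_REQUEST = "pull_request"  # PR on GitHub, MR on GitLab


def should_run_pipeline(event: Event, branch: str, default_branch: str = "main") -> bool:
    """Return True when the trigger table says the pipeline should run."""
    if event is Event.PULL_REQUEST:
        # Opening or updating a PR/MR always runs the pipeline.
        return True
    # Plain pushes run only on the default branch; feature branch pushes are skipped.
    return event is Event.PUSH and branch == default_branch
```

On a real platform the same rule would be expressed in the platform's own trigger syntax rather than in application code.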
Long-lived credentials stored in CI/CD platforms are a leading source of cloud security breaches. Our architecture removes them from the pipeline entirely.
Risk Reduction:
| Risk | Traditional | Our Approach |
|---|---|---|
| Credential exposure | Static keys in CI/CD | Zero stored secrets (OIDC) |
| Over-privileged access | Single admin key | Role chaining (minimal → full) |
| Unauthorized changes | Anyone can deploy | Mandatory approvals for prod |
| State corruption | No locking | DynamoDB locking + versioning |
| Disaster recovery | Manual rebuild | State rollback scripts |
Reference: See docs/authentication.md and scripts/95-rollback.sh.
Every team writing infrastructure from scratch creates inconsistency, duplicates effort, and multiplies review burden. Shared modules solve this.
The Problem with Copy-Paste Infrastructure:
| Issue | Impact |
|---|---|
| Duplicate code | Every team writes the same VPC, IAM, S3 patterns |
| Inconsistent configs | Different teams use different defaults |
| Review overhead | Reviewers check the same patterns repeatedly |
| Drift over time | No single source of truth |
The Shared Module Strategy:
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#64748B', 'background': '#fff'}}}%%
flowchart LR
TEAM1[Team A] --> MOD[(Shared<br/>Modules<br/>v1.2.0)]
TEAM2[Team B] --> MOD
TEAM3[Team C] --> MOD
MOD --> AWS[Consistent<br/>Infrastructure]
style TEAM1 fill:#64748B,stroke:#475569,color:#fff
style TEAM2 fill:#64748B,stroke:#475569,color:#fff
style TEAM3 fill:#64748B,stroke:#475569,color:#fff
style MOD fill:#3B82F6,stroke:#2563EB,color:#fff
style AWS fill:#059669,stroke:#047857,color:#fff
Benefits of Versioned Modules:
| Benefit | How |
|---|---|
| No boilerplate | Teams consume modules, don't write from scratch |
| Faster reviews | Reviewers trust pinned module versions |
| Guaranteed consistency | Same module = same configuration |
| Safe upgrades | Pin to v1.2.0, upgrade when ready |
| No breaking changes | Teams control when they adopt new versions |
Example Module Versioning:
```hcl
# Team A - stable, pinned
module "vpc" {
  source = "git::https://github.com/org/tf-modules.git//vpc?ref=v1.2.0"
}

# Team B - same module, same consistency
module "vpc" {
  source = "git::https://github.com/org/tf-modules.git//vpc?ref=v1.2.0"
}
```
Applies to All IaC Tools:
| Tool | Reusable Pattern |
|---|---|
| Terraform | Git-sourced modules with version tags |
| Ansible | Roles in collections with version constraints |
| CloudFormation | Nested stacks, Service Catalog products |
Reference: See terraform/ for module conventions and ansible/roles/ for reusable roles.
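A review step can check mechanically that a module source is pinned to a version tag. The following is a minimal sketch, assuming tags follow a `vMAJOR.MINOR.PATCH` pattern; the helper names are hypothetical and not part of the reference repository.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumed tag convention: v1.2.0, v10.0.3, and so on.
SEMVER_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def pinned_ref(source: str) -> Optional[str]:
    """Return the ?ref= value of a git-sourced module, or None if it is absent."""
    url = source[len("git::"):] if source.startswith("git::") else source
    refs = parse_qs(urlsplit(url).query).get("ref", [])
    return refs[0] if refs else None


def is_pinned(source: str) -> bool:
    """True only when the source is pinned to a version tag, not a branch."""
    ref = pinned_ref(source)
    return ref is not None and SEMVER_TAG.match(ref) is not None
```

A source pinned to a branch such as `ref=main`, or one with no `ref` at all, is reported as unpinned.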
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#64748B', 'background': '#fff'}}}%%
flowchart LR
CICD[CI/CD Platform<br/>GitHub / GitLab / Jenkins] --> OIDC[OIDC<br/>Provider]
OIDC --> RUNNER[PLT OU<br/>Runner Account]
RUNNER --> NP[WKL-NPD OU<br/>SBX / DEV / QAT]
RUNNER --> PROD[WKL-PRD OU<br/>STG / PRD / DR]
NP --> S3[(S3 State<br/>+ Versioning)]
PROD --> S3
S3 --> DDB[(DynamoDB<br/>Lock)]
style CICD fill:#64748B,stroke:#475569,color:#fff
style OIDC fill:#3B82F6,stroke:#2563EB,color:#fff
style RUNNER fill:#7C3AED,stroke:#5B21B6,color:#fff
style NP fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style PROD fill:#059669,stroke:#047857,color:#fff
style S3 fill:#D97706,stroke:#B45309,color:#fff
style DDB fill:#D97706,stroke:#B45309,color:#fff
| Principle | Implementation |
|---|---|
| Zero Secrets | OIDC federation, no stored credentials |
| OU-Based Isolation | PLT OU for runners, WKL OU for deploys |
| Automatic Onboarding | StackSets auto-deploy roles to new OU accounts |
| Least Privilege | Runner assumes scoped deploy roles cross-account |
| Consistent Ordering | Numbered prefixes for predictable sorting |
| Platform Agnostic | Patterns work across GitHub, GitLab, Jenkins |
Reference: See README.md for complete architecture overview.
Traditional CI/CD uses static AWS credentials stored in the platform. This creates risk: credentials can leak, don't expire, and provide excessive access.
Our approach uses cross-account role chaining: a runner role in PLT OU assumes deploy roles in WKL accounts.

| Role | Location | Purpose | Permissions |
|---|---|---|---|
| cicd-runner-role | PLT OU | Authenticate via OIDC | sts:AssumeRole to deploy roles |
| cicd-deploy-role | Each WKL Acct | Deploy infrastructure | Full infrastructure access |
Why This Matters:
| Method | Best For | Reference |
|---|---|---|
| OIDC + Role Chain | GitHub Actions, GitLab Premium | scripts/00-setup-oidc-github.sh |
| IMDv2 + Role Chain | Self-hosted EC2 runners | docs/authentication.md |
Reference: See docs/authentication.md for complete pattern documentation.
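The second hop of the role chain is an ordinary `sts assume-role` call made with the runner role's credentials. The sketch below only builds the AWS CLI argument list; the account ID and session name are placeholders, and the one-hour duration reflects the AWS limit on role-chained sessions rather than a setting from the repository.

```python
def deploy_role_arn(account_id: str, role_name: str = "cicd-deploy-role") -> str:
    """Build the ARN of the deploy role in a workload account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_role_command(account_id: str, session_name: str) -> list:
    """Return the AWS CLI arguments for the runner-to-deploy role hop."""
    return [
        "aws", "sts", "assume-role",
        "--role-arn", deploy_role_arn(account_id),
        "--role-session-name", session_name,
        # Role-chained sessions are capped at one hour by AWS.
        "--duration-seconds", "3600",
    ]
```

For example, `assume_role_command("111111111111", "deploy-10-dev")` yields a command that could be passed to `subprocess.run` on a runner that already holds the runner role's temporary credentials.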
A well-structured Organizational Unit (OU) hierarchy enables automated IAM role deployment via CloudFormation StackSets. This approach separates the CI/CD runner account from workload accounts, providing clear security boundaries and automatic onboarding for new accounts.
OU Code Reference:
| Code | Full Name | Purpose |
|---|---|---|
| PLT | Platform | CI/CD runners and shared tooling |
| WKL | Workloads | Application environments (parent) |
| WKL-NPD | Workloads-NonProd | Development and testing |
| WKL-PRD | Workloads-Prod | Production and DR |
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#64748B', 'background': '#fff'}}}%%
flowchart TB
ROOT[Root] --> PLT[PLT OU]
ROOT --> WKL[WKL OU]
PLT --> RUNNER[PLT-Runner<br/>Account]
WKL --> NPD[WKL-NPD OU]
WKL --> PRD[WKL-PRD OU]
NPD --> SBX[SBX Account]
NPD --> DEV[DEV Account]
NPD --> QAT[QAT Account]
PRD --> STG[STG Account]
PRD --> PROD[PRD Account]
PRD --> DR[DR Account]
style ROOT fill:#64748B,stroke:#475569,color:#fff
style PLT fill:#7C3AED,stroke:#5B21B6,color:#fff
style WKL fill:#3B82F6,stroke:#2563EB,color:#fff
style NPD fill:#3B82F6,stroke:#2563EB,color:#fff
style PRD fill:#059669,stroke:#047857,color:#fff
style RUNNER fill:#A78BFA,stroke:#7C3AED,color:#1E1B4B
style SBX fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style DEV fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style QAT fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style STG fill:#6EE7B7,stroke:#059669,color:#064E3B
style PROD fill:#6EE7B7,stroke:#059669,color:#064E3B
style DR fill:#FCD34D,stroke:#D97706,color:#78350F
| OU | Purpose | Accounts |
|---|---|---|
| PLT | CI/CD infrastructure, runner execution | PLT-Runner |
| WKL | Application environments (parent OU) | - |
| WKL-NPD | Development and testing environments | SBX, DEV, QAT |
| WKL-PRD | Production and disaster recovery | STG, PRD, DR |
The runner account in PLT OU assumes roles in WKL accounts. This separation ensures:
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#232F3E', 'lineColor': '#232F3E', 'background': '#fff'}}}%%
flowchart LR
RUNNER[PLT-Runner<br/>cicd-runner-role] --> DEV[DEV Account<br/>cicd-deploy-role]
RUNNER --> QAT[QAT Account<br/>cicd-deploy-role]
RUNNER --> STG[STG Account<br/>cicd-deploy-role]
RUNNER --> PRD[PRD Account<br/>cicd-deploy-role]
style RUNNER fill:#7C3AED,stroke:#5B21B6,color:#fff
style DEV fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style QAT fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style STG fill:#6EE7B7,stroke:#059669,color:#064E3B
style PRD fill:#6EE7B7,stroke:#059669,color:#064E3B
Security Benefits:
| Prefix | Environment | OU | Account | Region | Change Request |
|---|---|---|---|---|---|
| 00 | runner | PLT | PLT | ca-central-1 | No |
| 05 | sbx | WKL-NPD | SBX | ca-central-1 | No |
| 10 | dev | WKL-NPD | DEV | ca-central-1 | No |
| 20 | qat | WKL-NPD | QAT | ca-central-1 | No |
| 40 | stg | WKL-PRD | STG | ca-central-1 | No |
| 70 | prod | WKL-PRD | PRD | ca-central-1 | Yes |
| 90 | dr | WKL-PRD | DR | ca-west-1 | No |
Why Numbered Prefixes?
Alphabetical sorting produces incorrect order: dev, dr, prod, qat, sbx, stg
Numbered prefixes ensure correct order across all tools: 00-runner, 05-sbx, 10-dev, 20-qat, 40-stg, 70-prod, 90-dr
Reference: See docs/conventions.md for complete naming standards.
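The sorting difference is easy to reproduce. This short illustration uses the environment names from the table above; it relies on the prefixes being zero-padded to two digits, which is why plain string sorting is enough.

```python
environments = ["sbx", "dev", "qat", "stg", "prod", "dr"]
prefixed = ["70-prod", "05-sbx", "90-dr", "00-runner", "20-qat", "10-dev", "40-stg"]

# Plain names sort alphabetically, which does not match promotion order.
alphabetical = sorted(environments)

# Zero-padded numeric prefixes make string order equal promotion order.
promotion_order = sorted(prefixed)

print(alphabetical)     # ['dev', 'dr', 'prod', 'qat', 'sbx', 'stg']
print(promotion_order)  # ['00-runner', '05-sbx', '10-dev', '20-qat', '40-stg', '70-prod', '90-dr']
```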
The most impactful optimization: only run pipelines when code is ready for review.

| Stage | Trigger | Purpose |
|---|---|---|
| Validate | Automatic | Format check, linting, syntax validation |
| Plan | Automatic | Generate execution plan, show diff |
| Deploy | Manual | Apply changes to environment |
| Destroy | Manual + Admin | Teardown resources (protected) |
| Environment | OU | On PR/MR | On Main | Requires CR |
|---|---|---|---|---|
| 05-sbx | WKL-NPD | Manual | Manual | No |
| 10-dev | WKL-NPD | Manual | Manual | No |
| 20-qat | WKL-NPD | Blocked | Manual | No |
| 40-stg | WKL-PRD | Blocked | Manual | No |
| 70-prod | WKL-PRD | Blocked | Manual | Yes |
| 90-dr | WKL-PRD | Blocked | Manual | No |
Reference: See docs/pipeline-rules.md for complete trigger configuration.
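The environment matrix above can be encoded as a lookup table that a pipeline gate consults before offering a deploy job. This is a sketch under assumptions: the `Rule` type, the `"pr"`/`"main"` event labels, and the `has_change_request` flag are illustrative names, not the repository's implementation.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    on_pr: str        # "manual" or "blocked"
    on_main: str      # "manual" or "blocked"
    requires_cr: bool


# Values copied from the environment table above.
DEPLOY_RULES = {
    "05-sbx": Rule("manual", "manual", False),
    "10-dev": Rule("manual", "manual", False),
    "20-qat": Rule("blocked", "manual", False),
    "40-stg": Rule("blocked", "manual", False),
    "70-prod": Rule("blocked", "manual", True),
    "90-dr": Rule("blocked", "manual", False),
}


def can_deploy(env: str, event: str, has_change_request: bool = False) -> bool:
    """Return True when a manual deploy may be offered for this env and event."""
    if event not in ("pr", "main"):
        raise ValueError(f"unknown event: {event}")
    rule = DEPLOY_RULES[env]
    mode = rule.on_pr if event == "pr" else rule.on_main
    if mode == "blocked":
        return False
    # Environments that require a change request need one on file.
    return has_change_request or not rule.requires_cr
```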
Foundation resources are deployed via CloudFormation StackSets targeting specific OUs. This architecture uses two StackSets, one for PLT OU (runner account) and one for WKL OU (deployment targets), so that roles are deployed automatically to new accounts.
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#232F3E', 'lineColor': '#232F3E', 'background': '#fff'}}}%%
flowchart TB
MGMT[Management Account] --> SS1[StackSet 1:<br/>PLT Runner Role]
MGMT --> SS2[StackSet 2:<br/>WKL Deploy Roles]
SS1 --> PLT_OU[PLT OU]
PLT_OU --> RUNNER[PLT-Runner Account<br/>cicd-runner-role]
SS2 --> WKL_OU[WKL OU]
WKL_OU --> NPD[WKL-NPD OU]
WKL_OU --> PRD[WKL-PRD OU]
NPD --> DEV[DEV: cicd-deploy-role]
NPD --> QAT[QAT: cicd-deploy-role]
PRD --> STG[STG: cicd-deploy-role]
PRD --> PROD[PRD: cicd-deploy-role]
style MGMT fill:#64748B,stroke:#475569,color:#fff
style SS1 fill:#7C3AED,stroke:#5B21B6,color:#fff
style SS2 fill:#3B82F6,stroke:#2563EB,color:#fff
style PLT_OU fill:#A78BFA,stroke:#7C3AED,color:#1E1B4B
style WKL_OU fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style NPD fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style PRD fill:#6EE7B7,stroke:#059669,color:#064E3B
style RUNNER fill:#A78BFA,stroke:#7C3AED,color:#1E1B4B
style DEV fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style QAT fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style STG fill:#6EE7B7,stroke:#059669,color:#064E3B
style PROD fill:#6EE7B7,stroke:#059669,color:#064E3B
| StackSet | Target OU | Purpose | Role Created |
|---|---|---|---|
| PLT Runner StackSet | PLT OU | Creates runner execution role with OIDC trust | cicd-runner-role |
| WKL Deploy StackSet | WKL OU | Creates deploy roles trusting runner role ARN | cicd-deploy-role |
How It Works:
The PLT Runner StackSet creates cicd-runner-role in the runner account, and the WKL Deploy StackSet creates a cicd-deploy-role in each workload account that trusts the runner role.
Automatic Account Onboarding:
When a new account is added to WKL-NPD or WKL-PRD OU, the WKL Deploy StackSet automatically deploys cicd-deploy-role to the new account.
| Template | Target OU | Purpose | Resources Created |
|---|---|---|---|
| plt/00-oidc-provider-github.yaml | PLT | GitHub OIDC federation | IAM OIDC Provider |
| plt/10-iam-runner-role.yaml | PLT | Runner execution role | cicd-runner-role |
| wkl/15-iam-deploy-role.yaml | WKL | Deployment target roles | cicd-deploy-role (per account) |
| wkl/20-terraform-state-backend.yaml | WKL | State management | S3 bucket, DynamoDB, KMS key |
Runner Role (PLT OU):
```yaml
AssumeRolePolicyDocument:
  Statement:
    - Effect: Allow
      Principal:
        Federated: !Sub arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com
      Action: sts:AssumeRoleWithWebIdentity
      Condition:
        StringEquals:
          token.actions.githubusercontent.com:aud: sts.amazonaws.com
        StringLike:
          token.actions.githubusercontent.com:sub: repo:org/repo:*
```
Deploy Role (WKL OU):
```yaml
AssumeRolePolicyDocument:
  Statement:
    - Effect: Allow
      Principal:
        AWS: !Sub arn:aws:iam::${RunnerAccountId}:role/cicd-runner-role
      Action: sts:AssumeRole
```
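To see what the runner role's trust conditions accept, the sketch below evaluates the `aud` and `sub` checks against a decoded token's claims. It is an approximation for illustration: `fnmatchcase` stands in for IAM's `StringLike` (IAM supports only `*` and `?`, while `fnmatch` also interprets `[...]`), and the claim values are examples of GitHub's `repo:OWNER/REPO:ref:...` subject format.

```python
from fnmatch import fnmatchcase


def trust_allows(
    claims: dict,
    sub_pattern: str = "repo:org/repo:*",
    audience: str = "sts.amazonaws.com",
) -> bool:
    """Approximate the StringEquals (aud) and StringLike (sub) conditions."""
    aud_ok = claims.get("aud") == audience
    # fnmatchcase is case-sensitive, like IAM string conditions.
    sub_ok = fnmatchcase(claims.get("sub", ""), sub_pattern)
    return aud_ok and sub_ok
```

Because the pattern ends in `*`, any branch, tag, or pull request in `org/repo` is accepted; a tighter pattern such as `repo:org/repo:ref:refs/heads/main` would restrict the role to the main branch.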
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#232F3E', 'lineColor': '#232F3E', 'background': '#fff'}}}%%
flowchart LR
A[1. Deploy OIDC Provider<br/>to PLT OU] --> B[2. Deploy Runner Role<br/>to PLT OU]
B --> C[3. Deploy Deploy Roles<br/>to WKL OU]
C --> D[4. Deploy State Backend<br/>to WKL OU]
style A fill:#3B82F6,stroke:#1E40AF,color:#fff
style B fill:#7C3AED,stroke:#5B21B6,color:#fff
style C fill:#F59E0B,stroke:#B45309,color:#fff
style D fill:#10B981,stroke:#047857,color:#fff
| Mode | Use Case |
|---|---|
| SERVICE_MANAGED | Automatic deployment to all accounts in target OU (recommended) |
| SELF_MANAGED | Manual deployment to specific account list |
Reference: See cloudformation/stacksets/ for templates and scripts/deploy-foundation.md for commands.
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#232F3E', 'lineColor': '#232F3E', 'background': '#fff'}}}%%
flowchart LR
TF[Terraform] --> S3[(S3 State)]
TF --> DDB[(DynamoDB Lock)]
S3 --> VER[Versioning]
S3 --> ENC[Encryption]
style TF fill:#7B42BC,stroke:#5C32A3,color:#fff
style S3 fill:#232F3E,stroke:#232F3E,color:#fff
style DDB fill:#3B82F6,stroke:#1E40AF,color:#fff
style VER fill:#10B981,stroke:#047857,color:#fff
style ENC fill:#10B981,stroke:#047857,color:#fff
| Feature | Implementation | Benefit |
|---|---|---|
| Locking | DynamoDB table | Prevent concurrent modifications |
| Versioning | S3 versioning | Rollback to previous state |
| Encryption | S3 SSE or KMS | Data protection at rest |
| Cross-Region | S3 replication | Disaster recovery (ca-central-1 ↔ ca-west-1) |
| TTL Cleanup | DynamoDB TTL | Auto-expire stale locks |
Each WKL account has its own state bucket, deployed via StackSet:
```text
# WKL-NPD OU accounts (S3 bucket names must be lowercase)
s3://org-tfstate-sbx-ca-central-1/
└── org/repo/05-sbx/terraform.tfstate
s3://org-tfstate-dev-ca-central-1/
└── org/repo/10-dev/terraform.tfstate
s3://org-tfstate-qat-ca-central-1/
└── org/repo/20-qat/terraform.tfstate
# WKL-PRD OU accounts
s3://org-tfstate-stg-ca-central-1/
└── org/repo/40-stg/terraform.tfstate
s3://org-tfstate-prd-ca-central-1/
└── org/repo/70-prod/terraform.tfstate
s3://org-tfstate-dr-ca-west-1/
└── org/repo/90-dr/terraform.tfstate
```
Reference: See cloudformation/stacksets/wkl/20-terraform-state-backend.yaml for backend template.
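The bucket and key layout follows a regular pattern, so a pipeline can derive both from the environment it is deploying. This is a minimal sketch assuming the naming shown above; the function name is hypothetical, and the account code is lowercased because S3 bucket names cannot contain uppercase letters.

```python
def state_location(org: str, repo: str, env_dir: str, account: str,
                   region: str = "ca-central-1") -> tuple:
    """Return (bucket, key) for an environment's Terraform state."""
    # S3 bucket names must be lowercase, so normalize the account code.
    bucket = f"{org}-tfstate-{account}-{region}".lower()
    key = f"{org}/{repo}/{env_dir}/terraform.tfstate"
    return bucket, key
```

For the DR account the region argument would be `ca-west-1`, matching the environment table.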
Every CI/CD run that downloads dependencies from scratch wastes time and resources:
| Task | Without Cache | With Cache |
|---|---|---|
| Download Terraform providers | 30-60s | 2-5s |
| Install Ansible collections | 20-40s | 1-3s |
| Fetch Python/Node packages | 45-90s | 3-8s |
| Pull container base images | 60-120s | 5-10s |
Impact: A 5-minute pipeline becomes 1-2 minutes with proper caching.
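Summing the ranges in the table gives a sense of how much download time is at stake per run. The figures below are the table's own estimates, not measurements; the variable names are illustrative.

```python
# (low, high) seconds per task, copied from the table above.
WITHOUT_CACHE = {
    "terraform providers": (30, 60),
    "ansible collections": (20, 40),
    "python/node packages": (45, 90),
    "container base images": (60, 120),
}
WITH_CACHE = {
    "terraform providers": (2, 5),
    "ansible collections": (1, 3),
    "python/node packages": (3, 8),
    "container base images": (5, 10),
}


def total(ranges: dict) -> tuple:
    """Sum the low and high ends of every task's range."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


uncached = total(WITHOUT_CACHE)  # (155, 310) seconds
cached = total(WITH_CACHE)       # (11, 26) seconds
```

A pipeline that performs all four tasks therefore spends roughly 2.5 to 5 minutes on downloads without a cache and under half a minute with one.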
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#64748B', 'background': '#fff'}}}%%
flowchart LR
RUNNER[CI/CD Runner] --> CACHE[(Artifact Cache<br/>S3 / R2)]
CACHE --> TF[Terraform<br/>Providers]
CACHE --> ANS[Ansible<br/>Collections]
CACHE --> PKG[Package<br/>Dependencies]
CACHE --> IMG[Container<br/>Layers]
style RUNNER fill:#7C3AED,stroke:#5B21B6,color:#fff
style CACHE fill:#F59E0B,stroke:#B45309,color:#fff
style TF fill:#7B42BC,stroke:#5C32A3,color:#fff
style ANS fill:#EE0000,stroke:#CC0000,color:#fff
style PKG fill:#3B82F6,stroke:#2563EB,color:#fff
style IMG fill:#0DB7ED,stroke:#0B93BD,color:#fff
Deploy a shared artifact bucket in the PLT account for all pipelines:
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#232F3E', 'lineColor': '#232F3E', 'background': '#fff'}}}%%
flowchart TB
PLT[PLT Account] --> BUCKET[(cicd-artifacts<br/>S3 Bucket)]
BUCKET --> PROV[terraform-providers/]
BUCKET --> COLL[ansible-collections/]
BUCKET --> DEPS[dependencies/]
BUCKET --> PLANS[terraform-plans/]
style PLT fill:#7C3AED,stroke:#5B21B6,color:#fff
style BUCKET fill:#F59E0B,stroke:#B45309,color:#fff
style PROV fill:#7B42BC,stroke:#5C32A3,color:#fff
style COLL fill:#EE0000,stroke:#CC0000,color:#fff
style DEPS fill:#3B82F6,stroke:#2563EB,color:#fff
style PLANS fill:#10B981,stroke:#047857,color:#fff
| Artifact Type | Path / Key Pattern | TTL | Why Cache |
|---|---|---|---|
| Terraform providers | terraform-providers/{hash}/ | 7 days | Large binaries, rarely change |
| Terraform plugin cache | terraform-plugins/{os}-{arch}/ | 7 days | Provider binaries per platform |
| Ansible collections | ansible-collections/{hash}/ | 3 days | Galaxy downloads are slow |
| Python packages | pip-cache/{requirements-hash}/ | 3 days | pip install overhead |
| Node modules | node-modules/{lockfile-hash}/ | 3 days | npm/yarn install overhead |
| Go modules | go-mod/{go.sum-hash}/ | 7 days | go mod download overhead |
| Container layers | docker-layers/{image-hash}/ | 1 day | Base image pull overhead |
| Terraform plans | terraform-plans/{run-id}/ | 1 day | Share plan between jobs |
Critical: Caches must expire to receive security updates.
```yaml
# S3 Lifecycle Rules for artifact bucket
LifecycleConfiguration:
  Rules:
    # Expire provider cache after 7 days
    - Id: ExpireProviderCache
      Status: Enabled
      Prefix: terraform-providers/
      ExpirationInDays: 7
    # Expire dependency caches after 3 days
    - Id: ExpireDependencyCache
      Status: Enabled
      Prefix: dependencies/
      ExpirationInDays: 3
    # Expire plan artifacts after 1 day
    - Id: ExpirePlanArtifacts
      Status: Enabled
      Prefix: terraform-plans/
      ExpirationInDays: 1
    # Clean up incomplete uploads
    - Id: AbortIncompleteUploads
      Status: Enabled
      AbortIncompleteMultipartUpload:
        DaysAfterInitiation: 1
```
Why Short TTLs:
| Concern | Solution |
|---|---|
| Security patches | 3-7 day TTL ensures updates within a week |
| Vulnerability fixes | Short TTL forces fresh downloads regularly |
| Cache poisoning | Hash-based keys + expiration limits blast radius |
| Storage costs | Automatic cleanup prevents unbounded growth |
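The expiry policy can be modeled as a prefix-to-TTL lookup, which is useful when a script needs to decide whether a cached object is still trustworthy before S3 has removed it. The sketch below mirrors the lifecycle prefixes shown earlier; it is an approximation, since S3 applies lifecycle expiration on its own schedule rather than at the exact moment the TTL elapses.

```python
from datetime import datetime, timedelta, timezone

# Prefix -> TTL in days, mirroring the lifecycle rules above.
TTL_DAYS = {
    "terraform-providers/": 7,
    "dependencies/": 3,
    "terraform-plans/": 1,
}


def is_expired(key: str, created: datetime, now: datetime) -> bool:
    """True when the object is older than the TTL for its key prefix."""
    for prefix, days in TTL_DAYS.items():
        if key.startswith(prefix):
            return now - created >= timedelta(days=days)
    # Keys outside the known prefixes have no TTL in this sketch.
    return False
```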
| Provider | Best For | Pros | Cons |
|---|---|---|---|
| AWS S3 | AWS-native pipelines | Native IAM, same region as infra | Egress costs |
| Cloudflare R2 | Multi-cloud, cost-sensitive | Zero egress fees, global edge | Separate auth needed |
| GitHub Cache | GitHub Actions only | Built-in, no setup | 10GB limit, repo-scoped |
| GitLab Cache | GitLab CI only | Built-in, no setup | Runner-scoped |
GitHub Actions:
```yaml
- name: Cache Terraform providers
  uses: actions/cache@v4
  with:
    path: ~/.terraform.d/plugin-cache
    key: terraform-${{ runner.os }}-${{ hashFiles('**/.terraform.lock.hcl') }}
    restore-keys: |
      terraform-${{ runner.os }}-
- name: Cache Ansible collections
  uses: actions/cache@v4
  with:
    path: ~/.ansible/collections
    key: ansible-${{ hashFiles('**/requirements.yml') }}
    restore-keys: |
      ansible-
```
GitLab CI:
```yaml
.cache-terraform: &cache-terraform
  cache:
    key: terraform-${CI_COMMIT_REF_SLUG}
    paths:
      - .terraform/providers/
    policy: pull-push
    when: on_success
.cache-ansible: &cache-ansible
  cache:
    key: ansible-${CI_COMMIT_REF_SLUG}
    paths:
      - .ansible/collections/
    policy: pull-push
```
S3 Backend Cache (All Platforms):
```bash
# Download cache from S3 (a cache miss is not an error)
CACHE_KEY="terraform-providers-$(sha256sum .terraform.lock.hcl | cut -d' ' -f1)"
# Ensure the plugin cache directory exists so the later tar steps cannot fail
mkdir -p ~/.terraform.d/plugin-cache
aws s3 cp "s3://${ARTIFACT_BUCKET}/${CACHE_KEY}.tar.gz" /tmp/cache.tar.gz || true
if [[ -f /tmp/cache.tar.gz ]]; then
  tar -xzf /tmp/cache.tar.gz -C ~/.terraform.d/
fi
# After terraform init, upload cache
tar -czf /tmp/cache.tar.gz -C ~/.terraform.d/ plugin-cache/
aws s3 cp /tmp/cache.tar.gz "s3://${ARTIFACT_BUCKET}/${CACHE_KEY}.tar.gz"
```
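The key derivation in the script above is a content hash of the lock file, so the cache is reused only while the pinned provider versions are unchanged. The same idea in Python, as a sketch with hypothetical function names and a placeholder bucket:

```python
import hashlib


def cache_key(lock_file_bytes: bytes, prefix: str = "terraform-providers") -> str:
    """Derive a cache key from the lock file contents (same idea as sha256sum)."""
    return f"{prefix}-{hashlib.sha256(lock_file_bytes).hexdigest()}"


def cache_object_url(bucket: str, key: str) -> str:
    """Build the S3 URL of the cache archive for a given key."""
    return f"s3://{bucket}/{key}.tar.gz"
```

Any change to `.terraform.lock.hcl`, such as a provider upgrade, produces a different key and therefore a fresh download.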
Configure Terraform to use a plugin cache directory:
```hcl
# ~/.terraformrc or environment variable
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"
# Or via environment
# TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
```
Provider Mirror for Air-Gapped:
```hcl
provider_installation {
  filesystem_mirror {
    path    = "/opt/terraform/providers"
    include = ["registry.terraform.io/*/*"]
  }
  direct {
    exclude = ["registry.terraform.io/*/*"]
  }
}
```
```bash
# Install to specific path for caching
ansible-galaxy collection install -r requirements.yml -p ~/.ansible/collections
# Or use environment variable
export ANSIBLE_COLLECTIONS_PATH=~/.ansible/collections
```
Force cache refresh when needed:
```yaml
# GitHub Actions - add date to key for daily refresh
key: terraform-${{ runner.os }}-${{ hashFiles('**/.terraform.lock.hcl') }}-${{ steps.date.outputs.date }}

# Or use workflow dispatch input
on:
  workflow_dispatch:
    inputs:
      refresh_cache:
        description: 'Force cache refresh'
        type: boolean
        default: false
```
Deploy via StackSet to PLT OU:
| Template | Target OU | Purpose |
|---|---|---|
| plt/25-cicd-artifacts.yaml | PLT | Shared artifact/cache bucket |
```yaml
# Key bucket features
Resources:
  ArtifactBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub '${Org}-cicd-artifacts-${AWS::AccountId}'
      LifecycleConfiguration:
        Rules:
          - Id: ExpireCache
            Status: Enabled
            ExpirationInDays: 7
      # Intelligent tiering for cost optimization
      IntelligentTieringConfigurations:
        - Id: CacheOptimization
          Status: Enabled
          Tierings:
            - AccessTier: ARCHIVE_ACCESS
              Days: 90
```
Reference: See cloudformation/stacksets/plt/25-cicd-artifacts.yaml and docs/caching.md for implementation details.
The repository includes a complete Ansible structure following the same numbered naming conventions:
ansible/
├── inventories/
│ ├── 05-sbx/ # Sandbox (WKL-NPD)
│ ├── 10-dev/ # Development (WKL-NPD)
│ ├── 20-qat/ # QA Testing (WKL-NPD)
│ ├── 40-stg/ # Staging (WKL-PRD)
│ ├── 70-prod/ # Production (WKL-PRD)
│ └── 90-dr/ # Disaster Recovery (WKL-PRD)
├── roles/
│ ├── 10-common/ # Base OS configuration
│ └── 20-security/ # Security hardening
└── playbooks/
└── site.yml
| Role | Purpose | Key Tasks |
|---|---|---|
| 10-common | Base configuration | Packages, timezone, NTP, system limits |
| 20-security | Security hardening | SSH hardening, fail2ban, auto-updates |
| Stage | Command | Purpose |
|---|---|---|
| Validate | ansible-lint | Style and syntax checking |
| Test | --check --diff | Dry run with change preview |
| Deploy | ansible-playbook | Apply configuration |
Reference: See ansible/ for complete implementation.
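The three stages map onto concrete command lines that differ only in a few flags. The sketch below builds the argument list for each stage; the function name is hypothetical, and the inventory and playbook paths are assumed from the directory tree shown earlier.

```python
def stage_command(stage: str, env_dir: str,
                  playbook: str = "ansible/playbooks/site.yml") -> list:
    """Return the command for a pipeline stage against one environment."""
    if stage == "validate":
        # Linting does not need an inventory.
        return ["ansible-lint", playbook]
    base = ["ansible-playbook", "-i", f"ansible/inventories/{env_dir}", playbook]
    if stage == "test":
        # Dry run: report what would change without applying it.
        return base + ["--check", "--diff"]
    if stage == "deploy":
        return base
    raise ValueError(f"unknown stage: {stage}")
```

Keeping the test and deploy commands identical apart from `--check --diff` means the dry run previews exactly what the deploy stage will apply.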
The reference architecture supports all major CI/CD platforms with platform-specific guidance.
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#64748B', 'background': '#fff'}}}%%
flowchart LR
PAT[Reference<br/>Patterns] --> GH[GitHub<br/>Actions]
PAT --> GL[GitLab<br/>CI]
PAT --> JK[Jenkins]
style PAT fill:#3B82F6,stroke:#2563EB,color:#fff
style GH fill:#64748B,stroke:#475569,color:#fff
style GL fill:#D97706,stroke:#B45309,color:#fff
style JK fill:#059669,stroke:#047857,color:#fff
| Feature | GitHub Actions | GitLab CI | Jenkins |
|---|---|---|---|
| OIDC Support | Native | Manual | Plugin |
| Trigger Rules | Workflow on: | workflow: rules: | Multibranch filter |
| Manual Approval | Environments | when: manual | input step |
| Concurrency | concurrency: | resource_group: | Lock plugin |
| Reference | .github/workflows/ | gitlab-ci/ | jenkins/ |
Each platform directory contains:
Reference: See .github/, gitlab-ci/, and jenkins/ directories.
The reference repository includes comprehensive troubleshooting documentation:
| Category | Common Issues |
|---|---|
| Authentication | IMDv2 timeout, token expiry, cross-account denied |
| Terraform | State lock deadlock, plan artifacts, init slow |
| Ansible | Vault decrypt, dynamic inventory, SSH timeout |
| Pipeline | Concurrent conflicts, artifact mismatch, timeouts |
Reference: See docs/troubleshooting.md for solutions.
State Rollback Script (scripts/95-rollback.sh):
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#EF4444', 'lineColor': '#EF4444', 'background': '#fff'}}}%%
flowchart LR
A[Acquire Lock] --> B[Backup Current]
B --> C[Restore Previous]
C --> D[Release Lock]
style A fill:#3B82F6,stroke:#1E40AF,color:#fff
style B fill:#F59E0B,stroke:#B45309,color:#fff
style C fill:#EF4444,stroke:#B91C1C,color:#fff
style D fill:#10B981,stroke:#047857,color:#fff
Features:
| Script | Purpose |
|---|---|
| _utils.sh | Shared logging, validation, formatting |
| 00-setup-oidc-github.sh | Manual OIDC provider setup |
| 05-assume-role.sh | Cross-account role assumption |
| 95-rollback.sh | Emergency state rollback |
Reference: See scripts/ for all utilities.
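The four rollback steps (acquire lock, back up current, restore previous, release lock) can be illustrated with an in-memory model. This is a conceptual sketch only, not the logic of scripts/95-rollback.sh: `StateStore` is a hypothetical stand-in for a versioned S3 object plus a lock, and restoring is modeled as writing the previous version again as the newest one.

```python
from contextlib import contextmanager


class StateStore:
    """In-memory stand-in for versioned state plus a lock."""

    def __init__(self, versions):
        self.versions = list(versions)  # oldest first, newest last
        self.backups = []
        self.locked = False


@contextmanager
def state_lock(store: StateStore):
    """Acquire the lock and always release it, even if the rollback fails."""
    if store.locked:
        raise RuntimeError("state is already locked")
    store.locked = True
    try:
        yield
    finally:
        store.locked = False


def rollback(store: StateStore) -> str:
    """Back up the current state, then restore the previous version."""
    if len(store.versions) < 2:
        raise ValueError("no previous version to restore")
    with state_lock(store):
        store.backups.append(store.versions[-1])  # backup current
        previous = store.versions[-2]
        store.versions.append(previous)           # restore as newest version
    return previous
```

Releasing the lock in a `finally` block matters most here: a rollback that fails midway should not leave the state locked for the next operator.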
Step 1: Clone the Reference Repository
git clone https://github.com/zsoftly/iac-cicd-reference.git
Step 2: Set Up OU Structure
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#3B82F6', 'background': '#fff'}}}%%
flowchart LR
A[Create PLT OU] --> B[Create WKL OU]
B --> C[Create WKL-NPD OU]
B --> D[Create WKL-PRD OU]
C --> E[Move accounts to OUs]
D --> E
style A fill:#7C3AED,stroke:#5B21B6,color:#fff
style B fill:#3B82F6,stroke:#2563EB,color:#fff
style C fill:#93C5FD,stroke:#3B82F6,color:#1E3A8A
style D fill:#6EE7B7,stroke:#059669,color:#064E3B
style E fill:#10B981,stroke:#047857,color:#fff
Step 3: Deploy Foundation StackSets
%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#3B82F6', 'lineColor': '#3B82F6', 'background': '#fff'}}}%%
flowchart LR
A[1. OIDC Provider<br/>→ PLT OU] --> B[2. Runner Role<br/>→ PLT OU]
B --> C[3. Deploy Roles<br/>→ WKL OU]
C --> D[4. State Backend<br/>→ WKL OU]
style A fill:#3B82F6,stroke:#1E40AF,color:#fff
style B fill:#7C3AED,stroke:#5B21B6,color:#fff
style C fill:#F59E0B,stroke:#B45309,color:#fff
style D fill:#10B981,stroke:#047857,color:#fff
Follow commands in scripts/deploy-foundation.md. New accounts added to WKL OU automatically receive deploy roles.
Step 4: Configure Your CI/CD Platform
Copy patterns from the appropriate directory:
- .github/workflows/
- gitlab-ci/
- jenkins/
Step 5: Customize for Your Organization
Provide the repository and docs/conventions.md to your AI assistant with your organization's context to generate customized pipelines.
CI/CD Pipeline Specialists
Infrastructure as Code Experts
Canadian-Based Team
| Service | Description |
|---|---|
| Assessment | Review current pipelines, identify improvements |
| Implementation | Deploy reference architecture for your organization |
| Migration | Move from legacy CI/CD to modern patterns |
| Training | Enable your team on IaC CI/CD best practices |
| Support | Ongoing assistance and troubleshooting |
| Outcome | How |
|---|---|
| Scale Revenue | Faster deployments, more features to market |
| Reduce Costs | 60-80% fewer pipeline runs, no credential rotation overhead |
| Minimize Risk | Zero stored secrets, mandatory approvals, state protection |
Start with our open-source reference:
github.com/zsoftly/iac-cicd-reference
Or get expert help:
ZSoftly Technologies Inc.
| Contact | Link |
|---|---|
| Website | zsoftly.com |
| Services | zsoftly.com/services/devops |
| Contact | zsoftly.com/contact |
| Reference Repo | github.com/zsoftly/iac-cicd-reference |
Document Version: 1.0 - December 2025
Copyright (c) 2025 ZSoftly Technologies Inc. All rights reserved.