Bootstrap¶
One-time setup steps that must run before Terraform can manage anything. They create the resources that have to exist outside Terraform (the state bucket, plus the Photon index bucket, which is kept manual deliberately so terraform destroy can't touch it) and apply the bootstrap Argo CD applications.
Read this before tearing down a cluster
The Photon index bucket is intentionally outside Terraform management. Each multilingual planet index costs roughly $500 of compute to rebuild (multiple hours of pipeline time, plus storage). Keeping the bucket out of Terraform means terraform destroy can never accidentally delete it. The steps in this doc create these out-of-band resources by hand once; after that they are left alone.
What lives outside Terraform¶
Three permanent resources that Terraform never touches:
| Resource | Why it's outside Terraform |
|---|---|
| gs://development-485000-tfstate | Hosts the Terraform state itself. Classic chicken-and-egg: Terraform can't manage the bucket that holds its own state |
| gs://development-485000-photon-index | $500 of compute per planet index. Survives any terraform destroy. Re-attachable via terraform import if needed later |
| photon-indexer@…iam.gserviceaccount.com | SA used by the indexer to write to the bucket. Bound to the K8s SA via Workload Identity; manually maintained alongside the bucket |
Everything else (GKE, VPC, Cloud Build, Argo CD, ACM, S3, CloudFront, Cloudflare records) is Terraform-managed.
Prerequisites¶
# 1. gcloud authenticated for both shell + ADC (Application Default Credentials,
# used by Terraform + Google client libraries).
gcloud auth login
gcloud config set project development-485000
gcloud auth application-default login
gcloud auth application-default set-quota-project development-485000
# 2. Required gcloud components — installed once per machine
gcloud components install gke-gcloud-auth-plugin
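Before starting, it can help to confirm that the CLIs used on this page are actually on PATH. The following is a minimal sketch, not part of the repo: the function name missing_tools is hypothetical, and the tool list (gcloud, terraform, kubectl) is simply the set of commands this page invokes.

```shell
# Print the name of every requested CLI that is not on PATH, one per line.
# Nothing here is project-specific; the caller supplies the tool names.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: check the CLIs this page invokes and warn about any gaps.
missing=$(missing_tools gcloud terraform kubectl)
if [ -n "$missing" ]; then
  printf 'Missing required tools:\n%s\n' "$missing" >&2
fi
```

The check only reports; it does not install anything, so it is safe to rerun at any point.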
Step 1 — Terraform state bucket¶
Run once per project. Creates the GCS bucket that holds Terraform state for both infrastructure/gcp/ and infrastructure/aws/.
gcloud storage buckets create gs://development-485000-tfstate \
--project=development-485000 \
--location=asia-east1 \
--uniform-bucket-level-access \
--public-access-prevention
# Object versioning so a bad apply doesn't lose state
gcloud storage buckets update gs://development-485000-tfstate --versioning
# Verify
gcloud storage buckets describe gs://development-485000-tfstate \
--format="value(name,location,versioning.enabled)"
State is stored under two prefixes — terraform/state/ for the GCP stack and aws/state/ for the AWS stack. Splitting by prefix keeps state files separate; sharing the bucket means one set of permissions and one place to look for history.
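To make the prefix layout concrete, here is a small sketch that builds the object URL for each stack's state file. It relies on the GCS backend convention of storing each workspace as <prefix>/<workspace>.tfstate; the helper names (state_prefix, state_object_url) and the stack labels gcp and aws are illustrative, and the default workspace is assumed.

```shell
STATE_BUCKET="development-485000-tfstate"

# Map a stack label to its state prefix, as described above.
state_prefix() {
  case "$1" in
    gcp) printf 'terraform/state' ;;
    aws) printf 'aws/state' ;;
    *)   printf 'unknown stack: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Build the gs:// URL of a workspace's state object.
# Second argument is the workspace name; it defaults to "default".
state_object_url() {
  prefix=$(state_prefix "$1") || return 1
  printf 'gs://%s/%s/%s.tfstate\n' "$STATE_BUCKET" "$prefix" "${2:-default}"
}

state_object_url gcp
state_object_url aws
```

With versioning enabled in Step 1, passing one of these URLs to gcloud storage ls --all-versions should list the historical generations of that state file.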
Step 2 — Photon index bucket + service account¶
Run once per project. Creates the bucket where Photon multilingual indexes are uploaded and the service account the indexer uses to write to it.
PROJECT_ID="development-485000"
BUCKET="${PROJECT_ID}-photon-index"
# 2a. The bucket itself. STANDARD class with NEARLINE transition at 35 days
# for cost savings. NO DELETE LIFECYCLE — planet indexes cost ~$500 of
# compute each to rebuild. Old indexes stay forever; manual cleanup only.
# Public read so the rtuszik/photon-docker image (which has no GCS auth)
# can download index tarballs.
gcloud storage buckets create "gs://${BUCKET}" \
--project="${PROJECT_ID}" \
--location=asia-east1 \
--default-storage-class=STANDARD \
--uniform-bucket-level-access \
--lifecycle-file=<(cat <<'EOF'
{
"lifecycle": {
"rule": [
{ "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"}, "condition": {"age": 35} }
]
}
}
EOF
)
# 2a-bis. Belt-and-suspenders: turn on a bucket retention policy so even an
# accidental `gcloud storage rm` can't blow indexes away. Set to 100 years
# (3155760000s), which is effectively infinite.
# NOTE: the policy is not locked, so it can still be removed later with
# `gcloud storage buckets update --clear-retention-period` if you ever
# genuinely want to delete. Until then it protects against accidental rm
# and against lifecycle delete rules added by mistake.
gcloud storage buckets update "gs://${BUCKET}" --retention-period=3155760000s
# 2b. Relax the iam.allowedPolicyMemberDomains org policy on this project so
# the bucket can grant allUsers read. Scoped to the project — does not
# loosen the org-level constraint anywhere else.
gcloud services enable orgpolicy.googleapis.com --project="${PROJECT_ID}"
gcloud org-policies set-policy --project="${PROJECT_ID}" /dev/stdin <<EOF
name: projects/${PROJECT_ID}/policies/iam.allowedPolicyMemberDomains
spec:
inheritFromParent: false
rules:
- allowAll: true
EOF
# 2c. Public read on the bucket
gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
--member="allUsers" --role="roles/storage.objectViewer"
# 2d. Service account for the indexer
gcloud iam service-accounts create photon-indexer \
--display-name="Photon Indexer" \
--description="Builds and uploads Photon multilingual indexes to GCS" \
--project="${PROJECT_ID}"
# 2e. SA can write to the bucket
gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
--member="serviceAccount:photon-indexer@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
# 2f. Workload Identity binding so the GKE Job pod can impersonate the SA.
# Namespace + KSA name must match the K8s manifests in k8s/.../photon-indexer/.
gcloud iam service-accounts add-iam-policy-binding \
"photon-indexer@${PROJECT_ID}.iam.gserviceaccount.com" \
--project="${PROJECT_ID}" \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:${PROJECT_ID}.svc.id.goog[platform/photon-indexer]"
Bucket name (${PROJECT_ID}-photon-index) is the contract — backend code, K8s manifests, and the local index-build script all hardcode it. Don't rename.
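Because so many places depend on these names, it may be useful to see how all three Step 2 identifiers derive from the project ID alone. The sketch below only builds strings; the function names are hypothetical, and the platform namespace and photon-indexer KSA name are taken from the Workload Identity binding in step 2f.

```shell
# Derive the identifiers Step 2 creates from the project ID alone.
photon_bucket()    { printf '%s-photon-index\n' "$1"; }
photon_sa_email()  { printf 'photon-indexer@%s.iam.gserviceaccount.com\n' "$1"; }

# Workload Identity member format: <project>.svc.id.goog[<namespace>/<ksa>]
photon_wi_member() { printf 'serviceAccount:%s.svc.id.goog[platform/photon-indexer]\n' "$1"; }

PROJECT_ID="development-485000"
photon_bucket "$PROJECT_ID"
photon_sa_email "$PROJECT_ID"
photon_wi_member "$PROJECT_ID"
```

Comparing this output against the values in the K8s manifests is a quick way to catch a namespace or KSA mismatch before the indexer Job fails to authenticate.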
Step 3 — terraform init against the GCS backend¶
After Step 1, Terraform can initialise its remote state. Run in each Terraform directory:
cd infrastructure/gcp && terraform init
cd ../aws && terraform init
Both directories have a backend "gcs" block in backend.tf — terraform init reads it, talks to GCS, and refuses to operate against local state. If you ever see Backend reinitialization required, run terraform init -reconfigure.
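If you want to confirm that a directory really declares the GCS backend before running terraform init, a plain text check is usually enough. This is a rough sketch: has_gcs_backend is a hypothetical helper, and it only looks for the block header in backend.tf rather than parsing HCL.

```shell
# Succeeds if the given file contains a line declaring a "gcs" backend block.
has_gcs_backend() {
  grep -Eq '^[[:space:]]*backend[[:space:]]+"gcs"' "$1"
}

# Run from the repo root; reports on both Terraform directories.
for dir in infrastructure/gcp infrastructure/aws; do
  if [ -f "$dir/backend.tf" ] && has_gcs_backend "$dir/backend.tf"; then
    echo "$dir: gcs backend declared"
  else
    echo "$dir: no gcs backend found in backend.tf" >&2
  fi
done
```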
Step 4 — Terraform apply (cluster + supporting resources)¶
# GCP — GKE cluster, VPC, Argo CD, Cloud Build, Artifact Registry, IAM, ...
cd infrastructure/gcp
terraform plan -out=plan.out
terraform apply plan.out
# AWS — S3 asset bucket, CloudFront distribution, ACM cert (Cloudflare-validated)
cd ../aws
terraform workspace select default
terraform plan -out=plan.out
terraform apply plan.out
Both should report 0 destroys when run against a fresh state (because Step 1 + Step 2 left the existing buckets out of Terraform's view).
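The "0 destroys" expectation can be checked mechanically before applying. The sketch below assumes Terraform's usual human-readable summary line ("Plan: X to add, Y to change, Z to destroy.") and extracts the last number; plan_destroy_count is a hypothetical helper name.

```shell
# Read human-readable plan text on stdin and print the "to destroy" count.
# Prints 0 when there is no "Plan:" summary line (for example "No changes.").
plan_destroy_count() {
  count=$(sed -n 's/^Plan: .* \([0-9][0-9]*\) to destroy\.$/\1/p' | tail -n 1)
  printf '%s\n' "${count:-0}"
}

# Example with a literal summary line. In practice, pipe in the output of:
#   terraform show -no-color plan.out
printf 'Plan: 42 to add, 0 to change, 0 to destroy.\n' | plan_destroy_count
```

A wrapper can then refuse to run terraform apply plan.out unless the count is 0, which is a cheap guard during the first bootstrap.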
Step 5 — GitHub App for ARC self-hosted runners¶
Optional unless you want the self-hosted ARC runner pool in k8s/envs/platform/arc-*/ to come up. Skip this step if you're fine relying on GitHub-hosted runners (or until you exhaust the GitHub free-tier minutes).
The runner pool authenticates to GitHub as a GitHub App, not a personal access token. Apps are scoped, rotatable, and don't tie CI to a single human's GitHub account.
1. Go to https://github.com/organizations/tomoda-labs/settings/apps → New GitHub App.
2. Fill in:
   - GitHub App name: Tomoda ARC Runners
   - Homepage URL: https://github.com/tomoda-labs/devops
   - Webhook URL: disable webhooks (uncheck "Active")
3. Permissions, under Repository permissions:
   - Actions: Read and write
   - Administration: Read and write
   - Checks: Read
   - Metadata: Read (auto-selected)
4. Where can this GitHub App be installed? → Only on this account.
5. Click Create GitHub App.
6. On the next page, note the App ID (top of the page, ~6 digits).
7. Click Generate a private key → downloads tomoda-arc-runners.<date>.private-key.pem. Treat this file like a password.
8. Left sidebar → Install App → install on the tomoda-labs org → choose "All repositories" (or selected repos for tighter scope).
9. After install, the URL is https://github.com/organizations/tomoda-labs/settings/installations/<installation-id>. Note the Installation ID (the numeric path segment).
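Copying the Installation ID out of the URL by hand is easy to get wrong, so a small helper can extract and validate it. This is a sketch only: installation_id_from_url is a hypothetical name, and 12345678 is a made-up ID used for illustration.

```shell
# Print the trailing numeric path segment of an installation URL.
# Fails if the last segment is empty or not purely numeric.
installation_id_from_url() {
  id=${1%/}        # tolerate a trailing slash
  id=${id##*/}     # keep only the last path segment
  case "$id" in
    ''|*[!0-9]*) printf 'not a numeric installation ID: %s\n' "$id" >&2; return 1 ;;
  esac
  printf '%s\n' "$id"
}

# 12345678 is a placeholder, not a real installation.
installation_id_from_url "https://github.com/organizations/tomoda-labs/settings/installations/12345678"
```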
Push all three values (App ID, Installation ID, private key) to GCP Secret Manager (SM):
PROJECT_ID=development-485000
# App ID (numeric, not sensitive)
echo -n "<app-id-from-step-6>" | gcloud secrets create tomoda-github-app-id \
--project="${PROJECT_ID}" --replication-policy=automatic --data-file=-
# Installation ID (numeric, not sensitive)
echo -n "<installation-id-from-step-9>" | gcloud secrets create tomoda-github-app-installation-id \
--project="${PROJECT_ID}" --replication-policy=automatic --data-file=-
# Private key — the PEM file from step 7. Send the WHOLE FILE including the
# BEGIN/END markers.
gcloud secrets create tomoda-github-app-private-key \
--project="${PROJECT_ID}" --replication-policy=automatic \
--data-file=tomoda-arc-runners.<date>.private-key.pem
# Verify
gcloud secrets list --project="${PROJECT_ID}" --filter="name~tomoda-github-app"
# Expected: tomoda-github-app-id, tomoda-github-app-installation-id, tomoda-github-app-private-key
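Since the private key must be uploaded whole, including the BEGIN/END markers, a quick structural check on the PEM file before running gcloud secrets create can catch a truncated download or a partial copy-paste. The helper below is a hypothetical sketch; it checks only the first and last non-empty lines and does not validate the key material itself.

```shell
# Succeeds if the file's first non-empty line is a BEGIN ... PRIVATE KEY marker
# and its last non-empty line is an END ... PRIVATE KEY marker.
pem_looks_complete() {
  first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
  last=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1)
  case "$first" in
    '-----BEGIN '*'PRIVATE KEY-----') ;;
    *) return 1 ;;
  esac
  case "$last" in
    '-----END '*'PRIVATE KEY-----') ;;
    *) return 1 ;;
  esac
}

# Hypothetical usage before the upload (substitute the real file name):
#   pem_looks_complete path/to/private-key.pem || echo "PEM looks truncated" >&2
```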
After uploading, delete the local PEM file; the only authoritative copy now lives in GCP SM. Run whichever line matches your OS:
shred -u tomoda-arc-runners.<date>.private-key.pem # Linux
rm -P tomoda-arc-runners.<date>.private-key.pem # macOS (recent releases accept -P but treat it as a plain rm)
If you rotate the App's private key in the GitHub UI later, push the new PEM via gcloud secrets versions add tomoda-github-app-private-key --data-file=…, then restart the ARC controller and listener pods so they pick up the new key without waiting for the External Secrets Operator (ESO) 1h refresh:
kubectl rollout restart deploy/arc-controller-gha-rs-controller -n arc-systems
kubectl delete pod -n arc-runners -l app.kubernetes.io/component=runner-set-listener
Step 6 — Apply the Argo CD bootstrap¶
Once Argo CD is running (created by infrastructure/gcp/argocd.tf), point it at this repo:
# Use the new GKE context
gcloud container clusters get-credentials gke-tomoda \
--zone asia-east1-a --project development-485000
# App-of-apps root. Creates the platform/dev/prod Argo Applications that
# recurse through k8s/envs/ and bring up the rest of the cluster.
kubectl apply -f k8s/envs/bootstrap.yaml
Reconciliation takes ~5-10 minutes — kubectl get applications -n argocd -w shows progress.
To skip a tier (e.g. you only want platform + prod, not dev), after the bootstrap apply: kubectl delete application dev -n argocd.
Verification checklist¶
After all the steps above:
- gcloud storage buckets list --filter="name~tfstate" shows development-485000-tfstate
- gcloud storage buckets list --filter="name~photon-index" shows development-485000-photon-index
- gcloud iam service-accounts list --filter="email~photon-indexer" shows the SA
- terraform state list (in each dir) shows resources, with no Photon entries
- kubectl get applications -n argocd shows the bootstrap apps in Synced/Healthy
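The last check can be scripted so that it fails loudly when any app is not yet ready. The sketch below assumes the default column order of kubectl get applications (NAME, SYNC STATUS, HEALTH STATUS), which may differ if custom columns are configured; unready_apps is a hypothetical helper name.

```shell
# Read `kubectl get applications --no-headers` output on stdin and print the
# name of every app that is not both Synced and Healthy.
# Exit status is non-zero if at least one such app exists.
unready_apps() {
  awk 'NF == 0 { next }
       $2 != "Synced" || $3 != "Healthy" { print $1; bad = 1 }
       END { exit bad }'
}

# Example with literal rows. In practice, pipe in the output of:
#   kubectl get applications -n argocd --no-headers
printf 'platform Synced Healthy\ndev OutOfSync Progressing\n' | unready_apps || true
```

Because the function exits non-zero while anything is unready, it can be used as the condition of a retry loop during the 5-10 minute reconciliation window.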
Related docs¶
- Environments: what dev vs prod means in this single-cluster setup
- Photon Indexer: how the index pipeline runs
- Secrets Management: the GCP SM + AWS SM bridge