Multi-Cloud

Tomoda spans three providers. Each one has a specific job; they do not overlap.

| Provider | Responsibility | Why |
| --- | --- | --- |
| GCP | Compute, data, CI, images, backups, secrets, auth | GKE is the home of every workload that holds state or serves a request |
| AWS | Static assets + their CDN | Historical: the S3+CloudFront pipeline predates the GKE setup and works well |
| Cloudflare | DNS only (proxied: false) | Avoid double-CDN; keep CloudFront's edge logic simple |

GCP — the workload home

Everything that runs, holds data, or builds images lives on GCP, in a single project development-485000 in asia-east1. The Terraform for this surface is in infrastructure/gcp/.

  • Compute: GKE cluster gke-tomoda with spot node pools. See infrastructure/gcp/gke.tf.
  • Network: A custom VPC gke-tomoda-vpc with secondary IP ranges for Pods and Services. See infrastructure/gcp/vpc.tf.
  • Data: CloudNativePG and Bitnami Redis both run inside the cluster — there is no Cloud SQL and no Memorystore. Photon runs in-cluster from an index hosted on GCS.
  • CI: Cloud Build hosts dev triggers (fired on push to main, gated by manual approval) and prod release triggers (fired on semver tags, run automatically). See infrastructure/gcp/cloudbuild.tf.
  • Images: Artifact Registry in asia-east1 with separate tomoda-dev-repo and tomoda-prod-repo Docker repos. See infrastructure/gcp/registry.tf.
  • Backups: GCS bucket tomoda-db-backups-development-485000 is the destination for CNPG Barman WAL + base backups. 30-day lifecycle. See infrastructure/gcp/backup.tf.
  • Photon index storage: GCS bucket <project>-photon-index holds versioned Photon index tarballs. It is publicly readable because the rtuszik/photon-docker image fetches the index over HTTP without auth. The bucket is provisioned manually and deliberately kept out of Terraform (see the bootstrap doc), so a terraform destroy can never delete the ~$500 planet index.
  • Secrets: GCP Secret Manager holds application secrets. Workload Identity binds K8s service accounts to GCP service accounts so pods can sync via External Secrets Operator without static credentials. See infrastructure/gcp/oauth.tf and scripts/setup-gcp-secrets.sh.
  • Auth: Argo CD's bundled Dex uses Google OAuth restricted to the tomoda.life hosted domain. See infrastructure/gcp/argocd.tf.
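
Two of the items above, the backup bucket's 30-day lifecycle and the Workload Identity binding used for secret sync, map onto small Terraform resources. The block below is a minimal sketch rather than a copy of infrastructure/gcp/backup.tf or oauth.tf. The resource labels, the bucket location, the external-secrets account ID, and the external-secrets/external-secrets namespace and service account name are assumptions for illustration; only the project ID, bucket name, and retention period come from this page.

```hcl
variable "project_id" {
  type    = string
  default = "development-485000"
}

# Backup bucket: a lifecycle rule deletes objects 30 days after creation.
resource "google_storage_bucket" "db_backups" {
  name     = "tomoda-db-backups-${var.project_id}"
  project  = var.project_id
  location = "ASIA-EAST1" # assumed to match the project's region

  uniform_bucket_level_access = true

  lifecycle_rule {
    condition {
      age = 30 # days
    }
    action {
      type = "Delete"
    }
  }
}

# GCP service account for External Secrets Operator (hypothetical account ID).
resource "google_service_account" "external_secrets" {
  project      = var.project_id
  account_id   = "external-secrets"
  display_name = "External Secrets Operator"
}

# Let that account read secret payloads from Secret Manager.
resource "google_project_iam_member" "secret_accessor" {
  project = var.project_id
  role    = "roles/secretmanager.secretAccessor"
  member  = "serviceAccount:${google_service_account.external_secrets.email}"
}

# Workload Identity: the K8s service account (hypothetical namespace/name)
# may act as the GCP service account, so no key file is ever exported.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.external_secrets.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[external-secrets/external-secrets]"
}
```

For the binding to take effect, the Kubernetes service account also needs the iam.gke.io/gcp-service-account annotation pointing at the GCP service account's email.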

AWS — static assets and their CDN

AWS owns the static-asset pipeline and nothing else. The Terraform lives in infrastructure/aws/ and is operated per environment using Terraform workspaces (default and dev).

  • Region: ap-northeast-1 (Tokyo).
  • Storage: S3 buckets tomoda-assets-dev and tomoda-assets-prod. All public access is blocked; only CloudFront can read via Origin Access Control. See infrastructure/aws/s3.tf.
  • CDN: A CloudFront distribution per environment, alias assets.tomoda.life (prod) or assets-dev.tomoda.life (dev). See infrastructure/aws/cloudfront.tf.
  • TLS: ACM certificate provisioned in us-east-1 (mandatory for CloudFront), validated via DNS records that this same Terraform writes into Cloudflare. See infrastructure/aws/acm.tf.
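
The "blocked public access, CloudFront-only reads" arrangement described in the Storage item can be sketched as follows. This is an illustrative fragment under stated assumptions, not the contents of infrastructure/aws/s3.tf: the resource labels, the bucket_name and distribution_arn variables, and the OAC name are hypothetical.

```hcl
variable "bucket_name" {
  type    = string
  default = "tomoda-assets-dev"
}

# ARN of the CloudFront distribution allowed to read (hypothetical input).
variable "distribution_arn" {
  type = string
}

resource "aws_s3_bucket" "assets" {
  bucket = var.bucket_name
}

# Block every form of public access to the bucket.
resource "aws_s3_bucket_public_access_block" "assets" {
  bucket                  = aws_s3_bucket.assets.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Origin Access Control: CloudFront signs its origin requests with SigV4.
resource "aws_cloudfront_origin_access_control" "assets" {
  name                              = "${var.bucket_name}-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# Only the CloudFront service, acting for this one distribution, may read objects.
data "aws_iam_policy_document" "assets" {
  statement {
    sid       = "AllowCloudFrontRead"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.assets.arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [var.distribution_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "assets" {
  bucket = aws_s3_bucket.assets.id
  policy = data.aws_iam_policy_document.assets.json
}
```

The distribution's origin block would then reference aws_cloudfront_origin_access_control.assets.id through its origin_access_control_id argument.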

Why AWS for assets when GCP could do it

GCS + Cloud CDN would technically work. The S3+CloudFront stack was set up early and has not been a problem, so there has been no reason to migrate. See Decisions for the trade-off discussion.

Cloudflare — DNS only

Cloudflare hosts the tomoda.life zone. Every record set by Terraform or by external-dns uses proxied = false — Cloudflare's CDN and WAF are off. The role is pure name resolution plus ACM DNS validation.

There are two writers to the zone:

  • Terraform (infrastructure/aws/cloudflare.tf) writes the ACM validation records and the assets.* CNAME pointing at CloudFront.
  • external-dns, running in-cluster (k8s/envs/dev/sys/external-dns/values.yaml), watches Ingress and Service objects and syncs api.*, app.*, www.* (and their -dev siblings) and argo-app.* to Cloudflare records pointing at the Traefik LoadBalancer's external IP. It uses the TXT registry with owner ID k8s-dev, so it only touches records it created.

This split means: when you add a host to an Ingress in k8s/apps/tomoda/, external-dns writes the DNS record automatically. When you add a CloudFront distribution in infrastructure/aws/, Terraform writes its CNAME. The two do not overlap.
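
The Terraform half of that split, the ACM validation records and the assets CNAME, might look roughly like the sketch below. It is not copied from infrastructure/aws/cloudflare.tf or acm.tf. It assumes version 4 of the Cloudflare provider, where the resource is cloudflare_record and the record body is set with value (newer releases deprecate that argument in favor of content, and v5 renames the resource to cloudflare_dns_record). The variable names, resource labels, and TTLs are hypothetical.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumption: v4 resource and argument names
    }
  }
}

# CloudFront only accepts ACM certificates issued in us-east-1.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

variable "zone_id" {
  type = string # Cloudflare zone ID for tomoda.life
}

variable "assets_host" {
  type    = string
  default = "assets-dev.tomoda.life"
}

variable "cloudfront_domain" {
  type = string # the distribution's *.cloudfront.net domain name
}

resource "aws_acm_certificate" "assets" {
  provider          = aws.us_east_1
  domain_name       = var.assets_host
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# One validation record per domain on the certificate, written to Cloudflare.
resource "cloudflare_record" "acm_validation" {
  for_each = {
    for dvo in aws_acm_certificate.assets.domain_validation_options :
    dvo.domain_name => dvo
  }

  zone_id = var.zone_id
  name    = trimsuffix(each.value.resource_record_name, ".")
  type    = each.value.resource_record_type
  value   = trimsuffix(each.value.resource_record_value, ".")
  ttl     = 60
  proxied = false # DNS only
}

# Blocks until ACM has seen the records and issued the certificate.
resource "aws_acm_certificate_validation" "assets" {
  provider        = aws.us_east_1
  certificate_arn = aws_acm_certificate.assets.arn
  depends_on      = [cloudflare_record.acm_validation]
}

# The assets hostname points straight at CloudFront, bypassing Cloudflare's proxy.
resource "cloudflare_record" "assets" {
  zone_id = var.zone_id
  name    = var.assets_host
  type    = "CNAME"
  value   = var.cloudfront_domain
  ttl     = 300
  proxied = false
}
```

Because these records carry no external-dns TXT ownership marker, external-dns leaves them alone, which is what keeps the two writers from colliding.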

Putting it together

flowchart LR
    User([User])

    subgraph DNS[Cloudflare DNS]
        Z[tomoda.life zone<br/>proxied: false]
    end

    subgraph AWSBlock[AWS ap-northeast-1]
        CF[CloudFront]
        S3[(S3)]
    end

    subgraph GCPBlock[GCP asia-east1]
        LB[GCP External LB]
        GKE[GKE cluster]
        GCS[(GCS backups + photon)]
        AR[(Artifact Registry)]
        SM[(Secret Manager)]
        CB[Cloud Build]
    end

    User --> Z
    Z -.->|assets.*| CF
    Z -.->|api/app/www/argo-app.*| LB
    CF --> S3
    LB --> GKE
    GKE --> GCS
    GKE --> SM
    CB --> AR
    AR --> GKE

If you are deciding where a new piece of infrastructure goes, the rule of thumb is: anything that needs to be reached from inside the cluster — or that the cluster reaches out to — lives on GCP. Anything that serves cacheable assets to the public lives on AWS. DNS is Cloudflare's, full stop.