Architecture

This page is the production architecture in one place. Everything else in the docs is a deeper dive into one of the boxes below.

The big picture

Tomoda runs on a single GKE cluster in asia-east1-a. User traffic is resolved through Cloudflare DNS and lands on either a CloudFront distribution (static assets) or a GCP load balancer (API + app); inside the cluster, Traefik routes to backend and frontend Pods. Data services (Postgres, Redis, Photon) all live inside the same cluster. ArgoCD reconciles the entire desired state from this git repository.

flowchart TB
    User([User])

    subgraph Cloudflare["Cloudflare (DNS only)"]
        CF[Zone tomoda.life<br/>proxied: false]
    end

    subgraph AWS["AWS ap-northeast-1"]
        CFront[CloudFront<br/>assets.tomoda.life]
        S3[(S3<br/>tomoda-assets-prod)]
    end

    subgraph GCP["GCP development-485000 / asia-east1"]
        LB[GCP External LB<br/>api.tomoda.life<br/>app.tomoda.life]

        subgraph GKE["GKE cluster: gke-tomoda"]
            Traefik[Traefik Ingress]

            subgraph AppNS["namespace: prod / tomoda"]
                Backend[Backend Pods]
                Frontend[Frontend Pods]
            end

            subgraph DataNS["namespace: data"]
                CNPG[(CNPG Postgres<br/>+ PostGIS)]
                Redis[(Redis<br/>Bitnami)]
                Photon[Photon geocoder]
                Indexer[/photon-indexer<br/>CronJob suspended/]
            end

            subgraph SysNS["namespace: argocd / cert-manager / etc."]
                ArgoCD[ArgoCD]
                ESO[External Secrets<br/>Operator]
                CertMgr[cert-manager]
                ExtDNS[external-dns]
            end
        end

        GCSBackup[(GCS<br/>tomoda-db-backups-*)]
        GCSPhoton[(GCS<br/>*-photon-index)]
        SecretMgr[(GCP Secret Manager)]
        AR[(Artifact Registry<br/>tomoda-dev-repo<br/>tomoda-prod-repo)]
    end

    SecretMgrAWS[(AWS Secrets Manager)]
    Repo[(GitHub<br/>tomoda-labs/devops)]

    User -->|assets.tomoda.life| CF
    User -->|api/app/www.tomoda.life| CF
    CF -.->|CNAME| CFront
    CF -.->|A/CNAME| LB
    CFront --> S3
    LB --> Traefik
    Traefik --> Backend
    Traefik --> Frontend
    Backend --> CNPG
    Backend --> Redis
    Backend --> Photon
    Indexer --> GCSPhoton
    Photon -.->|reads index| GCSPhoton
    CNPG -->|Barman WAL + base backups| GCSBackup
    ESO -.->|sync| SecretMgr
    ESO -.->|sync| SecretMgrAWS
    ArgoCD -.->|reconcile| Repo
    ArgoCD --> AppNS
    ArgoCD --> DataNS
    ArgoCD --> SysNS
    ExtDNS -.->|manage records| CF

Component-by-component

Edge and DNS

Cloudflare hosts the tomoda.life zone but proxies nothing: every record is set with proxied = false. Cloudflare's job is name resolution and DNS validation for the ACM certificate used by CloudFront. See infrastructure/aws/cloudflare.tf.

User-facing CNAMEs split into two paths:

  • assets.tomoda.life and assets-dev.tomoda.life point at a CloudFront distribution. CloudFront itself is global, but the ACM certificate it uses must live in us-east-1 (see infrastructure/aws/acm.tf).
  • api.tomoda.life, app.tomoda.life, www.tomoda.life (and their -dev siblings) point at the external IP of the Traefik LoadBalancer Service in GKE. Records are written automatically by external-dns running in the cluster against the Cloudflare zone — see k8s/envs/dev/sys/external-dns/values.yaml.
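The two-way split above can be modeled as a lookup from hostname to edge target. This is a minimal Python sketch, not code from the repository: the function name `edge_target` and the labels `cloudfront` and `traefik-lb` are hypothetical, and the `-dev` hostnames for api, app, and www are assumed to follow the same pattern as assets-dev.

```python
# Hostnames come from the list above. The "-dev" siblings of api/app/www are
# an assumption based on the naming of assets-dev.tomoda.life.
CLOUDFRONT_HOSTS = {"assets.tomoda.life", "assets-dev.tomoda.life"}
TRAEFIK_HOSTS = {
    f"{name}{suffix}.tomoda.life"
    for name in ("api", "app", "www")
    for suffix in ("", "-dev")
}


def edge_target(hostname: str) -> str:
    """Return which edge path a hostname resolves to (hypothetical labels)."""
    # DNS names are case-insensitive and may carry a trailing root dot.
    host = hostname.strip().lower().rstrip(".")
    if host in CLOUDFRONT_HOSTS:
        return "cloudfront"
    if host in TRAEFIK_HOSTS:
        return "traefik-lb"
    raise ValueError(f"no DNS record modeled for {hostname!r}")


if __name__ == "__main__":
    for h in ("assets.tomoda.life", "api.tomoda.life", "www-dev.tomoda.life"):
        print(h, "->", edge_target(h))
```

Raising on unknown hostnames rather than returning a default mirrors the fact that a name with no record in the zone simply does not resolve.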

CDN and static assets

Static assets are hosted on S3 (tomoda-assets-prod, tomoda-assets-dev) and fronted by CloudFront with Origin Access Control. The bucket has all public access blocked; only CloudFront can read it. ACM issues the TLS cert (validated via Cloudflare DNS), and cert-manager is not involved for this path. See infrastructure/aws/s3.tf and infrastructure/aws/cloudfront.tf.

Load balancer and ingress

The Traefik Service is annotated cloud.google.com/load-balancer-type: "External" (see k8s/envs/dev/sys/traefik/values.yaml), so GKE provisions a Google external L4 load balancer. All non-asset HTTPS traffic enters here, TLS terminates at Traefik using a letsencrypt-prod cert-manager ClusterIssuer, and Traefik routes by host header — see the Ingress in k8s/apps/tomoda/base/ingress.yaml and the prod overlay in k8s/apps/tomoda/overlays/prod/kustomization.yaml.
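Host-header routing can be illustrated with a small lookup. The sketch below is an assumption-laden model, not a copy of the Ingress: the mapping of api to the backend and of app and www to the frontend is a guess for illustration, and the real rules live in k8s/apps/tomoda/base/ingress.yaml.

```python
from typing import Optional

# Hypothetical host -> Service mapping. The actual Ingress rules may differ.
INGRESS_RULES = {
    "api.tomoda.life": "backend",
    "app.tomoda.life": "frontend",
    "www.tomoda.life": "frontend",
}


def route(host_header: str) -> Optional[str]:
    """Pick a Service name from the Host header; None means no rule matched."""
    # Drop an optional ":port" suffix and normalize case before matching.
    host = host_header.split(":", 1)[0].strip().lower()
    return INGRESS_RULES.get(host)


if __name__ == "__main__":
    print(route("api.tomoda.life:443"))
    print(route("unknown.example"))
```

Because the Google load balancer works at L4, it forwards connections without inspecting them; the host lookup only happens once Traefik has terminated TLS.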

Application Pods

The backend and frontend Deployments live in k8s/apps/tomoda/base/. The dev overlay deploys into the tomoda namespace; the prod overlay deploys into the prod namespace and adds a PodDisruptionBudget (k8s/apps/tomoda/overlays/prod/pdb.yaml). Images are pulled from Artifact Registry in asia-east1: tomoda-dev-repo for dev and tomoda-prod-repo for prod.
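The overlay-to-namespace and overlay-to-repository mapping can be written down as data. The sketch below is illustrative only: it assumes that development-485000 (the label on the GCP box in the diagram) is the project ID, and the image name and tag in the usage example are hypothetical. The reference layout `LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG` is the standard Artifact Registry Docker format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlay:
    namespace: str
    repository: str


# Namespace and repository names are taken from the paragraph above.
OVERLAYS = {
    "dev": Overlay(namespace="tomoda", repository="tomoda-dev-repo"),
    "prod": Overlay(namespace="prod", repository="tomoda-prod-repo"),
}


def image_ref(env: str, image: str, tag: str,
              project: str = "development-485000",  # assumed project ID
              location: str = "asia-east1") -> str:
    """Build an Artifact Registry Docker image reference for an overlay."""
    repo = OVERLAYS[env].repository
    return f"{location}-docker.pkg.dev/{project}/{repo}/{image}:{tag}"


if __name__ == "__main__":
    # "backend" and "v1" are placeholder image name and tag.
    print(image_ref("prod", "backend", "v1"))
```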

Data plane (in-cluster)

All stateful services run inside the cluster in the data namespace:

  • Postgres is a CloudNativePG Cluster with the ghcr.io/cloudnative-pg/postgis:17-3.5 image — PostGIS is available out of the box. See k8s/envs/prod/middleware/postgres/manifests/cluster.yaml. Continuous WAL archiving and base backups go to GCS via Barman (Workload Identity binds the CNPG pod SA to the cnpg-backup-sa GCP SA, see infrastructure/gcp/backup.tf).
  • Redis is the Bitnami chart in standalone mode with auth disabled (in-cluster only). See k8s/envs/prod/middleware/redis/values.yaml.
  • Photon serves geocoding from a multilingual index it reads at startup from the GCS *-photon-index bucket — provisioned manually via the bootstrap doc (deliberately outside Terraform so the ~$500 planet index survives any terraform destroy). The photon-indexer CronJob that rebuilds the index is currently suspended; index rebuilds happen via scripts/photon-index-local.sh on a large VM until the in-cluster Nominatim is provisioned.
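Since all three services sit in the data namespace, the backend reaches them through in-cluster Service DNS names of the form `SERVICE.NAMESPACE.svc.CLUSTER_DOMAIN`. The sketch below shows how those names are composed. The Service names are hypothetical (the `-rw` suffix follows CloudNativePG's convention for the read-write Service and `-master` is assumed from the Bitnami chart), the ports are each tool's default and are assumed unchanged, and `cluster.local` is the default cluster domain.

```python
def service_dns(service: str, namespace: str = "data",
                cluster_domain: str = "cluster.local") -> str:
    """Compose the fully qualified in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical Service names with default ports; check the manifests for
# the real values.
ENDPOINTS = {
    "postgres": (service_dns("tomoda-db-rw"), 5432),
    "redis": (service_dns("redis-master"), 6379),
    "photon": (service_dns("photon"), 2322),
}


if __name__ == "__main__":
    for name, (host, port) in ENDPOINTS.items():
        print(f"{name}: {host}:{port}")
```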

Secrets

Application secrets live in GCP Secret Manager (and a smaller set in AWS Secrets Manager). External Secrets Operator runs in-cluster and materializes them as native Secret objects via ExternalSecret and ClusterSecretStore resources. See examples wired into k8s/envs/prod/middleware/postgres/manifests/cluster.yaml and k8s/apps/tomoda/overlays/prod/external-secret.yaml.
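To make the shape of such a resource concrete, the following Python sketch builds an ExternalSecret manifest as a plain dict and prints it as JSON, which kubectl accepts alongside YAML. It is not taken from the repository: the resource name, store name, and secret keys are hypothetical, the refresh interval is an arbitrary choice, and the `external-secrets.io/v1beta1` API version is an assumption that depends on the installed operator version.

```python
import json


def external_secret(name: str, namespace: str, store: str,
                    keys: dict[str, str]) -> dict:
    """Build an ExternalSecret manifest as a plain dict.

    `keys` maps a key in the resulting Kubernetes Secret to the name of
    the secret held by the external provider.
    """
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # assumed API version
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": "1h",  # arbitrary value for illustration
            "secretStoreRef": {"kind": "ClusterSecretStore", "name": store},
            "target": {"name": name},
            "data": [
                {"secretKey": key, "remoteRef": {"key": remote}}
                for key, remote in keys.items()
            ],
        },
    }


if __name__ == "__main__":
    # All names below are placeholders.
    manifest = external_secret(
        name="backend-env",
        namespace="prod",
        store="gcp-secret-manager",
        keys={"DATABASE_URL": "prod-database-url"},
    )
    print(json.dumps(manifest, indent=2))
```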

GitOps control plane

ArgoCD is installed from its Helm chart (version 7.7.13) in infrastructure/gcp/argocd.tf and exposed at argo-app.tomoda.life. SSO is handled by ArgoCD's bundled Dex with Google as the connector, restricted to the tomoda.life hosted domain. Two bootstrap Application manifests (k8s/envs/dev/bootstrap.yaml, k8s/envs/prod/bootstrap.yaml) seed an app-of-apps pattern that pulls in every other workload.

Why everything in-cluster

Postgres, Redis, and Photon could all be managed services. We chose in-cluster operators for cost, portability, and a single source of truth. The trade-offs are spelled out on the Decisions page.

Request paths in one sentence

  • Static asset: User → Cloudflare DNS → CloudFront → S3.
  • API call: User → Cloudflare DNS → GCP LB → Traefik → backend Pod → CNPG / Redis / Photon (all in-cluster).
  • Geocode: backend Pod → in-cluster Photon Service → Photon reads index from GCS at startup.
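The paths above can also be derived mechanically from the request-serving edges of the flowchart. The sketch below is a simplified model in Python: the node labels are shortened versions of the diagram's, only the request-serving edges are included, and the function name `request_path` is hypothetical. It runs a breadth-first search to find the hops between two components.

```python
from collections import deque
from typing import Optional

# Request-serving edges from the flowchart, with shortened labels.
EDGES = {
    "User": ["Cloudflare DNS"],
    "Cloudflare DNS": ["CloudFront", "GCP LB"],
    "CloudFront": ["S3"],
    "GCP LB": ["Traefik"],
    "Traefik": ["Backend", "Frontend"],
    "Backend": ["CNPG", "Redis", "Photon"],
    "Photon": ["GCS photon index"],
}


def request_path(src: str, dst: str) -> Optional[list[str]]:
    """Return the shortest hop list from src to dst, or None if unreachable."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(" -> ".join(request_path("User", "S3")))
    print(" -> ".join(request_path("User", "Redis")))
```

The edges are directed, so asking for a path from S3 back to the user returns None, which matches the one-way flow drawn in the diagram.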