Operations
Day-to-day operational guides for running the tomoda platform on GKE. Start with the Runbook for routine commands, then drill into specific topics below.
- Daily ops cookbook: connect to clusters, tail logs, sync apps, restart deployments.
- How a code change reaches dev and prod via Cloud Build, Artifact Registry, and Argo CD Image Updater.
- Reverting bad deploys. Image rollback, Git revert, manual scale-down, and DB rollback options.
- Scale backend replicas, CNPG storage, Redis, and GKE node pools.
- Production debugging: kubectl, Loki, Grafana, Argo CD events, External Secrets diagnostics.
- PITR, latest-backup restore, and full-rebuild recovery using scripts/disaster-recovery.sh.
- CNPG cluster shape, Barman WAL archiving, manual backups, restores, scaling, upgrades.
- Building and uploading the multilingual Photon index. Atomic in-cluster swap.
- Argo CD SSO via Dex, Cloud Build approvals, GCP/AWS IAM bindings.
- Tempo (traces), Prometheus scrape coverage, Loki retention + Promtail pipeline.