Runbook

Daily operational commands. Bookmark this page.

Prerequisites

  • gcloud authenticated against development-485000
  • kubectl installed
  • argocd CLI installed for sync operations (optional, UI works too)

task shortcuts

The repo's Taskfile.yml wraps most of the commands below as one-liners; run task --list to see all of them. The task shortcuts are faster day-to-day. The long-form kubectl / terraform / gcloud commands are kept for reference and for cases where a task wrapper isn't expressive enough:

| Manual command | Task shortcut |
| --- | --- |
| gcloud container clusters get-credentials … | task kube:auth |
| kubectl get pods -A -o wide | task kube:pods |
| kubectl logs … deploy/tomoda-api (or tomoda-async) | task kube:logs -- prod |
| kubectl rollout restart deployment/tomoda-api (or tomoda-async) | task kube:rollout -- tomoda-api prod |
| argocd app list -o wide | task argo:status |
| argocd app sync <name> | task argo:sync -- <name> |
| terraform plan in infrastructure/gcp/ | task tf:plan:gcp |
| terraform plan in infrastructure/aws/ (env-aware) | task tf:plan:aws -- prod |
| gcloud artifacts docker images list … | task images:list -- prod |
| kubectl exec … psql … | task pg:console (dev) or task pg:console -- prod |
| kubectl exec … redis-cli | task redis:cli (dev) or task redis:cli -- prod |
| ./scripts/disaster-recovery.sh --env … --mode … | task dr:latest -- dev / task dr:pitr -- dev <timestamp> |

Connecting to the cluster

gcloud container clusters get-credentials gke-tomoda \
  --zone asia-east1-a \
  --project development-485000

# Verify
kubectl get nodes
kubectl config current-context

The cluster lives in asia-east1-a. Project development-485000 hosts both dev and prod workloads, separated by namespace.
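
Before running anything that changes state, a script can confirm that the active context is the cluster this page describes. The following is a minimal sketch, assuming the usual gke_<project>_<location>_<cluster> context name that get-credentials writes to kubeconfig. The function names are illustrative and are not part of the repo's Taskfile.

```shell
# Build the kubeconfig context name that
# `gcloud container clusters get-credentials` normally creates.
expected_gke_context() {
  local project="$1" location="$2" cluster="$3"
  printf 'gke_%s_%s_%s\n' "$project" "$location" "$cluster"
}

# Succeed only if the given context name matches this runbook's cluster.
# Pass the output of `kubectl config current-context` as the first argument.
assert_context() {
  local current="$1" expected
  expected="$(expected_gke_context development-485000 asia-east1-a gke-tomoda)"
  if [ "$current" != "$expected" ]; then
    echo "unexpected context: $current (wanted $expected)" >&2
    return 1
  fi
}
```

Typical use in a script: assert_context "$(kubectl config current-context)" || exit 1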

Tailing logs

The backend is split into two pools — tomoda-api (HTTP + WS) and tomoda-async (worker + scheduler). Pick the right one based on what you're debugging.

# API pool (dev / prod)
kubectl logs -f deployment/tomoda-api -n tomoda
kubectl logs -f deployment/tomoda-api -n prod

# Async pool (worker + scheduler)
kubectl logs -f deployment/tomoda-async -n prod

# Last 200 lines of a specific pod
kubectl logs --tail=200 <pod-name> -n <ns>

# Previous container after a crash
kubectl logs <pod-name> -n <ns> --previous

For longer-range queries, use Loki via Grafana. The Loki datasource is wired into the in-cluster Grafana and indexes all pod logs with namespace, pod, and container labels (the queries below also rely on an app label). Use the Explore tab with LogQL:

{namespace="prod", app="tomoda-api"} |= "error"
{namespace="prod", app="tomoda-async"} |= "error"
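
When the same filter is needed for several pools or search terms, the query string can be generated rather than retyped. This is a small sketch that only prints LogQL text for pasting into Explore. logql_filter is a hypothetical helper name, and it assumes the namespace and app labels used in the queries above.

```shell
# Print a LogQL stream selector plus a line filter.
# Usage: logql_filter <namespace> <app> <text>
logql_filter() {
  local ns="$1" app="$2" text="$3"
  printf '{namespace="%s", app="%s"} |= "%s"\n' "$ns" "$app" "$text"
}

# Print the same filter for both backend pools in one namespace.
# Usage: logql_both_pools <namespace> <text>
logql_both_pools() {
  local ns="$1" text="$2" app
  for app in tomoda-api tomoda-async; do
    logql_filter "$ns" "$app" "$text"
  done
}
```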

Checking Argo CD sync status

UI: https://argo-app.tomoda.life (Google SSO).

CLI:

argocd login argo-app.tomoda.life --sso

argocd app list
argocd app get tomoda
argocd app get prod-tomoda

A healthy app shows Sync Status: Synced and Health Status: Healthy.
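
The same check can be scripted by reading the text output of argocd app get. A minimal sketch, assuming the output contains lines beginning with "Sync Status:" and "Health Status:" as described above. argocd_app_ok is an illustrative name.

```shell
# Read the text output of `argocd app get <name>` on stdin and succeed
# only when the app reports both Synced and Healthy.
argocd_app_ok() {
  local out
  out="$(cat)"
  printf '%s\n' "$out" | grep -Eq '^Sync Status:[[:space:]]+Synced' &&
    printf '%s\n' "$out" | grep -Eq '^Health Status:[[:space:]]+Healthy'
}
```

For example: argocd app get prod-tomoda | argocd_app_ok || echo "prod-tomoda needs attention"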

Forcing a sync or refresh

# Pull latest manifests from Git and reconcile
argocd app sync tomoda

# Refresh: re-compare the live state with the target revision without applying anything
argocd app get tomoda --refresh

# Hard refresh: additionally discard the cached manifests and regenerate them
argocd app get tomoda --hard-refresh

From the UI: open the app, click Sync (top right). Use Hard Refresh under the kebab menu if you suspect Argo CD has a stale Git cache.
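
In scripts it is often useful to block until the app has actually settled after a sync. A sketch under the assumption that argocd app wait with --sync, --health, and --timeout (in seconds) is available in the installed CLI version. sync_and_wait and the ARGOCD override variable are illustrative names.

```shell
# Sync an app, then wait until it reports Synced and Healthy
# or the timeout (seconds) expires.
# Set ARGOCD=echo to print the commands instead of running them.
sync_and_wait() {
  local app="$1" timeout="${2:-300}"
  ${ARGOCD:-argocd} app sync "$app" &&
    ${ARGOCD:-argocd} app wait "$app" --sync --health --timeout "$timeout"
}
```

Running ARGOCD=echo sync_and_wait tomoda 120 prints the two argocd invocations without contacting the server, which is a cheap way to review what would run.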

Restarting a deployment

# Replaces pods gradually per the Deployment's rolling-update strategy (maxSurge / maxUnavailable)
kubectl rollout restart deployment/tomoda-api -n prod
kubectl rollout status deployment/tomoda-api -n prod

# Async pool
kubectl rollout restart deployment/tomoda-async -n prod
kubectl rollout status deployment/tomoda-async -n prod

# Frontend
kubectl rollout restart deployment/frontend -n prod

Use a rollout restart after rotating a secret (so the pods re-read env vars) or after a manual ConfigMap change.

For HPA inspection and scaling-bound changes, see Scaling.
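
The restart-then-watch pair shown above can be wrapped so that a stuck rollout fails instead of hanging. A minimal sketch: restart_and_wait and the KUBECTL override variable are hypothetical names, and the 300s default timeout is an arbitrary choice.

```shell
# Restart a deployment and block until the rollout finishes or times out.
# Set KUBECTL=echo to print the commands instead of running them.
restart_and_wait() {
  local deploy="$1" ns="$2" timeout="${3:-300s}"
  ${KUBECTL:-kubectl} rollout restart "deployment/$deploy" -n "$ns" &&
    ${KUBECTL:-kubectl} rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"
}
```

For example: restart_and_wait tomoda-async prod 120s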

Inspecting secrets

Never run kubectl get secret ... -o yaml in front of others: the values are only base64-encoded, not encrypted. To verify that a secret exists and is synced without revealing its values:

# Are all ExternalSecrets in SecretSynced state?
kubectl get externalsecret -A

# Inspect a specific ExternalSecret resource (safe — no payload)
kubectl describe externalsecret backend-secrets-prod -n prod

# Confirm the synced K8s Secret exists, see only key names
kubectl get secret backend-secrets-prod -n prod -o jsonpath='{.data}' | jq 'keys'

If STATUS is anything but SecretSynced, see Debugging.
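
To list only the ExternalSecrets that need attention, the status can be extracted with jsonpath and filtered. This sketch assumes the operator reports sync state as the reason of a condition of type Ready (which is what the STATUS column appears to show). Both function names are illustrative.

```shell
# Emit one tab-separated row per ExternalSecret: namespace, name, Ready reason.
list_externalsecret_reasons() {
  kubectl get externalsecret -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].reason}{"\n"}{end}'
}

# Read those rows on stdin and print "<namespace>/<name> <reason>"
# for every entry whose reason is not SecretSynced.
unsynced_externalsecrets() {
  awk -F '\t' '$3 != "SecretSynced" {
    printf "%s/%s %s\n", $1, $2, ($3 == "" ? "<no status>" : $3)
  }'
}
```

Usage: list_externalsecret_reasons | unsynced_externalsecrets. Empty output means every ExternalSecret is synced. No secret payloads are read at any point.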

Tailing Cloud Build logs

# List recent builds
gcloud builds list --project=development-485000 --limit=10

# Stream logs for a specific build
gcloud builds log <BUILD_ID> --project=development-485000 --stream

Or via the Cloud Console: Cloud Build > History.

Manually triggering Cloud Build dev approval

Dev builds require manual approval to control spend. When a push to main triggers a dev build, it pauses in PENDING state.

# List pending approvals
gcloud builds list --filter="status=PENDING" --project=development-485000

# Approve a build
gcloud builds approve <BUILD_ID> --project=development-485000

Or in the Cloud Console: Cloud Build > History > [build] > Approve.

Connecting to Postgres

The CNPG clusters live in the data namespace. Use port-forward for ad-hoc psql access — never expose them via LoadBalancer.

# Dev primary
kubectl port-forward -n data svc/postgres-dev-rw 5432:5432 &

# Get the password from the K8s secret
PGPASSWORD=$(kubectl get secret postgres-dev-credentials -n data \
  -o jsonpath='{.data.password}' | base64 -d)

PGPASSWORD="$PGPASSWORD" psql -h localhost -U tomoda_dev_user -d tomoda_dev

For prod, substitute postgres-prod-rw and tomoda_prod_user / tomoda_prod, and read the password from the corresponding prod credentials secret. Or exec directly into the pod:

kubectl exec -it postgres-dev-1 -n data -- psql -U tomoda_dev_user -d tomoda_dev
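
A backgrounded port-forward keeps running after psql exits, so it helps to clean it up explicitly. The sketch below maps an environment to the service, user, and database names given in this section, then stops the forward when psql returns. The function names and the two-second wait are assumptions, and PGPASSWORD is expected to be exported beforehand using the secret lookup shown earlier.

```shell
# Map an environment name to "<service> <user> <database>".
pg_target() {
  case "$1" in
    dev)  echo "postgres-dev-rw tomoda_dev_user tomoda_dev" ;;
    prod) echo "postgres-prod-rw tomoda_prod_user tomoda_prod" ;;
    *)    echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Open psql through a temporary port-forward, then stop the forward.
pg_console_via_forward() {
  local target svc user db pf_pid rc
  target="$(pg_target "$1")" || return 1
  # Split the three space-separated fields into positional parameters.
  set -- $target
  svc="$1" user="$2" db="$3"
  kubectl port-forward -n data "svc/$svc" 5432:5432 >/dev/null &
  pf_pid=$!
  sleep 2  # crude wait for the forward to be ready
  psql -h localhost -U "$user" -d "$db"
  rc=$?
  kill "$pf_pid" 2>/dev/null
  return "$rc"
}
```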

Connecting to Redis

# Dev
kubectl port-forward -n data svc/redis-master 6379:6379 &
REDIS_PW=$(kubectl get secret backend-secrets-dev -n tomoda \
  -o jsonpath='{.data.REDIS_PASSWORD}' | base64 -d)
redis-cli -h localhost -a "$REDIS_PW"

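# Prod (stop the dev port-forward first; both bind local port 6379)
kubectl port-forward -n data svc/prod-redis-master 6379:6379 &
# Assumes the prod secret uses the same REDIS_PASSWORD key as dev
REDIS_PW=$(kubectl get secret backend-secrets-prod -n prod \
  -o jsonpath='{.data.REDIS_PASSWORD}' | base64 -d)
redis-cli -h localhost -a "$REDIS_PW"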

Switching kubectl context between dev and prod namespaces

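There is one GKE cluster (gke-tomoda). Dev and prod are separated by namespace, not context. Set a default namespace on the current context; the change is written to your kubeconfig, so it applies to every shell that uses that context: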

# Work in dev
kubectl config set-context --current --namespace=tomoda

# Work in prod
kubectl config set-context --current --namespace=prod

# Verify
kubectl config view --minify | grep namespace

Prefer explicit -n flags in scripts. The data namespace holds shared middleware (Postgres, Redis) that both environments reference by service DNS.
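
One way to keep -n explicit in scripts without hard-coding namespace names everywhere is a small lookup. A sketch using the dev and prod namespaces named on this page. ns_for_env is a hypothetical helper, not something the repo is known to provide.

```shell
# Map an environment name to its Kubernetes namespace.
ns_for_env() {
  case "$1" in
    dev)  echo tomoda ;;
    prod) echo prod ;;
    *)    echo "unknown environment: $1" >&2; return 1 ;;
  esac
}
```

For example: kubectl get pods -n "$(ns_for_env prod)". Because the function fails on an unknown environment, a typo such as "pord" produces an error instead of silently querying the wrong namespace.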