Traefik
The cluster's ingress controller. Every HTTPS request to *.tomoda.life that hits the cluster (Argo CD UI, Grafana, the tomoda app, auth callbacks) enters through the GCP load balancer in front of Traefik, and Traefik routes it to the right backend Service.
Installed by k8s/envs/dev/sys/traefik/application.yaml, configured by k8s/envs/dev/sys/traefik/values.yaml.
Chart and source
| Field | Value |
|---|---|
| Helm chart | traefik |
| Repository | https://traefik.github.io/charts |
| Version | 39.0.0 |
| Destination namespace | traefik-system (created by Argo CD) |
| Argo CD Application | traefik |
The Application uses Argo CD's multi-source pattern — the chart is pulled from the upstream Helm repo, and valueFiles references this repo's values.yaml via $values/k8s/envs/dev/sys/traefik/values.yaml. That keeps the chart at a pinned version while letting us track tweaks to the values file in Git.
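As a rough illustration of the multi-source pattern described above, an Application of this shape would pull the chart from the upstream Helm repo and the values file from Git. This is a sketch, not the contents of application.yaml: the Git repoURL, the project, the metadata namespace, and the targetRevision of the Git source are assumptions, and CreateNamespace=true is assumed to be how the namespace gets created.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik
  namespace: argocd              # assumption: namespace where Argo CD runs
spec:
  project: default               # assumption
  sources:
    # Source 1: the pinned upstream Helm chart.
    - repoURL: https://traefik.github.io/charts
      chart: traefik
      targetRevision: 39.0.0
      helm:
        valueFiles:
          # $values resolves to the root of the source that declares ref: values.
          - $values/k8s/envs/dev/sys/traefik/values.yaml
    # Source 2: this Git repo, used only to supply the values file.
    - repoURL: https://github.com/example-org/example-repo.git   # hypothetical URL
      targetRevision: HEAD       # assumption
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: traefik-system
  syncPolicy:
    syncOptions:
      - CreateNamespace=true     # assumption: lets Argo CD create traefik-system
```

With this layout, bumping the chart is a one-line change to targetRevision on the first source, while values changes are ordinary Git commits picked up through the second.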
Why Traefik (not GKE-native Ingress)
GCP ships a perfectly functional native Ingress controller (gce / gce-internal), but Traefik was picked deliberately:
- Cross-cloud portability. The same Ingress manifests would work unchanged if the cluster moved to EKS or AKS, or to a self-hosted Kubernetes box. GKE Ingress ties manifests to GCP load balancer semantics.
- Middleware ecosystem. The Argo CD UI authentication chain, sys-oauth2-proxy-errors (redirect 401/403 to Google login) plus sys-oauth2-proxy-auth (forwardAuth to /oauth2/auth), is two traefik.io/v1alpha1 Middleware CRs glued onto an Ingress with a single annotation. The equivalent on GKE Ingress would be Cloud Armor + IAP, with significantly more GCP-side configuration.
- Simpler ingress-class declaration. A workload sets ingressClassName: traefik and that's it. Traefik picks it up, configures routing, and (combined with cert-manager and external-dns) the workload is reachable on HTTPS with a real DNS record and a real cert without anyone touching GCP consoles.
Service exposure
Traefik runs as a Deployment in traefik-system and is fronted by a Service of type LoadBalancer:
# values.yaml
service:
  annotations:
    cloud.google.com/load-balancer-type: "External"
That annotation tells GKE to provision an external GCP load balancer (a regional Network LB with a public IP) — not an internal one. The LB's IP is the destination of the tomoda.life A records that External-DNS writes into Cloudflare.
There is only one load balancer in front of the cluster; Traefik fans its traffic out to every backend based on the Host header.
Ingress class
The chart is configured to register itself as the controller for ingressClassName: traefik. From values.yaml:
additionalArguments:
- "--providers.kubernetesingress.ingressclass=traefik"
- "--providers.kubernetesingress.ingressendpoint.publishedservice=traefik-system/traefik"
- "--providers.kubernetesingress.allowExternalNameServices=true"
- "--providers.kubernetescrd=true"
Every Ingress in this repo sets ingressClassName: traefik explicitly. Anything that doesn't is ignored by Traefik and (since there is no other controller installed) will never be reachable.
allowExternalNameServices=true is what lets ingress paths target ExternalName Services — used by the oauth2-proxy-redir Services in k8s/envs/dev/sys/manifests/external-services.yaml to forward /oauth2/* paths from arbitrary namespaces (argocd, tomoda, prod, data) to the single oauth2-proxy Deployment in sys.
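To make the ExternalName forwarding concrete, here is a minimal sketch of what one such Service and an Ingress path targeting it could look like. It is not copied from external-services.yaml: the externalName target, port 4180 (oauth2-proxy's default), the Ingress name, and the hostname are all assumptions.

```yaml
# An ExternalName Service in the workload's namespace that points at
# the oauth2-proxy Service living in the sys namespace.
apiVersion: v1
kind: Service
metadata:
  name: oauth2-proxy-redir
  namespace: argocd
spec:
  type: ExternalName
  externalName: oauth2-proxy.sys.svc.cluster.local   # assumption: Service name in sys
  ports:
    - port: 4180                                     # assumption: oauth2-proxy default port
---
# An Ingress in the same namespace that sends /oauth2/* to that Service.
# Traefik only accepts this backend because allowExternalNameServices=true.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-oauth2            # hypothetical name
  namespace: argocd
spec:
  ingressClassName: traefik
  rules:
    - host: argocd.tomoda.life   # hypothetical hostname
      http:
        paths:
          - path: /oauth2
            pathType: Prefix
            backend:
              service:
                name: oauth2-proxy-redir
                port:
                  number: 4180
```

The indirection exists because an Ingress backend must be a Service in the Ingress's own namespace; the ExternalName Service is what bridges to sys.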
kubernetescrd=true enables the Traefik CRD provider (IngressRoute, Middleware, TLSStore, etc.) — that's what makes the oauth2-proxy Middleware resources work.
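For orientation, the two oauth2-proxy Middleware resources might look roughly like the following. This is a hedged sketch rather than the manifests in this repo: the resource names are inferred from the sys-oauth2-proxy-* references (Traefik addresses CRD middlewares as namespace-name@kubernetescrd), and the service name, port 4180, and sign-in query string are assumptions based on common oauth2-proxy setups.

```yaml
# Rewrites 401-403 responses into the oauth2-proxy sign-in page.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oauth2-proxy-errors
  namespace: sys
spec:
  errors:
    status:
      - "401-403"
    service:
      name: oauth2-proxy         # assumption: Service name in sys
      port: 4180                 # assumption: oauth2-proxy default port
    query: "/oauth2/sign_in?rd={url}"   # assumption: typical sign-in redirect
---
# Asks oauth2-proxy whether the request is authenticated before forwarding it.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oauth2-proxy-auth
  namespace: sys
spec:
  forwardAuth:
    address: http://oauth2-proxy.sys.svc.cluster.local:4180/oauth2/auth   # assumption
    trustForwardHeader: true
# An Ingress would then opt in with a single annotation, for example:
#   traefik.ingress.kubernetes.io/router.middlewares: sys-oauth2-proxy-errors@kubernetescrd,sys-oauth2-proxy-auth@kubernetescrd
```

Without the kubernetescrd provider enabled, Traefik would not load these resources and the annotation would reference middlewares that do not exist.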
Dashboard, logs, metrics
The Traefik dashboard is enabled in values.yaml:
ingressRoute:
  dashboard:
    enabled: true
It is reachable inside the cluster as a default IngressRoute — port-forward traefik in traefik-system on port 9000 and visit /dashboard/ for a real-time view of routers, services, and middlewares.
Logs are emitted as JSON (general log + access log), which lets the Promtail pipeline that ships them to Loki cleanly extract entryPointName, request_Host, status, method, and level as labels.
Prometheus metrics are exposed on port 8082 (entrypoint metrics), and the chart creates a ServiceMonitor in the monitoring namespace with the release: monitoring label so kube-prometheus-stack picks it up. The traefik-metrics and traefik-logs Grafana dashboards (gnetId: 11462 and 13702) are auto-provisioned by the monitoring chart values.
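The logging and metrics behaviour described above would typically come from chart values along these lines. Treat this as a sketch under assumptions: the key names follow the upstream Traefik chart's layout as best understood here and should be checked against the pinned chart version, and the actual values.yaml in this repo may differ.

```yaml
# values.yaml (illustrative fragment, not the repo's actual file)
logs:
  general:
    format: json          # general log as JSON
  access:
    enabled: true
    format: json          # access log as JSON, so Promtail can parse fields
ports:
  metrics:
    port: 8082            # assumption: how the metrics entrypoint lands on 8082
metrics:
  prometheus:
    entryPoint: metrics
    serviceMonitor:
      enabled: true
      namespace: monitoring
      additionalLabels:
        release: monitoring   # label that kube-prometheus-stack selects on
```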
Operational notes
- There is one Traefik replica by default. Spot-node preemption will black out all ingress traffic for the ~60 seconds it takes the pod to reschedule. Bumping replicas is a values change; the GCP LB will spread across them automatically.
- Routing decisions are driven purely by Ingress + Middleware resources in Git. Do not kubectl edit Traefik's runtime; Argo CD will overwrite the change.
- Adding a new public hostname requires three pieces: an Ingress with ingressClassName: traefik, the cert-manager.io/cluster-issuer: letsencrypt-prod annotation, and a matching domain in External-DNS's domainFilters. The first two go in the workload's manifests; the third (tomoda.life) is already configured.
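As an illustration of the new-hostname checklist, a workload's Ingress could look like the sketch below. The resource name, namespace, hostname, TLS secret name, and backend Service are all hypothetical; only the ingressClassName and the cluster-issuer annotation come from this page.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app              # hypothetical
  namespace: example             # hypothetical
  annotations:
    # Piece 2: cert-manager issues a certificate for the hosts listed under tls.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  # Piece 1: without this, Traefik ignores the Ingress.
  ingressClassName: traefik
  tls:
    - hosts:
        - example.tomoda.life    # hypothetical; must fall under tomoda.life (piece 3)
      secretName: example-app-tls   # hypothetical; cert-manager writes the cert here
  rules:
    - host: example.tomoda.life
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app   # hypothetical backend Service
                port:
                  number: 80
```

Because the hostname sits under tomoda.life, External-DNS should pick it up with no further configuration.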