Traefik

The cluster's ingress controller. Every HTTPS request to *.tomoda.life that hits the cluster (Argo CD UI, Grafana, the tomoda app, auth callbacks) enters through the GCP load balancer in front of Traefik, and Traefik routes it to the right backend Service.

Installed by k8s/envs/dev/sys/traefik/application.yaml, configured by k8s/envs/dev/sys/traefik/values.yaml.

Chart and source

Field                  Value
Helm chart             traefik
Repository             https://traefik.github.io/charts
Version                39.0.0
Destination namespace  traefik-system (created by Argo CD)
Argo CD Application    traefik

The Application uses Argo CD's multi-source pattern — the chart is pulled from the upstream Helm repo, and valueFiles references this repo's values.yaml via $values/k8s/envs/dev/sys/traefik/values.yaml. That keeps the chart at a pinned version while letting us track tweaks to the values file in Git.
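As a rough illustration of the multi-source pattern, an Application along these lines would pull the pinned chart and the Git-tracked values together. This is a sketch, not a copy of application.yaml: the Git repoURL, targetRevision: HEAD, the default project, the argocd namespace, and the sync option are assumptions. The chart name, Helm repository, version, values path, and destination namespace are the ones stated on this page.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik
  namespace: argocd              # assumption: Argo CD's own namespace
spec:
  project: default               # assumption
  sources:
    # Source 1: the upstream Helm chart, pinned to a fixed version.
    - repoURL: https://traefik.github.io/charts
      chart: traefik
      targetRevision: 39.0.0
      helm:
        valueFiles:
          # $values resolves to the root of the source that declares "ref: values".
          - $values/k8s/envs/dev/sys/traefik/values.yaml
    # Source 2: this Git repo, referenced only for its values file.
    - repoURL: https://example.com/your-org/your-repo.git   # placeholder, not the real URL
      targetRevision: HEAD       # assumption
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: traefik-system
  syncPolicy:
    syncOptions:
      - CreateNamespace=true     # assumption: one way to have Argo CD create the namespace
```

Upgrading the chart is then a one-line change to targetRevision, while day-to-day tuning stays in values.yaml.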

Why Traefik (not GKE-native Ingress)

GKE ships a perfectly functional native Ingress controller (ingress classes gce and gce-internal), but Traefik was picked deliberately:

  • Cross-cloud portability. The same Ingress manifests would work unchanged if the cluster moved to EKS or AKS, or to a self-hosted Kubernetes box. GKE Ingress ties manifests to GCP load balancer semantics.
  • Middleware ecosystem. The Argo CD UI authentication chain — sys-oauth2-proxy-errors (redirect 401/403 to Google login) plus sys-oauth2-proxy-auth (forwardAuth to /oauth2/auth) — is two traefik.io/v1alpha1 Middleware CRs glued onto an Ingress with a single annotation. The equivalent on GKE Ingress would be Cloud Armor + IAP, with significantly more GCP-side configuration.
  • Simpler ingress-class declaration. A workload sets ingressClassName: traefik and that's it. Traefik picks it up, configures routing, and (combined with cert-manager and external-dns) the workload is reachable on HTTPS with a real DNS record and a real cert without anyone touching GCP consoles.
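To make the middleware point concrete, here is a hedged sketch of what the two Middleware CRs and the annotation that attaches them could look like. It assumes the names above follow Traefik's <namespace>-<name> convention (namespace sys, names oauth2-proxy-errors and oauth2-proxy-auth); the oauth2-proxy Service name, port 4180, the sign-in query, and the forwarded headers are illustrative assumptions, not values taken from this repo.

```yaml
# Redirects 401/403 responses to the oauth2-proxy sign-in page.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oauth2-proxy-errors
  namespace: sys
spec:
  errors:
    status:
      - "401-403"
    query: /oauth2/sign_in?rd={url}   # assumption: oauth2-proxy's sign-in endpoint
    service:
      name: oauth2-proxy              # assumption: Service name in sys
      port: 4180                      # assumption: oauth2-proxy's default port
---
# Asks oauth2-proxy whether the request is authenticated before routing it.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: oauth2-proxy-auth
  namespace: sys
spec:
  forwardAuth:
    address: http://oauth2-proxy.sys.svc.cluster.local:4180/oauth2/auth
    trustForwardHeader: true
    authResponseHeaders:              # assumption: headers passed to the backend
      - X-Auth-Request-User
      - X-Auth-Request-Email
---
# Fragment of an Ingress: the single annotation that chains both middlewares.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server                 # hypothetical name
  namespace: argocd
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: sys-oauth2-proxy-errors@kubernetescrd,sys-oauth2-proxy-auth@kubernetescrd
spec:
  ingressClassName: traefik
  rules: []                           # routing rules omitted in this sketch
```

The @kubernetescrd suffix tells Traefik that the middleware is defined by the CRD provider rather than by a file or label.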

Service exposure

Traefik runs as a Deployment in traefik-system and is fronted by a Service of type LoadBalancer:

# values.yaml
service:
  annotations:
    cloud.google.com/load-balancer-type: "External"

That annotation tells GKE to provision an external GCP load balancer (a regional Network LB with a public IP) — not an internal one. The LB's IP is the destination of the tomoda.life A records that External-DNS writes into Cloudflare.

There is only one load balancer in front of the cluster, and Traefik fans its traffic out to every backend based on the Host header.

Ingress class

The chart is configured to register itself as the controller for ingressClassName: traefik. From values.yaml:

additionalArguments:
  - "--providers.kubernetesingress.ingressclass=traefik"
  - "--providers.kubernetesingress.ingressendpoint.publishedservice=traefik-system/traefik"
  - "--providers.kubernetesingress.allowExternalNameServices=true"
  - "--providers.kubernetescrd=true"

Every Ingress in this repo sets ingressClassName: traefik explicitly. Anything that doesn't is ignored by Traefik and (since there is no other controller installed) will never be reachable.

allowExternalNameServices=true lets Ingress paths target ExternalName Services. The oauth2-proxy-redir Services in k8s/envs/dev/sys/manifests/external-services.yaml rely on this to forward /oauth2/* paths from other namespaces (argocd, tomoda, prod, data) to the single oauth2-proxy Deployment in sys.
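A minimal sketch of that redirection for one namespace might look like the following. The oauth2-proxy-redir name and the sys namespace come from the text above; the target Service name oauth2-proxy, port 4180, the hostname, and the Ingress name are assumptions made for illustration and may differ from external-services.yaml.

```yaml
# An ExternalName Service in the consuming namespace that aliases
# the oauth2-proxy Service living in sys.
apiVersion: v1
kind: Service
metadata:
  name: oauth2-proxy-redir
  namespace: argocd
spec:
  type: ExternalName
  externalName: oauth2-proxy.sys.svc.cluster.local   # assumption: Service name in sys
  ports:
    - name: http
      port: 4180                                     # assumption: oauth2-proxy's default port
---
# An Ingress in the same namespace that sends /oauth2/* to that alias.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-oauth2                                # hypothetical name
  namespace: argocd
spec:
  ingressClassName: traefik
  rules:
    - host: argocd.tomoda.life                       # hypothetical hostname
      http:
        paths:
          - path: /oauth2
            pathType: Prefix
            backend:
              service:
                name: oauth2-proxy-redir
                port:
                  number: 4180
```

An Ingress backend can only reference a Service in its own namespace, which is why each consuming namespace needs its own ExternalName alias.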

kubernetescrd=true enables the Traefik CRD provider (IngressRoute, Middleware, TLSStore, etc.) — that's what makes the oauth2-proxy Middleware resources work.

Dashboard, logs, metrics

The Traefik dashboard is enabled in values.yaml:

ingressRoute:
  dashboard:
    enabled: true

It is served by the chart's default IngressRoute and is reachable only from inside the cluster: port-forward traefik in traefik-system on port 9000 and visit /dashboard/ for a real-time view of routers, services, and middlewares.

Logs are emitted as JSON (general log + access log), which lets the Promtail pipeline that feeds Loki cleanly extract entryPointName, request_Host, status, method, and level as labels.
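The values that switch both logs to JSON are not reproduced on this page. With the upstream chart they would typically look like the fragment below; treat it as a sketch to check against the chart version in use rather than a copy of values.yaml.

```yaml
# values.yaml (sketch, assuming the upstream chart's logs.* keys)
logs:
  general:
    format: json      # Traefik's own log (startup, errors, provider events)
  access:
    enabled: true     # the access log is off unless explicitly enabled
    format: json      # one JSON object per request, easy for Promtail to parse
```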

Prometheus metrics are exposed on port 8082 (entrypoint metrics), and the chart creates a ServiceMonitor in the monitoring namespace with the release: monitoring label so kube-prometheus-stack picks it up. The traefik-metrics and traefik-logs Grafana dashboards (gnetId: 11462 and 13702) are auto-provisioned by the monitoring chart values.
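For orientation, a values fragment that would produce this setup could look roughly like the following. The port, namespace, and label are the ones stated above; the key layout is assumed from the upstream chart and should be verified against the pinned version.

```yaml
# values.yaml (sketch, assuming the upstream chart's metrics.* and ports.* keys)
ports:
  metrics:
    port: 8082                 # container port behind the "metrics" entrypoint
metrics:
  prometheus:
    entryPoint: metrics        # serve /metrics on the dedicated entrypoint
    serviceMonitor:
      enabled: true
      namespace: monitoring    # create the ServiceMonitor next to Prometheus
      additionalLabels:
        release: monitoring    # label kube-prometheus-stack selects on
```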

Operational notes

  • There is one Traefik replica by default. Spot-node preemption will black out all ingress traffic for the ~60 seconds it takes the pod to reschedule. Bumping replicas is a values change; the GCP LB will spread traffic across them automatically.
  • Routing decisions are driven purely by Ingress + Middleware resources in Git. Do not kubectl edit Traefik's runtime configuration; Argo CD will overwrite the change.
  • Adding a new public hostname requires three pieces: an Ingress with ingressClassName: traefik, the cert-manager.io/cluster-issuer: letsencrypt-prod annotation, and a matching domain in External-DNS's domainFilters. The first two go in the workload's manifests; the third (tomoda.life) is already configured.
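Put together, the workload-side pieces for a new hostname could be sketched as a single Ingress. The ingress class and the cluster-issuer annotation are the ones named above; the application name, namespace, hostname, Service, port, and Secret name are hypothetical placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                        # hypothetical
  namespace: my-namespace             # hypothetical
  annotations:
    # cert-manager issues a certificate for every host listed under spec.tls.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: traefik           # without this, Traefik ignores the Ingress
  tls:
    - hosts:
        - my-app.tomoda.life          # hypothetical; must fall under tomoda.life
      secretName: my-app-tls          # hypothetical; cert-manager stores the cert here
  rules:
    - host: my-app.tomoda.life
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app          # hypothetical Service
                port:
                  number: 80          # hypothetical port
```

Because the host sits under tomoda.life, External-DNS should pick it up without any further configuration.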