Network Policies

Pod-to-pod network isolation enforced by Kubernetes NetworkPolicy resources.

Current posture

The tomoda app defines three NetworkPolicies in k8s/apps/tomoda/base/network-policy.yaml:

Policy | Target | Allowed ingress
tomoda-api-policy | Pods with app: tomoda-api | TCP 8080 from namespace traefik-system only
tomoda-async-policy | Pods with app: tomoda-async | None (default-deny). Async pods only initiate connections (Redis, Postgres). The kubelet's /health probe is not subject to NetworkPolicy.
frontend-policy | Pods with app: frontend | TCP 8081 from namespace traefik-system only
# Excerpt
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tomoda-api-policy
spec:
  podSelector:
    matchLabels:
      app: tomoda-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik-system
      ports:
        - protocol: TCP
          port: 8080

Effect: only pods in the traefik-system namespace (in practice, the Traefik ingress controller) can reach the api and frontend pods, and only on the listed ports. No pod can open a connection to the async pods at all. An attacker who compromises another workload in the cluster cannot pivot directly into the backend pods.
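
For comparison, a default-deny ingress policy such as tomoda-async-policy can be sketched as follows. This is an illustration that assumes the policy follows the same shape as the excerpt above; the actual manifest in network-policy.yaml may differ.

```yaml
# Sketch only: assumes tomoda-async-policy mirrors the tomoda-api-policy excerpt.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tomoda-async-policy
spec:
  podSelector:
    matchLabels:
      app: tomoda-async
  policyTypes:
    - Ingress
  # No ingress rules are listed, so all ingress to the selected pods is denied.
  # Egress stays unrestricted because Egress is not in policyTypes.
```

Listing Ingress in policyTypes without any ingress rules is what produces the deny; the pods can still open outbound connections to Redis and Postgres.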

What is not restricted

Egress

No egress NetworkPolicy is defined. Pods can:

  • Reach external APIs (Stripe, Google, Apple, LINE, OpenAI) directly
  • Connect to Postgres in the data namespace
  • Connect to Redis in the data namespace
  • Resolve DNS via kube-dns
  • Reach the GCP metadata server (which is how Workload Identity works)

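If zero-trust egress becomes a requirement, add egress policies that allow only specific CIDR blocks, or specific FQDNs through a CNI that supports FQDN rules, such as Cilium. The standard NetworkPolicy API matches only pods, namespaces, and IP blocks, not hostnames.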
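
A minimal sketch of what such an egress policy might look like for the async pods, using only the standard NetworkPolicy API. The policy name, the kube-dns label, and the Redis port 6379 are assumptions and not taken from the current manifests. Clusters that use NodeLocal DNSCache would need a different DNS rule.

```yaml
# Hypothetical egress policy; not part of the current manifests.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tomoda-async-egress        # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: tomoda-async
  policyTypes:
    - Egress
  egress:
    # DNS lookups. Assumes kube-dns runs in kube-system with the k8s-app: kube-dns label.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Postgres and Redis in the data namespace. 6379 is the Redis default and an assumption here.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: data
      ports:
        - protocol: TCP
          port: 5432
        - protocol: TCP
          port: 6379
    # External APIs and the GCP metadata server are NOT allowed by this sketch.
    # They would need their own ipBlock rules, or FQDN rules from a CNI that supports them.
```

As written, this sketch would cut off the external APIs and Workload Identity, so it is a starting point for a dev experiment and not something to apply as is.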

Database access

There is no NetworkPolicy in front of the CNPG clusters. Backend pods in the tomoda and prod namespaces reach Postgres in the data namespace via the ExternalName aliases:

  • postgres-postgresql.data.svc.cluster.local (dev)
  • prod-postgres-postgresql.data.svc.cluster.local (prod)

Any pod in any namespace could in principle connect to the Postgres service. We rely on:

  1. Namespace separation: dev and prod workloads cannot see each other's secrets, but they share the cluster network
  2. Postgres authentication: pg_hba.conf requires md5 password auth, and the password lives in K8s Secrets that are mounted only into the backend pod
  3. DB_SSLMODE=require in prod: connections must use TLS

If you want belt-and-braces, add an ingress NetworkPolicy to the data namespace allowing only tomoda and prod namespaces on port 5432.

Frontend service port

The frontend NetworkPolicy specifies port 8081 — confirm this matches the Service targetPort and the container containerPort before relying on the policy. A mismatched port silently allows nothing.
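
The values that have to agree can be illustrated with a hypothetical Service and pod template fragment. The Service name and the Service port 80 are assumptions made for this sketch; only the value 8081 comes from the policy.

```yaml
# Hypothetical Service, shown only to illustrate which values must agree.
apiVersion: v1
kind: Service
metadata:
  name: frontend-service           # hypothetical name
spec:
  selector:
    app: frontend
  ports:
    - port: 80                     # Service port (assumed); the policy is not matched against this
      targetPort: 8081             # must equal the port in frontend-policy
---
# Fragment of the pod template in the frontend Deployment (not a complete manifest)
containers:
  - name: frontend
    ports:
      - containerPort: 8081        # must equal the port in frontend-policy
```

A NetworkPolicy port is evaluated against the destination pod's port, so the Service port can differ as long as targetPort and the container's listening port match the policy.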

Adding a NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-ingress
  namespace: data
spec:
  podSelector:
    matchLabels:
      cnpg.io/cluster: postgres-prod
  policyTypes:
    - Ingress
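  ingress:
    # Only the prod namespace is allowed. Anything else that must reach these pods
    # (for example other pods in the data namespace or the CNPG operator) needs its own rule.
    - from: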
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: prod
      ports:
        - protocol: TCP
          port: 5432

Add the manifest to k8s/envs/prod/middleware/postgres/manifests/, commit, and push. Argo CD syncs it to the cluster.

Test additions in dev first

A misconfigured NetworkPolicy can take an app down hard — the default-deny posture kicks in as soon as any policy selects a pod for the given policyType. Apply to dev, smoke-test, then promote.

Verifying enforcement

# Inside a debug pod in a namespace that should be BLOCKED
kubectl run shell --rm -it --image=busybox -n default -- /bin/sh
# Inside:
wget -T 3 -O- http://backend-service.tomoda.svc.cluster.local:8080/health
# Expected: "wget: download timed out"

# From a debug pod in Traefik's namespace (should succeed)
kubectl run shell --rm -it --image=busybox -n traefik-system -- /bin/sh
# Inside:
wget -T 3 -O- http://backend-service.tomoda.svc.cluster.local:8080/health
# Expected: the health endpoint's response body (HTTP 200)

If both succeed, the GKE cluster might not have NetworkPolicy enforcement enabled (Calico or Cilium). Check:

gcloud container clusters describe gke-tomoda \
  --zone asia-east1-a --project development-485000 \
  --format="value(networkPolicy.enabled)"

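This should print True on clusters that use the Calico-based network policy add-on. If it prints False or nothing, enforcement is not happening unless the cluster runs GKE Dataplane V2 (Cilium), which enforces NetworkPolicy without this flag. In that case, check networkConfig.datapathProvider instead. If enforcement is off, fix it in Terraform and reapply.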