Loki — retention and Promtail pipeline

The deployment is documented in kubernetes/system/loki.md. This page covers two things that changed alongside the Tempo + backend instrumentation rollout:

  1. The PVC was sized up to 20 Gi and retention pinned to 7 days.
  2. The Promtail pipeline now parses the tomoda backend's Zap JSON logs and surfaces trace_id for trace-to-log navigation from Grafana / Tempo.

Retention math

| Knob | Value | Why |
| --- | --- | --- |
| loki.persistence.size | 20Gi | See sizing below |
| loki.config.limits_config.retention_period | 168h (7d) | User-chosen window |
| loki.config.compactor.retention_enabled | true | Without this, retention is a best-effort table-manager job |
| loki.config.compactor.retention_delete_delay | 2h | Soft window before old chunks are actually deleted |

Estimated write rate (current cluster):

  • ~10 backend / system pods continuously logging at ~5 lines/sec each
  • ≈ 4.3 M lines/day raw
  • ≈ 1.5 GB/day uncompressed
  • ≈ 150 MB/day after Loki's gzip chunk compression
  • 7 × 150 MB ≈ 1.05 GB for the actual log data
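
The estimate above is plain arithmetic, so it can be reproduced with a short sketch. The 350 bytes per line and the 10:1 compression ratio are assumptions back-derived from the 1.5 GB/day and 150 MB/day figures, not measured values; swap in real numbers from the cluster before relying on the result.

```python
SECONDS_PER_DAY = 86_400


def estimate_retained_gb(pods=10, lines_per_sec=5, bytes_per_line=350,
                         compression_ratio=10, retention_days=7):
    """Return (lines/day, raw GB/day, compressed MB/day, retained GB).

    bytes_per_line and compression_ratio are assumed values, chosen so the
    output lines up with the rounded figures in the list above.
    """
    lines_per_day = pods * lines_per_sec * SECONDS_PER_DAY
    raw_bytes_per_day = lines_per_day * bytes_per_line
    compressed_bytes_per_day = raw_bytes_per_day / compression_ratio
    retained_bytes = compressed_bytes_per_day * retention_days
    return (lines_per_day,
            raw_bytes_per_day / 1e9,
            compressed_bytes_per_day / 1e6,
            retained_bytes / 1e9)


lines, raw_gb, comp_mb, retained_gb = estimate_retained_gb()
print(f"{lines:,} lines/day, {raw_gb:.2f} GB/day raw, "
      f"{comp_mb:.0f} MB/day compressed, {retained_gb:.2f} GB retained")
```

Re-running it with a higher pods or lines_per_sec value gives a quick read on how much of the PVC a new service would consume.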

20 Gi gives ~19× headroom over that ~1.05 GB (20 / 1.05 ≈ 19) for:

  • Traffic spikes (a Traefik log flood during an attack or a debug-level rollout)
  • The chunks cache + index + WAL overhead
  • Future onboarding of additional services without immediate resizing

If you see disk pressure on the Loki PVC, two things to check before bumping size:

  • Compactor is actually running. kubectl logs -n monitoring -l app.kubernetes.io/name=loki | grep compactor. If it's silent for hours, retention isn't being enforced.
  • One pod isn't spamming. Run topk(10, sum by (namespace, pod) (rate({namespace=~".+"}[5m]))) in Grafana to find the loudest pod and either fix it or drop its logs at the Promtail level. (Log streams carry no __name__ label, so the selector matches on namespace instead.)

Promtail pipeline

Two match stages run in order:

Traefik (unchanged behavior)

Selector: {container="traefik"}. Parses the JSON access log and promotes entryPointName, request_Host, status, method, level to labels. This is what powers the Traefik logs Grafana dashboard (grafana.com 13702).

Tomoda backend (new)

Selector: {app=~"tomoda-(api|async)"}. The backend uses Zap with a JSON encoder, so each log line is shaped like:

{
  "level": "info",
  "ts": 1716480000,
  "caller": "main.go:45",
  "msg": "request handled",
  "trace_id": "8a4f5e0e9b1b9c1f1e1d1c1b1a191817",
  "span_id": "0123456789abcdef"
}

The pipeline:

  1. Parses the JSON.
  2. Promotes level to a label. Cardinality is bounded (debug, info, warn, error, fatal, panic), so this is safe and gives free filtering: {app="tomoda-api", level="error"}.
  3. trace_id and span_id stay in the raw log body: not promoted to a label, not lifted into structured metadata. They can still be filtered at query time with LogQL's json parser.

The corresponding match stage:

- match:
    selector: '{app=~"tomoda-(api|async)"}'
    stages:
      - json:
          expressions:
            level: level
      - labels:
          level:

Why not structured_metadata?

Loki 2.9.x (shipped by loki-stack 2.10.x) at schema v11 doesn't support the structured_metadata pipeline stage — that's a Loki 3.x + schema v13 feature. Promoting trace_id to a Loki label would multiply the index by per-request cardinality and is the canonical way to wreck a Loki cluster.

The interim pattern is query-time JSON parsing:

{app="tomoda-api"} | json | trace_id="8a4f5e0e..."

Slightly slower than indexed lookup but correct, and the same query syntax works in Grafana's tracesToLogsV2 link (configured to use this exact filter). When we bump to Loki 3.x, this stage gets a one-line upgrade.
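
For scripts or alert tooling that need the same lookup, the query string can be assembled programmatically. The sketch below is a hypothetical helper (trace_query is not part of any Loki client). It assumes trace IDs follow the W3C / OpenTelemetry format of 32 lowercase hex characters, which matches the sample log line above, and rejects anything else so that arbitrary input cannot be spliced into the LogQL expression.

```python
import re

# Assumption: trace IDs are 32 lowercase hex characters (W3C trace-context format).
TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
# Assumption: app names look like Kubernetes label values.
APP_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def trace_query(app, trace_id):
    """Build the query-time JSON-parse LogQL expression for one trace."""
    if not APP_RE.fullmatch(app):
        raise ValueError(f"unexpected app name: {app!r}")
    if not TRACE_ID_RE.fullmatch(trace_id):
        raise ValueError(f"unexpected trace id: {trace_id!r}")
    return f'{{app="{app}"}} | json | trace_id="{trace_id}"'


print(trace_query("tomoda-api", "8a4f5e0e9b1b9c1f1e1d1c1b1a191817"))
```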

Label strategy — what's safe and what's not

| Field | Promote to label? | Why |
| --- | --- | --- |
| namespace | yes (Promtail does it automatically) | bounded by namespace count |
| app | yes (auto) | bounded by app count |
| level | yes | bounded enum (debug/info/warn/error/fatal/panic) |
| status (Traefik) | yes | bounded HTTP status codes |
| method | yes | bounded HTTP method set |
| trace_id | NO, query-time JSON parse | per-request, unbounded |
| span_id | NO, query-time JSON parse | per-request, unbounded |
| caller | NO, query-time JSON parse | many call sites; not useful as a label |
| user_id (if ever added) | NO, query-time JSON parse | per-user, unbounded |
| request_id | NO, query-time JSON parse | per-request |

Rule of thumb: if a field has fewer than ~100 distinct values cluster-wide, promoting it to a label is OK. Anything per-request, per-user, or per-trace is queried either via | json | <field>="..." (today, on Loki 2.x) or via structured_metadata (after the Loki 3.x bump). It never becomes a label.
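
One way to apply the rule of thumb before touching the Promtail config is to count distinct values per field in a sample of log lines. The sketch below is a rough illustration: label_candidates is a hypothetical helper, the threshold of 100 is the one quoted above, and a sample only gives a lower bound on the true cluster-wide cardinality.

```python
import json
from collections import defaultdict


def label_candidates(lines, max_distinct=100):
    """Map each top-level JSON field to True if it looks safe as a label.

    "Safe" here only means fewer than max_distinct distinct values were
    seen in this sample; the real cardinality may be higher.
    """
    seen = defaultdict(set)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as stack traces
        if not isinstance(record, dict):
            continue
        for field, value in record.items():
            seen[field].add(json.dumps(value, sort_keys=True))
    return {field: len(values) < max_distinct for field, values in seen.items()}


# Synthetic sample: four levels, one unique trace_id per line.
levels = ["debug", "info", "warn", "error"]
sample = ['{"level":"%s","trace_id":"%032x"}' % (levels[i % 4], i) for i in range(500)]
print(label_candidates(sample))
```

On this synthetic sample level comes out as a label candidate and trace_id does not, which mirrors the table above.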

Trace-to-log jump

This is what makes the Tempo integration useful. Clicking "View logs" on a Tempo span automatically runs a query of this form:

{namespace="tomoda"} | json | trace_id="8a4f5e0e9b1b9c1f1e1d1c1b1a191817"

The tracesToLogsV2 config in monitoring/values.yaml wires this up; see tempo.md. The | json stage is required because trace_id isn't a label (see "Why not structured_metadata?" above); it's pulled from the JSON body at query time.

Going the other direction, the Loki data source has a derivedFields rule:

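- name: TraceID
  # Matches compact JSON only (no space after the colon), which is what Zap emits
  matcherRegex: '"trace_id":"(\w+)"'
  # With datasourceUid set, url is used as the Tempo query.
  # In a Grafana provisioning file, write '$${__value.raw}' so the $ is not treated as an env var.
  url: '${__value.raw}'
  datasourceUid: tempo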

Any log line containing "trace_id":"..." gets a clickable link in Grafana that opens the matching trace in Tempo.
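
The matcher can be sanity-checked outside Grafana. The sketch below runs the same pattern through Python's re module. Grafana evaluates the regex in JavaScript, but for a pattern this simple the two engines should agree (an assumption worth confirming in Grafana's Explore view). extract_trace_id is a hypothetical helper name.

```python
import re

# Same pattern as matcherRegex above.
TRACE_ID_PATTERN = re.compile(r'"trace_id":"(\w+)"')


def extract_trace_id(line):
    """Return the captured trace ID, or None if the line has no match."""
    match = TRACE_ID_PATTERN.search(line)
    return match.group(1) if match else None


compact = '{"level":"info","msg":"request handled","trace_id":"8a4f5e0e9b1b9c1f1e1d1c1b1a191817"}'
spaced = '{"level": "info", "trace_id": "8a4f5e0e9b1b9c1f1e1d1c1b1a191817"}'

print(extract_trace_id(compact))  # compact encoding: the ID is captured
print(extract_trace_id(spaced))   # space after the colon: no match
```

The second case is the one to watch for: a service that pretty-prints its JSON, or uses a different key name, will not get the link.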

Debugging the pipeline

# Is Promtail seeing tomoda pods?
kubectl logs -n monitoring -l app.kubernetes.io/name=promtail | grep "tomoda" | head -20

# List the label names Loki has indexed (level should be among them)
kubectl port-forward -n monitoring svc/loki 3100:3100 &   # run in the background so the shell stays usable
curl -s -G 'http://localhost:3100/loki/api/v1/labels' | jq

# Sample query for a known trace (| json is required because trace_id is not a label)
curl -s -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query={app="tomoda-api"} | json | trace_id="abc..."' \
  --data-urlencode "start=$(date -d '1 hour ago' +%s)000000000" | jq '.data.result[0]'   # date -d needs GNU date

If a tomoda log line shows up in Grafana but level is missing as a label, it means the JSON parse failed — usually because the line isn't JSON (e.g. a panic stack trace, or a third-party library logging with a different format). The pipeline doesn't drop those — they're just queryable by {app="tomoda-api"} without structured filtering.
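
Where GNU date is not available (macOS, minimal containers), the query_range URL from the sample query above can be built in Python instead. This sketch only constructs the URL and sends nothing. query_range_url is a hypothetical helper, and the base URL assumes the port-forward shown above is active.

```python
import time
from urllib.parse import urlencode


def query_range_url(base_url, logql, lookback_seconds=3600, now=None):
    """Build a /loki/api/v1/query_range URL with a nanosecond start timestamp."""
    now = time.time() if now is None else now
    start_ns = int(now - lookback_seconds) * 1_000_000_000
    params = {"query": logql, "start": str(start_ns)}
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


url = query_range_url(
    "http://localhost:3100",  # assumes the port-forward above is running
    '{app="tomoda-api"} | json | trace_id="8a4f5e0e9b1b9c1f1e1d1c1b1a191817"',
)
print(url)
```

The printed URL can be passed to curl -s or opened with urllib.request.urlopen, then piped through jq as before.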