Loki: retention and Promtail pipeline
The deployment is documented in kubernetes/system/loki.md. This page covers two things that changed alongside the Tempo + backend instrumentation rollout:
- The PVC was sized up to 20 Gi and retention pinned to 7 days.
- The Promtail pipeline now parses the tomoda backend's Zap JSON logs and surfaces trace_id for trace-to-log navigation from Grafana / Tempo.
Retention math
| Knob | Value | Why |
|---|---|---|
| loki.persistence.size | 20Gi | See sizing below |
| loki.config.limits_config.retention_period | 168h (7d) | User-chosen window |
| loki.config.compactor.retention_enabled | true | Without this, retention is a best-effort table-manager job |
| loki.config.compactor.retention_delete_delay | 2h | Soft window before old chunks are actually deleted |
Estimated write rate (current cluster):
- ~10 backend / system pods continuously logging at ~5 lines/sec each
- ≈ 4.3 M lines/day raw
- ≈ 1.5 GB/day uncompressed
- ≈ 150 MB/day after Loki's gzip chunk compression
- 7 × 150 MB ≈ 1.05 GB for the actual log data
20 Gi gives roughly 20× headroom over that ~1.05 GB for:
- Traffic spikes (a Traefik log flood during an attack or a debug-level rollout)
- The chunks cache + index + WAL overhead
- Future onboarding of additional services without immediate resizing
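The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not a measurement: the ~350 bytes per line and the 10:1 compression ratio are assumptions back-derived from the 1.5 GB/day and 150 MB/day figures.

```python
# Back-of-the-envelope Loki sizing. All inputs are estimates, not measurements.
PODS = 10                 # pods logging continuously
LINES_PER_SEC = 5         # per pod
BYTES_PER_LINE = 350      # assumption: ~1.5 GB/day divided by ~4.3 M lines/day
COMPRESSION_RATIO = 10    # assumption: 1.5 GB/day raw -> ~150 MB/day stored
RETENTION_DAYS = 7        # retention_period = 168h
PVC_BYTES = 20 * 2**30    # loki.persistence.size = 20Gi

lines_per_day = PODS * LINES_PER_SEC * 86_400
raw_bytes_per_day = lines_per_day * BYTES_PER_LINE
stored_bytes_per_day = raw_bytes_per_day / COMPRESSION_RATIO
retained_bytes = stored_bytes_per_day * RETENTION_DAYS
headroom = PVC_BYTES / retained_bytes

print(f"lines/day:     {lines_per_day:,}")
print(f"raw/day:       {raw_bytes_per_day / 1e9:.2f} GB")
print(f"stored/day:    {stored_bytes_per_day / 1e6:.0f} MB")
print(f"retained (7d): {retained_bytes / 1e9:.2f} GB")
print(f"headroom:      {headroom:.1f}x")
```

Raising PODS or LINES_PER_SEC shows how quickly the headroom shrinks when a chatty service is onboarded.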
If you see disk pressure on the Loki PVC, two things to check before bumping size:
- Compactor is actually running. Run kubectl logs -n monitoring -l app.kubernetes.io/name=loki --tail=1000 | grep compactor (with a label selector, kubectl logs shows only the last 10 lines per pod unless --tail is set). If it's silent for hours, retention isn't being enforced.
- One pod isn't spamming. Run topk(10, sum by (namespace, pod) (rate({namespace=~".+"}[5m]))) in Grafana to find the loudest pod and either fix it or drop its logs at the Promtail level.
Promtail pipeline
Two match stages run in order:
Traefik (unchanged behavior)
Selector: {container="traefik"}. Parses the JSON access log and promotes entryPointName, request_Host, status, method, level to labels. This is what powers the Traefik logs Grafana dashboard (grafana.com 13702).
Tomoda backend (new)
Selector: {app=~"tomoda-(api|async)"}. The backend uses Zap with a JSON encoder, so each log line is shaped like:
{
  "level": "info",
  "ts": 1716480000,
  "caller": "main.go:45",
  "msg": "request handled",
  "trace_id": "8a4f5e0e9b1b9c1f1e1d1c1b1a191817",
  "span_id": "0123456789abcdef"
}
The pipeline:
- Parses the JSON.
- Promotes level to a label. Cardinality is bounded (debug, info, warn, error, fatal, panic), so this is safe and gives free filtering: {app="tomoda-api", level="error"}.
- trace_id and span_id stay in the raw log body. They are not promoted to a label and not lifted into structured metadata. They're queryable at query time via LogQL's json stage.
- match:
    selector: '{app=~"tomoda-(api|async)"}'
    stages:
      - json:
          expressions:
            level: level
      - labels:
          level:
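To make the effect of these two stages concrete, here is a rough Python simulation of what they do to a single line. extract_labels is a hypothetical helper written for illustration, not Promtail code, and the namespace value in the base label set is an assumption.

```python
import json

def extract_labels(line: str, stream_labels: dict) -> dict:
    """Mimic the json + labels stages: promote only `level`; the line body is untouched."""
    labels = dict(stream_labels)
    try:
        parsed = json.loads(line)
    except json.JSONDecodeError:
        return labels  # non-JSON line: still shipped, just without a level label
    if isinstance(parsed, dict) and isinstance(parsed.get("level"), str):
        labels["level"] = parsed["level"]
    return labels

zap_line = (
    '{"level":"info","ts":1716480000,"caller":"main.go:45","msg":"request handled",'
    '"trace_id":"8a4f5e0e9b1b9c1f1e1d1c1b1a191817","span_id":"0123456789abcdef"}'
)
panic_line = "panic: runtime error: invalid memory address or nil pointer dereference"

base = {"app": "tomoda-api", "namespace": "tomoda"}  # illustrative stream labels
print(extract_labels(zap_line, base))    # gains level, nothing else
print(extract_labels(panic_line, base))  # unchanged label set
```

Only level joins the label set. trace_id and span_id remain in the line body, so the number of streams stays bounded by the number of log levels per app.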
Why not structured_metadata?
Loki 2.9.x (shipped by loki-stack 2.10.x) at schema v11 doesn't support the structured_metadata pipeline stage — that's a Loki 3.x + schema v13 feature. Promoting trace_id to a Loki label would multiply the index by per-request cardinality and is the canonical way to wreck a Loki cluster.
The interim pattern is query-time JSON parsing:
{app="tomoda-api"} | json | trace_id="8a4f5e0e..."
Slightly slower than indexed lookup but correct, and the same query syntax works in Grafana's tracesToLogsV2 link (configured to use this exact filter). When we bump to Loki 3.x, this stage gets a one-line upgrade.
Label strategy: what's safe and what's not
| Field | Promote to label? | Why |
|---|---|---|
| namespace | yes (Promtail does it automatically) | bounded by namespace count |
| app | yes (auto) | bounded by app count |
| level | yes | bounded enum (debug/info/warn/error/fatal/panic) |
| status (Traefik) | yes | bounded HTTP status codes |
| method | yes | bounded HTTP method set |
| trace_id | NO (query-time JSON parse) | per-request, unbounded |
| span_id | NO (query-time JSON parse) | per-request, unbounded |
| caller | NO (query-time JSON parse) | many call sites; not useful as a label |
| user_id (if ever added) | NO (query-time JSON parse) | per-user, unbounded |
| request_id | NO (query-time JSON parse) | per-request |
Rule of thumb: if a field has fewer than ~100 distinct values cluster-wide, label is OK. Anything per-request, per-user, or per-trace either gets queried via | json | <field>="..." (today, on Loki 2.x) or via structured_metadata (after the Loki 3.x bump). It never becomes a label.
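One way to apply the rule of thumb before touching the pipeline is to count distinct values per field over a sample of real log lines. The sketch below does that on synthetic data; field_cardinality and label_candidates are hypothetical helpers, and the 100-value threshold is taken from the rule above.

```python
import json
from collections import defaultdict

LABEL_CARDINALITY_LIMIT = 100  # the "~100 distinct values" rule of thumb

def field_cardinality(lines):
    """Count distinct values per top-level JSON field across sample log lines."""
    seen = defaultdict(set)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as stack traces
        if not isinstance(record, dict):
            continue
        for key, value in record.items():
            seen[key].add(json.dumps(value, sort_keys=True))
    return {key: len(values) for key, values in seen.items()}

def label_candidates(lines, limit=LABEL_CARDINALITY_LIMIT):
    """Fields whose observed cardinality stays under the limit."""
    return sorted(k for k, n in field_cardinality(lines).items() if n < limit)

# Synthetic sample: 500 requests, each with its own trace_id, cycling through 3 levels.
levels = ["info", "warn", "error"]
sample = [
    json.dumps({"level": levels[i % 3], "msg": "request handled", "trace_id": f"{i:032x}"})
    for i in range(500)
]
print(field_cardinality(sample))
print(label_candidates(sample))
```

Low observed cardinality is necessary, not sufficient. Here msg passes only because the synthetic sample uses a single message, so the check should run over a representative window of production logs before anything is promoted.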
Trace-to-log jump
This is what makes the Tempo integration useful. From a Tempo span:
{namespace="tomoda"} | json | trace_id="8a4f5e0e9b1b9c1f1e1d1c1b1a191817"
is run automatically when you click "View logs" on a span. The tracesToLogsV2 config in monitoring/values.yaml wires this up; see tempo.md. The | json stage is required because trace_id isn't a label (see the label strategy above); it's pulled from the JSON body at query time.
Going the other direction, the Loki data source has a derivedFields rule:
- name: TraceID
  matcherRegex: '"trace_id":"(\w+)"'
  url: '${__value.raw}'
  datasourceUid: tempo
Any log line containing "trace_id":"..." gets a clickable link in Grafana that opens the matching trace in Tempo.
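The matcher is whitespace-sensitive, which is worth checking against a real log line. The sketch below uses Python's re module as a stand-in for the JavaScript regex engine Grafana runs; for this simple pattern the two are expected to behave the same. The spaced variant is a hypothetical line from a logger that pretty-prints its JSON.

```python
import re

# Same pattern as matcherRegex in the derivedFields rule.
TRACE_ID_RE = re.compile(r'"trace_id":"(\w+)"')

# Compact form: no space after the colon.
compact = '{"level":"info","msg":"request handled","trace_id":"8a4f5e0e9b1b9c1f1e1d1c1b1a191817"}'
# Hypothetical pretty-printed form: a space after each colon.
spaced = '{"level": "info", "msg": "request handled", "trace_id": "8a4f5e0e9b1b9c1f1e1d1c1b1a191817"}'

for name, line in [("compact", compact), ("spaced", spaced)]:
    match = TRACE_ID_RE.search(line)
    print(name, "->", match.group(1) if match else None)
```

The compact line yields the trace ID and the spaced one yields no match. If a service ever logs with a space after the colon, its lines will not get a Tempo link unless the pattern is loosened, for example with \s* after the colon.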
Debugging the pipeline
# Is Promtail seeing tomoda pods?
kubectl logs -n monitoring -l app.kubernetes.io/name=promtail --tail=1000 | grep "tomoda" | head -20
# Port-forward Loki in the background, then list the label names it has indexed
kubectl port-forward -n monitoring svc/loki 3100:3100 &
curl -s -G 'http://localhost:3100/loki/api/v1/labels' | jq
# Sample query for a known trace; the json stage is needed to filter on trace_id (date -d is GNU date syntax)
curl -s -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query={app="tomoda-api"} | json | trace_id="abc..."' \
  --data-urlencode "start=$(date -d '1 hour ago' +%s)000000000" | jq '.data.result[0]'
If a tomoda log line shows up in Grafana but level is missing as a label, it means the JSON parse failed — usually because the line isn't JSON (e.g. a panic stack trace, or a third-party library logging with a different format). The pipeline doesn't drop those — they're just queryable by {app="tomoda-api"} without structured filtering.
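When scripting these checks, the query_range request can be assembled programmatically. The sketch below only builds the URL and sends nothing; trace_query_url is a hypothetical helper, and LOKI_URL assumes a local port-forward to the Loki service on port 3100.

```python
import time
from urllib.parse import urlencode

LOKI_URL = "http://localhost:3100"  # assumption: local port-forward to svc/loki

def trace_query_url(app: str, trace_id: str, lookback_s: int = 3600, now_s=None) -> str:
    """Build a query_range URL that filters one app's logs by trace_id at query time."""
    now_s = int(time.time()) if now_s is None else int(now_s)
    params = {
        # trace_id is not a label, so the json stage must come before the filter
        "query": f'{{app="{app}"}} | json | trace_id="{trace_id}"',
        "start": str((now_s - lookback_s) * 1_000_000_000),  # nanoseconds since epoch
        "end": str(now_s * 1_000_000_000),
    }
    return f"{LOKI_URL}/loki/api/v1/query_range?{urlencode(params)}"

url = trace_query_url("tomoda-api", "8a4f5e0e9b1b9c1f1e1d1c1b1a191817", now_s=1716480000)
print(url)
```

The resulting URL can be passed to curl or any HTTP client. Fixing now_s makes the output reproducible, which helps when comparing results across runs.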