Technical Decisions

This page is an ADR-lite: short, opinionated notes on the load-bearing choices that shape the codebase. Each entry follows the same shape: Decision, Context, Rationale, Trade-offs.

Vertical domain slices

Decision. The backend is organized as vertical slices under backend/internal/services/<domain>/. Each slice owns its full stack in one package: store.go (persistence), service.go (business logic), handler.go (HTTP), routes.go (route mounting), plus wire.go/deps.go and colocated *_test.go. Multi-aggregate domains prefix files and types by aggregate (event_service.go / EventService). Constructors are New<Thing>; the object graph is composed in backend/internal/wiring. See Domains.

Context. A feature (events, chat, plans) touches persistence, business rules, and an HTTP surface together. Grouping code by technical layer spreads one feature across four sibling trees, so a single change fans out and the blast radius of a package is unclear.

Rationale. A slice is the unit of ownership: everything a feature needs sits in one directory, and the package boundary is the domain boundary. Response shapes live in the slice that serves them, so small cross-cutting briefs (UserBrief, EventBrief) are duplicated per domain rather than shared through a central package. Cross-domain needs are expressed as narrow interfaces (ports) declared in the consumer's deps.go and implemented by the provider, which keeps import edges one-way and mockable.
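
The consumer-declared port can be sketched as follows. This is a minimal illustration, not the project's code: UserLookup, BriefByID, HostLabel, and the in-memory fakeUsers provider are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// UserBrief is this slice's own copy of the small response shape
// (duplicated per domain rather than imported from a shared package).
type UserBrief struct {
	ID   string
	Name string
}

// UserLookup is the port. It is declared by the consumer (it would sit in
// the consumer's deps.go) and lists only what this slice needs.
type UserLookup interface {
	BriefByID(id string) (UserBrief, error)
}

// EventService depends on the port, not on the provider's package.
type EventService struct {
	users UserLookup
}

func NewEventService(users UserLookup) *EventService {
	return &EventService{users: users}
}

// HostLabel performs a cross-domain read through the port.
func (s *EventService) HostLabel(hostID string) (string, error) {
	u, err := s.users.BriefByID(hostID)
	if err != nil {
		return "", fmt.Errorf("lookup host %s: %w", hostID, err)
	}
	return "Hosted by " + u.Name, nil
}

// fakeUsers is an in-memory stand-in for the provider. Any type with a
// matching method satisfies the port, which is what makes it mockable.
type fakeUsers map[string]UserBrief

func (f fakeUsers) BriefByID(id string) (UserBrief, error) {
	u, ok := f[id]
	if !ok {
		return UserBrief{}, errors.New("user not found")
	}
	return u, nil
}

func main() {
	svc := NewEventService(fakeUsers{"u1": {ID: "u1", Name: "Aiko"}})
	label, err := svc.HostLabel("u1")
	fmt.Println(label, err)
}
```

Because the interface lives with the consumer, the provider never imports the consumer's package, and a test can swap in a map-backed fake without touching the real store.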

Trade-offs. A little duplication (the briefs) in exchange for decoupled packages. Genuinely shared types still live in backend/internal/models; the discipline is to not promote a type there just because two slices reference it.

Wire for dependency injection

Decision. Use Google's Wire to assemble the backend's object graph at compile time. The provider set lives in backend/internal/wiring/providers.go; InitializeApp is generated and called once from main.go.

Context. A typical request touches a handler, a service, one or more repositories, the Redis client, the WebSocket Hub, the Asynq client, and configuration. Hand-wiring this in main becomes unreadable; a runtime container hides errors until startup.

Rationale. Wire validates the entire graph at go generate time. Missing or ambiguous dependencies fail generation before the binary is ever built, and the generated wiring adds no runtime overhead. The output is plain Go that a human can read.

Trade-offs. Requires running go generate after provider changes. The generated file is committed and reviewed like any other code. Constructor signatures become the public contract — refactors ripple, which is by design.

sqlc + pgx + Postgres + PostGIS

Decision. Persist all durable state in a single PostGIS-enabled Postgres database, accessed through sqlc-generated, type-safe queries over pgx/v5 behind a per-domain Store (the persistence layer of each vertical slice under backend/internal/services/<domain>/). There is no ORM. Each domain's generated query code lives in backend/internal/services/<domain>/<domain>db/, and the Store is the only layer that runs it, mapping pgx rows to *models.X via internal/database/pgconv. The schema is defined by goose migrations in backend/db/migrations/; backend/db/schema/ is a generated per-domain snapshot that sqlc targets.

Context. The product is geospatial (events, moments, locations) and relational (users, friendships, participants). A single ACID store keeps reasoning simple while leaving room to scale reads. Queries need to be fast on the hot paths and hard to get wrong at the type level.

Rationale. PostGIS handles point geometry and proximity queries natively. sqlc compiles hand-written SQL into type-safe Go at codegen time, so column and parameter mismatches are compile errors rather than runtime surprises, with none of the reflection overhead of an ORM query path. pgx/v5 is the driver. Goose migrations are the schema source of truth and run at boot; db.Migrate (which wraps goose ApplySchema plus idempotent RunManual post-steps) keeps a fresh database current. The per-domain Store keeps service and handler code free of pgtype and SQL, returning plain *models.X structs.

Trade-offs. Schema changes are a multi-step flow: write the goose migration, regenerate the db/schema snapshot, edit the domain's queries.sql, and regenerate the <domain>db package, all committed together with a drift guard in CI. In exchange, the query layer is type-checked end to end and free of ORM reflection cost. Logic is concentrated in services rather than the database, so there are no triggers carrying business meaning. Raw SQL is limited to a few PostGIS queries, since pgx cannot decode the geometry column: the Store reads scalar lat/lng and writes geometry via ST_SetSRID(ST_MakePoint(...)).

Asynq for background work

Decision. Use Asynq on Redis for all background processing. Three queues — critical (6), default (3), low (1) — are configured in backend/internal/async/server.go.
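
The numbers in parentheses are relative weights. Assuming Asynq's default (non-strict) weighted priority, each queue's share of worker picks while all queues have work is its weight divided by the sum of weights. The small calculation below shows the resulting split; queueShare is a helper written for this example and is not part of Asynq.

```go
package main

import "fmt"

// queueShare returns each queue's expected share of worker picks under
// weighted priority: its weight divided by the sum of all weights.
func queueShare(weights map[string]int) map[string]float64 {
	total := 0
	for _, w := range weights {
		total += w
	}
	shares := make(map[string]float64, len(weights))
	if total == 0 {
		return shares
	}
	for name, w := range weights {
		shares[name] = float64(w) / float64(total)
	}
	return shares
}

func main() {
	shares := queueShare(map[string]int{"critical": 6, "default": 3, "low": 1})
	for _, q := range []string{"critical", "default", "low"} {
		fmt.Printf("%-8s %.0f%%\n", q, shares[q]*100)
	}
}
```

With weights 6, 3, and 1 the split is 60%, 30%, and 10%, so low-priority work still makes progress instead of starving behind a busy critical queue.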

Context. Email sends, webhook fan-out, cleanup sweeps, status transitions, and notification dispatch must not block HTTP handlers, and must survive process restart.

Rationale. Asynq gives us retries with exponential backoff, dead-letter queues, weighted priorities, scheduled jobs, and a usable web UI — all on top of Redis we already operate. The scheduler (backend/internal/async/scheduler.go) enqueues recurring jobs (status updates every 5 min, message expiry every 30 s, daily purges, etc.) so the worker is the single execution point.

Trade-offs. Couples background processing to Redis availability. Tasks must be idempotent — Asynq guarantees at-least-once. Stateful long-running work (>~30 s) belongs elsewhere; Asynq is for short, retryable units.

WebSocket Hub with Redis pub/sub fanout

Decision. Each pod runs two in-process WebSocket hubs: SessionHub (per-chat map[roomID]→Room) and ClientHub (per-user connections), each with a Redis-backed PSUBSCRIBE (session:* and client:*) for cross-pod fanout. Both live in the platform/ws bundle (backend/internal/platform/ws/session_hub.go, client_hub.go).

Context. Chat in Tomoda is per-room: each chat has a single room, and the cardinality of concurrent rooms is moderate. The per-user ClientHub carries notifications, presence, and multi-device sync on its own channel. The backend runs with multiple replicas (see Horizontal scaling).

Rationale. Each WS connection terminates at the pod it dialled, so some per-pod state is unavoidable. Redis pub/sub is the cheapest way to bridge those pods: one PUBLISH per broadcast, one PSUBSCRIBE per pod, no ingress session affinity, no separate broker to operate. An origin pod-ID tag in the wire envelope drops the echo that would otherwise round-trip back to the sender.
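
The echo drop can be sketched as below. The Envelope fields, the encode helper, and shouldDeliver are hypothetical; the real wire format lives in platform/ws and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wire format for cross-pod fanout. OriginPod
// carries the ID of the pod that published the frame.
type Envelope struct {
	OriginPod string          `json:"origin_pod"`
	Channel   string          `json:"channel"`
	Payload   json.RawMessage `json:"payload"`
}

// encode serializes an envelope the way a publisher would before PUBLISH.
func encode(env Envelope) []byte {
	raw, err := json.Marshal(env)
	if err != nil {
		panic(err) // only reachable with an invalid RawMessage payload
	}
	return raw
}

// shouldDeliver decodes a frame received from the subscription and reports
// whether it should be broadcast to local sockets. Frames this pod
// published were already delivered locally, so they are dropped.
func shouldDeliver(raw []byte, selfPod string) (Envelope, bool) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Envelope{}, false // malformed frame: drop it
	}
	if env.OriginPod == selfPod {
		return env, false // our own echo
	}
	return env, true
}

func main() {
	frame := encode(Envelope{
		OriginPod: "pod-a",
		Channel:   "session:room-42",
		Payload:   json.RawMessage(`{"text":"hi"}`),
	})
	_, onA := shouldDeliver(frame, "pod-a")
	_, onB := shouldDeliver(frame, "pod-b")
	fmt.Println(onA, onB) // false true
}
```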

Trade-offs. Best-effort delivery (at-most-once) across pods — Redis pub/sub doesn't persist or retry. We accept this because every chat message is also persisted via ChatService.SendMessage, so a missed pub/sub frame is recoverable on reconnect. Every pod's subscriber receives every event's traffic (one cheap channel-name match per inbound message); if a single Redis instance ever becomes the bottleneck, the next step is to shard the channel namespace, not change the data model. See Real-time.

Single image, multiple modes

Decision. Build one backend image. Select what to start at runtime via --mode (or SERVER_MODE env): full, multi-hub, api-hub, ws-hub, or async. See backend/cmd/server/main.go.

Context. Once horizontal scaling was on the table we needed separate scaling profiles for the API surface (CPU/RPS) and the async worker pool (queue depth) — and didn't want to maintain two Dockerfiles or two CI pipelines for it.

Rationale. Modes are cheap: one flag, a few booleans, a handful of if guards around Hub.Run, WorkerServer.Start, SchedulerManager.Start, and route registration. /health is always served so k8s probes work in every mode, including async.

Trade-offs. The Wire-built dependency graph is still wired in full on every startup — unused components are constructed but never started. This costs a few hundred milliseconds of init and a slightly larger heap, in exchange for not splitting the binary or DI graph along mode lines. Reserved modes (api-hub, ws-hub) ship now even though no deployment uses them yet, so the day we split WS out we don't need a code change.

PostGIS-only spatial indexing (H3 retired)

Decision. Spatial dedup and viewport clustering run entirely on the PostGIS geography indexes; the shared Spatial mixin adds denormalized lat/lng float columns for reads that don't need PostGIS. No H3 cell columns exist.

Context. Earlier revisions stored an Uber H3 cell per Location (resolution 12, ~5 m) for dedup, and a map prototype added coarser r6/r8 parent cells for cell-keyed clustering endpoints. The frontend map settled on bbox viewport queries (/discovery/map), so the cell-keyed endpoints never gained a client and the columns had no readers.

Rationale. With no reader, the H3 columns were write-only cost: a cgo dependency (h3-go), extra indexes to keep aligned with the geometry, and a second spatial vocabulary beside PostGIS. ST_DWithin on the GiST geography indexes already serves dedup ("is this lat/lng near an existing place?") with per-category radii, and bbox queries serve clustering at every zoom the app uses.

Trade-offs. Constant-time cell-equality lookups are gone; anything that would have bucketed by fixed-resolution cells now pays a radius or bbox query. If a future surface needs cell bucketing, the columns can be recomputed from the stored geometry.

Photon first, Google Places as fallback

Decision. Use self-hosted Photon (OSM-backed) as the primary geocoder, with Google Places available as an optional secondary source. The dev stack runs Photon on port 2322 (docker-compose.dev.yml).

Context. Geocoding and autocomplete are hit on nearly every interactive map screen. Google Places is excellent but billed per-request; coverage and quality vary by region.

Rationale. Self-hosting Photon caps the marginal cost at zero and gives us latency control. For regions where OSM coverage is thin we still fall through to Google. The split keeps the bill bounded while preserving result quality where it matters.

Trade-offs. Operational ownership: Photon must be deployed, refreshed, and monitored. OSM data is not as fresh or complete as Google's commercial dataset for point-of-interest discovery; we accept that on the autocomplete path.

JWT + refresh tokens

Decision. Authenticate stateless requests with a 24-hour JWT (HS256), paired with a server-stored refresh token mirrored into Redis. See Authentication (system view) and Backend → Auth service (implementation).

Context. Clients are mobile-first and need to survive long offline windows. Session lookup on every request would push extra load onto Postgres or Redis at high QPS.

Rationale. A short-lived JWT lets the API authorise from the bearer alone — no per-request DB hit on the happy path. The refresh token gives us revocation: deleting the row + Redis key invalidates the chain on the next refresh. Sessions and login history still live in Postgres for user-facing controls.
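
To make the "no per-request DB hit" point concrete, here is a stdlib-only sketch of HS256 signing and verification. It is illustrative rather than the Auth service's implementation: the Claims fields and the sign, mac, and verify helpers are assumptions, and production code would normally use a vetted JWT library.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a hypothetical minimal claim set: subject and expiry.
type Claims struct {
	Sub string `json:"sub"`
	Exp int64  `json:"exp"` // unix seconds
}

// mac computes the base64url HMAC-SHA256 of the signing input.
func mac(input string, secret []byte) string {
	h := hmac.New(sha256.New, secret)
	h.Write([]byte(input))
	return base64.RawURLEncoding.EncodeToString(h.Sum(nil))
}

// sign builds header.payload.signature for the given claims.
func sign(c Claims, secret []byte) (string, error) {
	header := base64.RawURLEncoding.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	body, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	input := header + "." + base64.RawURLEncoding.EncodeToString(body)
	return input + "." + mac(input, secret), nil
}

// verify checks signature and expiry using only the token and the secret.
// The header's alg is never trusted: HS256 is always recomputed.
func verify(token string, secret []byte, nowUnix int64) (Claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return Claims{}, errors.New("malformed token")
	}
	want := mac(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return Claims{}, errors.New("bad signature")
	}
	body, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return Claims{}, err
	}
	var c Claims
	if err := json.Unmarshal(body, &c); err != nil {
		return Claims{}, err
	}
	if nowUnix >= c.Exp {
		return Claims{}, errors.New("token expired")
	}
	return c, nil
}

func main() {
	secret := []byte("dev-only-secret")
	exp := time.Now().Add(24 * time.Hour).Unix()
	tok, err := sign(Claims{Sub: "user-1", Exp: exp}, secret)
	if err != nil {
		panic(err)
	}
	c, err := verify(tok, secret, time.Now().Unix())
	fmt.Println(c.Sub, err)
}
```

Everything verify needs is in the bearer token and process memory, which is why revocation has to ride on the refresh path instead.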

Trade-offs. JWTs cannot be invalidated mid-lifetime — a compromised access token is valid until expiry. We mitigate by keeping expiry to 24 h, supporting forced refresh, and offering per-session revocation. The system is more complex than pure server sessions; we have judged the latency win worth it.

Expo Router (file-based, single codebase)

Decision. Ship iOS, Android, and the web from one Expo SDK 55 codebase with Expo Router for navigation. Routes live under frontend/app/.

Context. The product surface is identical across platforms — a small team cannot afford three navigators or three feature implementations.

Rationale. File-based routing matches the way the team mentally groups screens ((moments), (social), auth/, home/, etc.) and removes a class of imperative navigation bugs. React 19 + react-native-web gives us a usable web build of the same screens.

Trade-offs. A few platform-specific concerns (passkey UX, native sign-in flows, map providers) require Platform.OS branching. Some libraries lag Expo SDK releases.

Context-only state

Decision. Manage app-wide state with React Context. Nine providers live in frontend/contexts/ — Auth, Theme, Friends, Location, CreateEvent, PageHeader, Toast, Sheet, MapDock. No Redux, Zustand, or React Query.

Context. Most cross-screen state is small (auth user, current location, ephemeral UI), and React Query's caching layer was deemed unnecessary against a small set of imperative fetches.

Rationale. Keeps the bundle small, the learning curve flat, and the data flow easy to grep. Providers are composed at the root layout (app/_layout.tsx).

Trade-offs. Re-render scope is the developer's responsibility — Contexts are memoised manually with useMemo and useCallback. There is no built-in cache for server data; each service handles its own fetch and refresh.

Fetch (no axios)

Decision. Use the platform fetch API directly, wrapped in frontend/utils/api.ts (handleResponse) and frontend/services/*.ts thin per-domain wrappers.

Context. Adding axios introduces a non-trivial dependency, a parallel error model, and an interceptor system that ends up duplicating logic that React Native/web already provide.

Rationale. fetch is universal across iOS, Android, and the web. The shared handleResponse centralises the bits we actually need: 401 → session-expired event, JSON parsing, error-message extraction. Token caching lives in frontend/utils/tokenManager.ts.

Trade-offs. No request/response interceptors as a first-class concept; we hand-roll them. No built-in cancellation tokens; we rely on AbortController where it matters.

YAML + env layering

Decision. Backend config is loaded from layered YAML — config.yaml (base) plus config.<env>.yaml (overrides) — with environment variables and GCP Secret Manager filling in secrets via scripts/pull-secrets.sh.

Context. A pure .env approach scales poorly when configuration grows nested (Stripe price IDs, S3 endpoints, WebAuthn RP settings). Mixed-shape config also makes typed loading awkward.

Rationale. YAML expresses nested config naturally and maps cleanly onto the Go config structs in backend/config/config.go. Environment-specific files keep production overrides reviewable. Secrets stay out of files entirely — they are injected at runtime from Secret Manager.
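
The layering itself amounts to a deep merge in which the environment file wins key by key. The sketch below works on already-parsed maps so it stays stdlib-only; the merge helper and the keys shown are hypothetical, and the real loader in backend/config may behave differently.

```go
package main

import "fmt"

// merge returns base overlaid with override. Nested maps are merged
// recursively; any other value in override replaces the base value.
// Neither input map is modified.
func merge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		if ov, ok := v.(map[string]any); ok {
			if bv, ok := out[k].(map[string]any); ok {
				out[k] = merge(bv, ov)
				continue
			}
		}
		out[k] = v
	}
	return out
}

func main() {
	// Stand-in for a parsed config.yaml (hypothetical keys).
	base := map[string]any{
		"stripe": map[string]any{"price_id": "price_base", "currency": "usd"},
		"port":   8080,
	}
	// Stand-in for a parsed config.<env>.yaml.
	prod := map[string]any{
		"stripe": map[string]any{"price_id": "price_prod"},
	}
	fmt.Println(merge(base, prod))
}
```

Only the overridden leaf changes; sibling keys under the same nested section keep their base values, which is what keeps an environment file short enough to review.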

Trade-offs. Two layers (file + env) to mentally merge. Local developers need a config.local.yaml for ergonomics. The clarity is worth the cost of one additional load path.

Centralized access policies in one package

Decision. Authorization decisions live in backend/internal/access/. Handlers mount pre-built named policies (access.Require(access.TomodaAdmin), access.RequirePartnerAdmin(repo)) rather than constructing role checks inline or scattering middleware across the codebase.

Context. The first iteration of admin gating was a single AdminAuth middleware that hardcoded Role == "admin". Adding curator, operator, support, and auditor roles would have meant one more bespoke middleware per role, each with the same shape but slightly different checks. Partner scoping would have introduced yet another shape entirely.

Rationale. Two layers, one package: claim-based Policy for global (account-type, role) pairs that ride on the JWT; membership-based RequirePartner* for per-tenant gating. The named set is small (~5 Tomoda policies + 3 partner policies) so the entire authorization surface is auditable in two files. JWT claims carry account_type so policy checks are DB-free for global routes.
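
A claim-based policy can be sketched as plain data plus one check. The types below are a simplified assumption: the real access package mounts HTTP middleware, and the claim fields, policy contents, and the TomodaSupport policy are invented for the example.

```go
package main

import "fmt"

// Claims holds the fields a policy reads from the JWT (hypothetical shape).
type Claims struct {
	AccountType string
	Role        string
}

// Policy is a named (account type, allowed roles) rule.
type Policy struct {
	Name        string
	AccountType string
	Roles       []string
}

// Allows evaluates the policy from claims alone, with no database access.
func (p Policy) Allows(c Claims) bool {
	if c.AccountType != p.AccountType {
		return false
	}
	for _, r := range p.Roles {
		if r == c.Role {
			return true
		}
	}
	return false
}

// The named set: every authorization rule is declared in one place.
var (
	TomodaAdmin   = Policy{Name: "tomoda_admin", AccountType: "tomoda", Roles: []string{"admin"}}
	TomodaSupport = Policy{Name: "tomoda_support", AccountType: "tomoda", Roles: []string{"admin", "support"}}
)

// Require turns a policy into a check a handler chain could mount.
// It is simplified here to a function over claims instead of middleware.
func Require(p Policy) func(Claims) error {
	return func(c Claims) error {
		if !p.Allows(c) {
			return fmt.Errorf("policy %s: forbidden", p.Name)
		}
		return nil
	}
}

func main() {
	check := Require(TomodaAdmin)
	fmt.Println(check(Claims{AccountType: "tomoda", Role: "admin"}))
	fmt.Println(check(Claims{AccountType: "tomoda", Role: "support"}))
}
```

Adding a role becomes a new entry in the named set, not a new middleware.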

Trade-offs. Claim-based policies mean role demotions take effect on the next token refresh (≤24 h), not instantly. For routes where revocation latency matters, we'll add a RequireFresh(userRepo, policy) helper that does the DB check — kept out of the initial design so we only pay the cost where it's needed.

URL-scoped partner context

Decision. Partner routes are scoped by URL — /api/v1/partner/:partner_id/... on the backend, /(partner)/[partner_id]/... on the frontend. There is no "current partner" state in the JWT, the session, or the frontend context.

Context. A user can belong to multiple partners (a personal account that helps run two coffee shops). The system needs to know which partner an action is for, and the answer needs to be auditable in logs.

Rationale. URL-scoping makes every partner-bound action self-describing — the route template names the tenant. Backend middleware reads :partner_id and verifies membership; frontend reads useLocalSearchParams().partner_id from the route. Switching partners is a router navigation, not a state mutation. Audit logs ("who did what to which partner") are trivial because the partner ID is in the request line.

Trade-offs. Routes are slightly longer. A user who belongs to many partners needs the picker ((partner)/index.tsx) to navigate between them — there's no implicit "stay in the last partner I used" memory. Worth it for the symmetry between client and server, and the absence of stale-state bugs.

Discovery is the read-aggregation domain

Decision. All query and read-aggregation across domains is served by one discovery slice under /discovery: viewport map (/discovery/map), nearby-people radar (/discovery/radar), location detail (/discovery/locations/:id), unified natural-language search (/discovery/search) plus kind-scoped search (/discovery/search/{events,locations,users}), the saved-area pin (/discovery/pin), the taxonomy vocabulary (/discovery/taxonomy, /discovery/taxonomy/resolve), and the full aggregated profile card (/discovery/profiles/:id, optional auth). It queries the DB directly and reuses taxonomy.ParseQuery, the platform/llm semantic resolver, and location geocoding.

Context. Read paths that fan across events, locations, users, and moments do not belong to any single write-owning domain. Scattering them (a find surface, a discovery-map surface, a taxonomy surface, a per-domain profile endpoint) duplicated the geocode-then-rank pipeline and gave the client several vocabularies for one job.

Rationale. A dedicated read domain lets aggregation queries preload and rank across aggregates without threading a write-owner's service. One /discovery prefix is the client's single entry for "what is near here / what matches this query / who is this person", so the intent parser and geocoder are wired once. Write-owning domains stay lean: user serves only the lean card (GET /users/:userId → GetUserCard: {id, name, username, avatar_url, bio, is_friend}), while the full aggregated profile card is a discovery read.

Trade-offs. Discovery reads tables it does not own, so it must track schema changes in several domains. We accept that coupling on the read side because the alternative (an aggregation call graph across five services per map paint) is slower and harder to reason about.

Snapshot event items at plan promotion

Decision. When a plan is promoted to an event, its curated board items are copied into event-owned models.EventItem rows (backend/internal/models/event_item.go), with photos and links frozen as JSONB snapshots. GET /events/:id/items renders that snapshot self-contained; it never reaches back into the plan or item domain.

Context. A plan's board is a live, collaborative, editable canvas. An event's items are a settled record of what the group agreed to. Rendering the event board by joining live plan items would let a later edit or deletion of the source plan silently rewrite the history of an event that already happened.

Rationale. Copy-at-promotion makes the event board immutable and self-contained: EventItemFromItem snapshots title, body, location, tags, and the photo/link JSON so the card renders without a cross-domain read. Editing the event copy never touches the source plan, and deleting the source plan leaves the event board intact.
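
The copy-at-promotion contract can be illustrated with a small sketch. The Item and EventItem shapes and the snapshotItem helper are hypothetical stand-ins, not the real EventItemFromItem; the point is only that the snapshot shares no memory with its source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Item is a hypothetical live plan-board item.
type Item struct {
	Title  string
	Tags   []string
	Photos []string
}

// EventItem is the frozen copy. Photos are stored as a JSON snapshot,
// mirroring a JSONB column.
type EventItem struct {
	Title  string
	Tags   []string
	Photos json.RawMessage
}

// snapshotItem copies everything the card needs so that later edits to
// the source item cannot reach the event copy.
func snapshotItem(it Item) (EventItem, error) {
	photos, err := json.Marshal(it.Photos)
	if err != nil {
		return EventItem{}, fmt.Errorf("snapshot photos: %w", err)
	}
	return EventItem{
		Title:  it.Title,
		Tags:   append([]string(nil), it.Tags...), // fresh backing array
		Photos: photos,
	}, nil
}

func main() {
	src := Item{Title: "Ramen night", Tags: []string{"food"}, Photos: []string{"a.jpg"}}
	frozen, err := snapshotItem(src)
	if err != nil {
		panic(err)
	}
	// Later edits to the plan item do not affect the event copy.
	src.Tags[0] = "edited"
	src.Photos[0] = "b.jpg"
	fmt.Println(frozen.Tags, string(frozen.Photos))
}
```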

Trade-offs. The board is a point-in-time copy: edits made to the plan after promotion do not flow into the event. That is the intended contract (the event is settled), but it means the two can diverge, which is acceptable because the plan and the event are different lifecycle stages.

Self-hosted small LLM behind a generic interface

Decision. The semantic resolution layer (entity-query extraction + place selection for link capture) calls a small quantized LLM over the OpenAI-compatible chat API, behind a generic SemanticResolver interface with an LLMClient transport. Default model is Qwen2.5-3B-Instruct served by Ollama. It is off by default and degrades to the deterministic Serper heuristic when disabled or unreachable.

Context. Serper search adds little value when fed a social caption verbatim: the query is noise and the top result is confidently wrong. The fix needs a model that can distill an entity and judge candidates, but we did not want to add another paid provider or leak captures to a third party.

Rationale. A self-hosted quantized model is compute, not a new vendor, so it satisfies the no-new-provider constraint. Speaking the OpenAI chat shape means self-hosted (Ollama, llama.cpp, vLLM) and most hosted providers are a config swap (base URL, model, key), and a non-compatible provider is a second LLMClient impl behind the same interface, the same pattern as the Photon and Google location adapters. The layer runs in the async enrich worker, so CPU inference is acceptable and a slow or missing model server just falls back.

Trade-offs. Operational ownership of a model server (pull, pin, monitor). A 3B quantized model needs defensive JSON parsing and a confidence gate because it can be unreliable or overconfident; the deterministic G1/G2 guards stay as the floor so the model can only improve on, never replace, the safe path.

Cost-tiered location enrichment

Decision. Every location row gets a cheap backbone (lexical search_text, spatial index, category, Photon localization gap-fill) unconditionally, but the paid tier (Serper /maps, then a budget-capped Google Places call) fires only for rich-detail categories that are not part of the bulk backbone and are not already enriched. isBulkImport excludes the density backbone; the daily enrich:upgrade:google budget caps the billed fallback. See Locations.

Context. A planet-scale POI catalog is tens of millions of rows. Paid-enriching all of them, or embedding all of them, costs real money for rows almost no query reaches, while the interactive experience needs rich detail (photos, hours, ratings) on the venues users actually open.

Rationale. Splitting the spend from the backbone means the marginal row costs near zero (self-hosted Photon + Postgres) and the paid budget concentrates on demand-activated, high-value rows. The backbone stays fully findable lexically, spatially, and by category; it just does not carry commercial detail until a category or an engagement earns it.

Trade-offs. A cold backbone row shows sparse detail until someone touches it, and the promotion path (below) adds a write on first meaningful engagement. Both are acceptable: the alternative is an unbounded enrichment bill for coverage users never see.

Demand-driven embeddings via `shouldEmbed`

Decision. A row carries a semantic vector only when shouldEmbed passes: prominence at or above a floor, curated/promoted provenance, rich text, or any engagement. The cold long tail stays lexical + spatial + category-searchable with no vector. A backbone row a user meaningfully engages with is promoted (bulk_import → promoted), which clears the enrichment guard and earns it an embedding on demand.
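
The gate reads naturally as a single predicate. The sketch below follows the four conditions listed above, but the Row fields, the prominenceFloor and richTextMin thresholds, and the provenance strings other than bulk_import and promoted are assumptions made for illustration.

```go
package main

import "fmt"

// Row holds the signals the gate looks at (hypothetical field names).
type Row struct {
	Prominence  float64
	Provenance  string // e.g. "bulk_import", "curated", "promoted"
	TextLen     int    // length of the descriptive text
	Engagements int    // check-ins, saves, opens, ...
}

// Thresholds are placeholders, not the project's tuned values.
const (
	prominenceFloor = 0.5
	richTextMin     = 200
)

// shouldEmbed reports whether a row earns a semantic vector: any one of
// prominence, curated/promoted provenance, rich text, or engagement.
func shouldEmbed(r Row) bool {
	switch {
	case r.Prominence >= prominenceFloor:
		return true
	case r.Provenance == "curated" || r.Provenance == "promoted":
		return true
	case r.TextLen >= richTextMin:
		return true
	case r.Engagements > 0:
		return true
	}
	return false
}

func main() {
	cold := Row{Prominence: 0.1, Provenance: "bulk_import"}
	fmt.Println(shouldEmbed(cold)) // false

	// A meaningful engagement promotes the row, which then passes the gate.
	cold.Provenance = "promoted"
	fmt.Println(shouldEmbed(cold)) // true
}
```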

Context. Embedding an entire multi-million-row backbone would need as many vectors and an ANN index over all of them, at real cost, for rows almost no conceptual query reaches.

Rationale. Gating the vector to high-value + engaged rows keeps the embedded set small enough that an in-memory HNSW serves it well, and keeps the embedding-model calls bounded. Lexical recall covers the long tail; only a pure meaning query needs a vector, and those land on the rows that have one.

Trade-offs. A conceptual query cannot reach an un-embedded cold row until it earns a vector. Accepted: those rows are, by definition, the ones no one has engaged and that carry no prominence signal.

Two-stage binary-quantized vector retrieval

Decision. The ANN index is an HNSW over binary_quantize(embedding)::bit(1024) with Hamming distance, not over the full halfvec. Retrieval pulls a wide candidate pool by Hamming distance on the bit index, then re-ranks that pool by full cosine distance on the retained halfvec column. See Locations.

Context. A 1024-dim halfvec HNSW is sixteen bits per dimension; at catalog scale the index stops fitting comfortably in memory, which is where HNSW earns its speed.

Rationale. One bit per dimension makes the index ~16x smaller, so it stays in RAM as the embedded set grows. The bit index does the fan-out cheaply; the precise cosine re-rank sees only a couple hundred rows, recovering full-precision ordering. It is pgvector-native, no new extension.
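
The two stages can be shown in memory with a brute-force sketch. It stands in for the index and is not how pgvector executes the query: quantize, hamming, cosineDistance, and search are helpers written for this example, and quantization follows the positive-component-becomes-1 rule.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
	"sort"
)

// quantize packs one bit per dimension: 1 if the component is > 0.
func quantize(v []float32) []uint64 {
	out := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x > 0 {
			out[i/64] |= uint64(1) << (uint(i) % 64)
		}
	}
	return out
}

// hamming counts differing bits between two equal-length codes.
func hamming(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

// cosineDistance is 1 - cosine similarity on the full-precision vectors.
func cosineDistance(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 1
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

// search returns up to topK row indices: a Hamming pass keeps the poolK
// nearest codes, then a cosine pass re-ranks only that pool.
func search(query []float32, rows [][]float32, poolK, topK int) []int {
	q := quantize(query)
	codes := make([][]uint64, len(rows))
	ids := make([]int, len(rows))
	for i, r := range rows {
		codes[i] = quantize(r)
		ids[i] = i
	}
	// Stage 1: cheap, coarse candidate pool by Hamming distance.
	sort.SliceStable(ids, func(i, j int) bool {
		return hamming(q, codes[ids[i]]) < hamming(q, codes[ids[j]])
	})
	if poolK < len(ids) {
		ids = ids[:poolK]
	}
	// Stage 2: precise re-rank of the pool by cosine distance.
	sort.SliceStable(ids, func(i, j int) bool {
		return cosineDistance(query, rows[ids[i]]) < cosineDistance(query, rows[ids[j]])
	})
	if topK < len(ids) {
		ids = ids[:topK]
	}
	return ids
}

func main() {
	rows := [][]float32{
		{1, 1, -1, -1},
		{0.1, 5, -0.1, -0.1},
		{-1, -1, 1, 1},
	}
	fmt.Println(search([]float32{1, 0.9, -1, -1}, rows, 2, 1))
	// Per-vector footprint at 1024 dimensions: 16-bit floats vs 1 bit each.
	fmt.Println(1024*2, "bytes vs", 1024/8, "bytes")
}
```

Rows 0 and 1 have identical bit codes, so only the cosine pass can order them; that is the precision the re-rank recovers. The size line is the 16x claim in bytes: 2048 versus 128 per vector.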

Trade-offs. Recall depends on the re-rank pool size (rerank_k), which is not yet measured against exact cosine on real data (a near-term item). If the gated embedded set ever outgrows in-memory HNSW, the next lever is a disk-backed ANN (pgvectorscale / DiskANN), deliberately not adopted now.

Ancestry-based provider→taxonomy category mapping

Decision. Overture's ~2100 leaf categories map to the Tomoda taxonomy through 22 top-level group-default rules plus ~155 selective leaf overrides, with a committed leaf→group ancestry table (backend/data/overture_category_ancestry.json) filling the gap: a leaf with no targeted rule climbs to its group's default. "Other Location" is the fail-safe for a category outside the ancestry set.

Context. Overture ships thousands of leaf categories and bumps them monthly. Hand-mapping every leaf is unmaintainable; ignoring the long tail drops most rows to an uncategorized bucket.

Rationale. Mapping the 22 groups plus the high-value leaves that deserve a more specific Tomoda leaf, and resolving everything else through ancestry, covers the whole release with a small, reviewable rule set. Because every leaf's group has a default, the ancestry path resolves the entire current release; the "Other Location" fallback only catches a category entirely absent from the ancestry file, and the seeder fails loud if the ancestry data cannot load rather than silently degrading every row.
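
The resolution order (leaf override, then group default via ancestry, then the fail-safe) can be sketched as a three-step lookup. The Mapper type and the sample category names are hypothetical; only the "Other Location" fallback comes from the text above.

```go
package main

import "fmt"

// fallback is the fail-safe for a category absent from the ancestry set.
const fallback = "Other Location"

// Mapper holds the three tables the resolution walks.
type Mapper struct {
	LeafOverrides map[string]string // provider leaf  -> taxonomy category
	GroupDefaults map[string]string // provider group -> taxonomy category
	Ancestry      map[string]string // provider leaf  -> provider group
}

// Resolve prefers a targeted leaf rule, then climbs to the group default
// through the ancestry table, and only then falls back.
func (m Mapper) Resolve(leaf string) string {
	if c, ok := m.LeafOverrides[leaf]; ok {
		return c
	}
	if group, ok := m.Ancestry[leaf]; ok {
		if c, ok := m.GroupDefaults[group]; ok {
			return c
		}
	}
	return fallback
}

func main() {
	m := Mapper{
		LeafOverrides: map[string]string{"ramen_restaurant": "Ramen"},
		GroupDefaults: map[string]string{"eat_and_drink": "Restaurant"},
		Ancestry: map[string]string{
			"ramen_restaurant": "eat_and_drink",
			"soup_restaurant":  "eat_and_drink",
		},
	}
	for _, leaf := range []string{"ramen_restaurant", "soup_restaurant", "brand_new_leaf"} {
		fmt.Printf("%s -> %s\n", leaf, m.Resolve(leaf))
	}
}
```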

Trade-offs. The ancestry file must be regenerated when Overture bumps its taxonomy, and the release string is a constant today (auto-latest is a roadmap item).

`go:embed` of taxonomy + ancestry data

Decision. The place/event taxonomy JSON and the Overture ancestry map are embedded into the binary with go:embed (backend/data), read through a helper that honors a TAXONOMY_DATA_DIR override but defaults to the embedded copy. A read/parse failure is treated as a real bug and surfaced (fail-loud), not swallowed.

Context. The taxonomy is loaded at boot and from the seeder, which run from different working directories. A relative-path read is fragile across those entrypoints and across container layouts.

Rationale. Embedding makes the data CWD-independent and guarantees it ships with the binary, so boot and the seeder resolve the same vocabulary regardless of where they run. The env override keeps local iteration possible without a rebuild.

Trade-offs. Regenerating the data requires a rebuild to pick up the new embedded copy (or the env override for local testing). Acceptable for data that changes on a monthly-or-slower cadence.

Soft-close instead of hard-delete for closed places

Decision. When a report-outdated refresh concludes a place is permanently closed, the row is tombstoned (is_active=false, business_status=CLOSED_PERMANENTLY, closed_at stamped) rather than deleted. It drops out of forward-facing search but stays fetchable by id. A single provider "not found" never closes a row on its own: not_found_count must cross a threshold, and a later operational refresh reverses the close.

Context. Locations are FK targets for check-ins, moments, events, saves, and passport visits. Deleting a closed place would dangle those references and erase history.

Rationale. A tombstone preserves every FK linkage and keeps deep links resolving with a "permanently closed" label, while excluding the place from search. Corroboration (the not-found counter) plus reversibility guards against false closures from renames, transient provider gaps, and obscure places.
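
The corroboration and reversal rules can be modelled as two small state transitions. This is a sketch under assumptions: the threshold value, the "OPERATIONAL" status string, and the choice to reset the counter on reversal are not confirmed by the text above.

```go
package main

import "fmt"

// notFoundThreshold is a placeholder; the real threshold is not stated here.
const notFoundThreshold = 3

// Place holds the tombstone-related columns.
type Place struct {
	IsActive       bool
	BusinessStatus string
	ClosedAt       int64 // unix seconds, 0 when not closed
	NotFoundCount  int
}

// recordNotFound counts one provider "not found" and tombstones the row
// only once the count crosses the threshold.
func recordNotFound(p *Place, nowUnix int64) {
	p.NotFoundCount++
	if p.IsActive && p.NotFoundCount >= notFoundThreshold {
		p.IsActive = false
		p.BusinessStatus = "CLOSED_PERMANENTLY"
		p.ClosedAt = nowUnix
	}
}

// recordOperational reverses a close when a later refresh finds the place
// operating. Resetting the counter is an assumption of this sketch.
func recordOperational(p *Place) {
	p.IsActive = true
	p.BusinessStatus = "OPERATIONAL"
	p.ClosedAt = 0
	p.NotFoundCount = 0
}

func main() {
	p := &Place{IsActive: true, BusinessStatus: "OPERATIONAL"}
	recordNotFound(p, 100)
	fmt.Println(p.IsActive) // true: a single miss never closes a row
	recordNotFound(p, 200)
	recordNotFound(p, 300)
	fmt.Println(p.IsActive, p.BusinessStatus) // false CLOSED_PERMANENTLY
	recordOperational(p)
	fmt.Println(p.IsActive, p.ClosedAt) // true 0
}
```

The row is never deleted in either transition, so every foreign key pointing at it keeps resolving.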

Trade-offs. Closed rows accumulate in the table rather than being reclaimed. Accepted: they are excluded from the hot search paths by is_active, and their history value outweighs the storage.

Set-based load-time city linking

Decision. Each non-city POI's city_id self-FK is assigned in one set-based pass after the whole dataset is loaded, walking the table by primary-key cursor in chunks and running a LATERAL nearest-city KNN join per chunk, rather than resolving the nearest city per row during ingest.

Context. A planet-scale load is tens of millions of POIs. A per-row nearest-city lookup in the ingest loop is tens of millions of round-trips; it also couples city-linking to ingest order (a POI can load before its city).

Rationale. Deferring to one post-load sweep turns the work into a bounded series of index-served UPDATE statements (the KNN order-by rides the GiST coordinate index), runs after every city exists, and is idempotent (only NULL city_id rows are touched), so a re-load leaves linked rows alone. The same city_id then drives per-city highlight promotion and localized-city read assembly.

Trade-offs. City linkage is not available mid-load, only after the sweep. Acceptable: nothing reads city_id until the catalog is fully loaded and promotion runs.

Durable ranking labels + config-driven weights

Decision. Search ranking training labels are persisted to a durable Postgres sink (a sampled ranking_impression carrying each shown candidate's feature vector + every ranking_selection, joined by ranking_query_id), separate from the retention-bound observability logs, and the six rank weights are env-config (RANK_WEIGHT_*) defaulting to the shipped constants rather than compile-time constants.

Context. The offline weight tuner (cmd/rankertune) needs a stable history of what was shown and what users chose; Loki log retention is measured in days. Applying a retuned weight vector as compile-time constants requires a recompile + redeploy.

Rationale. A durable sink gives the retune a growing, joinable label set that survives log retention; writes are best-effort off the hot path (a detached goroutine, errors swallowed) so a search or selection never blocks or fails. Config-driven weights let a tuned vector deploy without a code change.

DB-vector hot-refresh (now landed). The DB-stored weight vector, previously a planned step, has shipped: a ranking_weights table holds the applied vectors with exactly one active row (enforced by a partial unique index on active). Effective weights resolve in the order active DB row > env config > shipped default. The API ranker re-reads the active row on an interval (RANK_WEIGHTS_REFRESH_INTERVAL, default 2m; 0 disables) and atomically swaps it into the live blend, so an approved retune applies without a redeploy. An admin endpoint (POST /admin/locations/ranking-weights, gated by the TomodaAdmin policy, with each weight validated to [0,1] and the vector not all-zero) applies a cmd/rankertune -harvest recommendation. An empty table falls back to the config weights byte-for-byte, and a DB read failure keeps the current vector, so the read never blocks startup or drops the ranker to defaults.
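
The precedence and the atomic swap can be sketched with sync/atomic. The Ranker type, its method names, and the sample weight values are assumptions for illustration; the load callback stands in for the query that reads the active ranking_weights row.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Weights is the six-element rank weight vector.
type Weights [6]float64

// errDBDown simulates a failed read of the active row.
var errDBDown = errors.New("db unavailable")

// valid mirrors the admin-side input rule: every weight in [0,1] and the
// vector not all-zero.
func valid(w Weights) bool {
	nonZero := false
	for _, x := range w {
		if x < 0 || x > 1 {
			return false
		}
		if x != 0 {
			nonZero = true
		}
	}
	return nonZero
}

// resolve applies the precedence: active DB row > env config > shipped.
func resolve(db, env *Weights, shipped Weights) Weights {
	switch {
	case db != nil:
		return *db
	case env != nil:
		return *env
	default:
		return shipped
	}
}

// Ranker keeps the live vector behind an atomic pointer so readers never
// observe a half-updated blend.
type Ranker struct {
	active  atomic.Pointer[Weights]
	env     *Weights
	shipped Weights
}

// NewRanker starts from config weights, so startup never waits on the DB.
func NewRanker(env *Weights, shipped Weights) *Ranker {
	r := &Ranker{env: env, shipped: shipped}
	w := resolve(nil, env, shipped)
	r.active.Store(&w)
	return r
}

// Current returns the vector in effect right now.
func (r *Ranker) Current() Weights { return *r.active.Load() }

// Refresh re-reads the active row. A nil row (empty table) falls back to
// config; a read error keeps whatever vector is already live.
func (r *Ranker) Refresh(load func() (*Weights, error)) {
	db, err := load()
	if err != nil {
		return
	}
	w := resolve(db, r.env, r.shipped)
	r.active.Store(&w)
}

func main() {
	shipped := Weights{0.3, 0.2, 0.2, 0.1, 0.1, 0.1}
	tuned := Weights{0.4, 0.2, 0.1, 0.1, 0.1, 0.1}
	r := NewRanker(nil, shipped)
	r.Refresh(func() (*Weights, error) { return &tuned, nil })
	r.Refresh(func() (*Weights, error) { return nil, errDBDown })
	fmt.Println(r.Current() == tuned, valid(tuned)) // true true
}
```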

Trade-offs. Two label tables to prune (a 24h cron bounds them at 180 days), plus the weights table (append-only history, one active row). The refresh is eventually-consistent within one interval; that window is well inside the human-gated retune loop.