Photon Indexer
Infrastructure for the multilingual Photon geocoder index: a GCS bucket that holds versioned index tarballs, plus a service account the indexer uses to upload them. The actual indexer job (a Kubernetes CronJob) is currently suspended and indexes are built manually on a developer's machine.
The Photon GCS bucket, service account, and Workload Identity binding live outside Terraform. They're created manually via the bootstrap doc so that terraform destroy can never delete the planet index (each rebuild costs ~$500 of compute). The K8s side is still GitOps-managed: workload manifests live at k8s/apps/photon-indexer/, and the build script is scripts/photon-index-local.sh.
Why this exists
rtuszik/photon-docker (the image we run for Photon itself) downloads its index over plain HTTP with no authentication. We can't serve indexes from a private GCS bucket, and we don't want to. So we host them at a public URL on GCS and rely on the fact that they are 100% derived from OSM data (no privacy concerns).
The bucket holds two kinds of objects:
- Versioned tarballs: photon-db-planet-multilang-2026-06.tar.bz2, etc.
- latest-* aliases: copies of the most recent good build, used by Photon pods that pin to "latest stable".
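The two object kinds can be told apart by name alone. The sketch below is illustrative only: the regular expression is an assumption generalised from the single example filename above (a trailing YYYY-MM before .tar.bz2), and classify_object is a hypothetical helper, not something that exists in the repo.

```python
import re

# Assumed naming pattern, generalised from the one example in this doc:
# photon-db-<variant>-YYYY-MM.tar.bz2
VERSIONED_RE = re.compile(r"^photon-db-.+-(\d{4})-(\d{2})\.tar\.bz2$")


def classify_object(name: str) -> str:
    """Return 'latest-alias', 'versioned', or 'other' for a bucket object name."""
    if name.startswith("latest-"):
        # Alias objects are copies of the most recent good build.
        return "latest-alias"
    if VERSIONED_RE.match(name):
        # Versioned tarballs carry their build month in the filename.
        return "versioned"
    return "other"
```

A check like this could guard a manual cleanup so that latest-* aliases are never removed by accident.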
Bucket
| Field | Value |
|---|---|
| Name | ${project_id}-photon-index (currently development-485000-photon-index) |
| Location | asia-east1 |
| Storage class | STANDARD |
| Uniform bucket-level access | On |
| Versioning | Off — we keep filenames versioned manually |
| Retention policy | Effectively infinite (100 years) — set on the bucket itself, so even accidental gcloud storage rm calls are blocked |
| Terraform-managed | No, deliberately: this keeps the bucket out of terraform destroy's reach |
Lifecycle
| Age | Action |
|---|---|
| 35 days | Transition to Nearline (cost optimisation) |
No delete lifecycle. Planet indexes cost ~$500 of compute each to rebuild, so old versions stay forever. If you need to manually clear cruft someday, it's a deliberate gcloud storage rm against specific objects, not a lifecycle-driven sweep. While the bucket's retention policy is in force, that rm will be rejected as well.
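For reference, the single rule in the table can be written in the JSON shape that Cloud Storage uses for lifecycle configurations. This is a sketch reconstructed from the table, not a copy of the live bucket configuration; has_delete_rule is a hypothetical helper for asserting that nobody has slipped in a Delete action.

```python
import json

# Lifecycle configuration reconstructed from the table above:
# objects older than 35 days move to Nearline, and nothing is ever deleted.
LIFECYCLE = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 35},
        }
    ]
}


def has_delete_rule(config: dict) -> bool:
    """Return True if any lifecycle rule in the config uses the Delete action."""
    return any(
        rule.get("action", {}).get("type") == "Delete"
        for rule in config.get("rule", [])
    )


if __name__ == "__main__":
    # Print the config as JSON, e.g. to diff against the bucket's actual settings.
    print(json.dumps(LIFECYCLE, indent=2))
```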
Public read
allUsers is granted roles/storage.objectViewer on the bucket. This is intentional and required because rtuszik/photon-docker fetches over unauthenticated HTTP.
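A quick way to confirm the binding is still present is to inspect the bucket's IAM policy as JSON. The sketch below assumes the usual IAM policy shape (a bindings list whose entries have a role and a members list); is_publicly_readable is a hypothetical helper, and how you obtain the policy JSON is up to you.

```python
def is_publicly_readable(policy: dict) -> bool:
    """Return True if allUsers holds roles/storage.objectViewer in the policy."""
    for binding in policy.get("bindings", []):
        if (
            binding.get("role") == "roles/storage.objectViewer"
            and "allUsers" in binding.get("members", [])
        ):
            return True
    return False
```

If this returns False for the Photon bucket, Photon pods will fail to download their index, because the image cannot authenticate.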
The org has the iam.allowedPolicyMemberDomains constraint set, which normally blocks allUsers bindings. The bootstrap doc provisions an org-policy override at the project level (iam.allowedPolicyMemberDomains set to allowAll: true, scoped to this project only):
```yaml
spec:
  inheritFromParent: false
  rules:
    - allowAll: true
```
Org policy override is project-scoped
The override applies to the entire development-485000 project, not just this bucket. Any other bucket in this project could also be made publicly readable now. We accept that trade-off because the project is single-tenant. If we ever split prod into its own project, do not copy this override blindly — re-evaluate first.
The org policy API is enabled by Terraform (google_project_service.orgpolicy_api). Setting disable_on_destroy = false on that resource is deliberate: disabling the API would not roll back the override, it would only make the override unmanageable.
Service account
```hcl
resource "google_service_account" "photon_indexer" {
  account_id   = "photon-indexer"
  display_name = "Photon Indexer"
}
```
- Bound to the bucket with roles/storage.objectAdmin so it can both upload new tarballs and overwrite the latest-* aliases.
- Bound via Workload Identity to the K8s SA data/photon-indexer, ready for the CronJob. Even though the CronJob is currently suspended, the binding is in place so you don't have to touch Terraform when un-suspending it.
serviceAccount:${project_id}.svc.id.goog[data/photon-indexer]
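The member string above follows the standard Workload Identity format: project ID, the fixed svc.id.goog suffix, then the Kubernetes namespace and service account name in brackets. A minimal sketch of how it is assembled, with workload_identity_member as a hypothetical helper name:

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member string for a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
```

For this project, the namespace is data and the Kubernetes service account is photon-indexer, so the call would be workload_identity_member("development-485000", "data", "photon-indexer").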
CronJob: currently SUSPENDED
Don't expect the cluster to be building indexes
The CronJob in k8s/apps/photon-indexer/ is suspended awaiting Nominatim setup. As of this writing, no automated index builds are happening. Until we stand up Nominatim and un-suspend the CronJob, indexes are built and uploaded by hand from a developer's machine using scripts/photon-index-local.sh. That script authenticates as the photon-indexer SA via Application Default Credentials (ADC).
The CronJob was suspended (not deleted) because:
- The SA, IAM bindings, and bucket are all in place — un-suspending should be a one-line change.
- Re-creating the CronJob from scratch later is riskier than leaving it dormant.
See the Photon multilang rollout runbook for how local builds work today, and the Photon K8s deployment page for how Photon pods consume the bucket.
Resource names (deterministic, no lookup needed)
Local scripts and K8s manifests hardcode these directly since they're stable contracts (never renamed):
| Resource | Value |
|---|---|
| Bucket | development-485000-photon-index |
| Public base URL | https://storage.googleapis.com/development-485000-photon-index |
| Indexer service account | photon-indexer@development-485000.iam.gserviceaccount.com |
The public base URL is what Photon pods are configured to fetch from.
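Because every name in the table is derived from the project ID, a script that prefers not to hardcode them could compute them instead. The sketch below is one way to do that; PhotonIndexNames and its members are hypothetical and do not exist in the repo, and the derivation rules are simply read off the table above.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class PhotonIndexNames:
    """Derive the Photon index resource names from a GCP project ID."""

    project_id: str

    @property
    def bucket(self) -> str:
        return f"{self.project_id}-photon-index"

    @property
    def public_base_url(self) -> str:
        return f"https://storage.googleapis.com/{self.bucket}"

    @property
    def indexer_service_account(self) -> str:
        return f"photon-indexer@{self.project_id}.iam.gserviceaccount.com"

    def object_url(self, object_name: str) -> str:
        """Public URL for one object; the name is percent-encoded, '/' is kept."""
        return f"{self.public_base_url}/{quote(object_name)}"


if __name__ == "__main__":
    names = PhotonIndexNames("development-485000")
    print(names.object_url("photon-db-planet-multilang-2026-06.tar.bz2"))
```

Hardcoding remains the simpler choice while there is only one project; a helper like this only pays off if a second project ever needs the same layout.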