Skip to content

feat(db): nightly pg_cron refresh of pygeoapi materialized views#727

Merged
jirhiker merged 9 commits into
stagingfrom
claude/trusting-khayyam-8b5f6e
Jun 17, 2026
Merged

feat(db): nightly pg_cron refresh of pygeoapi materialized views#727
jirhiker merged 9 commits into
stagingfrom
claude/trusting-khayyam-8b5f6e

Conversation

@jirhiker

Copy link
Copy Markdown
Member

What

Adds a nightly pg_cron job that refreshes the pygeoapi materialized views in production. The schedule is registered through an alembic migration so it's traceable in version control.

Why

The pygeoapi materialized views (ogc_*) were only refreshed manually / on deploy. This automates a nightly refresh so the OGC API serves current data without operator intervention.

How

  • Migration x2y3z4a5b6c7 creates the pg_cron extension, a public.refresh_pygeoapi_materialized_views() helper, and a cron job refresh-pygeoapi-materialized-views on 0 9 * * * (server timezone ≈ 02:00–03:00 MT). Idempotent — unschedules any same-named job before re-registering.
  • Production-only, no-op elsewhere. The migration does nothing unless ENABLE_PG_CRON is truthy. The dev/test/CI Postgres image doesn't preload pg_cron, so alembic upgrade head keeps working there untouched.
  • Single source of truth. New services/materialized_views.py holds the view list, imported by both the migration and the oco refresh-pygeoapi-materialized-views CLI command.
  • Production DB image. docker/db/Dockerfile installs postgresql-17-cron and starts Postgres with shared_preload_libraries=pg_cron and cron.database_name pointed at the app DB. Dev docker-compose.yml is untouched.
  • CD. CD_staging.yml and CD_production.yml set ENABLE_PG_CRON=1 on the migration step.
  • Docs. docs/pg_cron-nightly-refresh.md covers self-hosted Docker + Cloud SQL setup, verification queries, and the non-concurrent REFRESH rationale.

Reviewer notes / deploy prerequisite

⚠️ Cloud SQL instance config must be set before the next staging/prod deploy, or the migration step will fail at CREATE EXTENSION pg_cron:

  • Flag cloudsql.enable_pg_cron=on
  • Flag cron.database_name=<CLOUD_SQL_DATABASE> (the app DB that holds the matviews — not postgres)
  • The migration's CLOUD_SQL_USER needs privilege to create the extension

These are instance-level settings, not in the repo.

The cron helper uses plain (non-concurrent) REFRESH because REFRESH ... CONCURRENTLY can't run inside a PL/pgSQL transaction; the brief nightly lock is acceptable. The CLI still offers --concurrently for daytime manual refreshes.

🤖 Generated with Claude Code

jirhiker and others added 2 commits June 17, 2026 10:21
Register a pg_cron job that refreshes the pygeoapi materialized views
once a night in production, with the schedule traceable in version
control via an alembic migration.

- alembic migration x2y3z4a5b6c7 creates the pg_cron extension, a
  public.refresh_pygeoapi_materialized_views() helper, and a nightly
  cron job (0 9 * * *, server timezone). Idempotent: it unschedules any
  same-named job before re-registering.
- pg_cron is production-only. The migration is a no-op unless
  ENABLE_PG_CRON is truthy, so alembic upgrade head still works on the
  dev/test/CI Postgres image (which does not preload pg_cron).
- services/materialized_views.py is the single source of truth for the
  view list, shared by the CLI refresh command and the migration.
- docker/db/Dockerfile (production image) installs pg_cron and preloads
  it with cron.database_name pointed at the app database.
- CD_staging.yml and CD_production.yml set ENABLE_PG_CRON=1 on the
  migration step.
- docs/pg_cron-nightly-refresh.md documents setup (self-hosted Docker
  and Cloud SQL), verification, and the non-concurrent REFRESH rationale.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ad3f25d7c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread docker/db/Dockerfile Outdated
Comment thread alembic/versions/x2y3z4a5b6c7_schedule_nightly_matview_refresh_pg_cron.py Outdated
jirhiker and others added 7 commits June 17, 2026 10:55
Staging refreshes the materialized views on each deploy (the existing
"Refresh materialized views" CD step), so it does not need the nightly
pg_cron job. Drop ENABLE_PG_CRON from CD_staging.yml; only production
registers the cron job.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Migration helper now discovers ogc_* materialized views from the
  catalog at run time instead of importing the mutable view tuple. Keeps
  the versioned migration immutable and self-contained, and auto-includes
  views added by later migrations. Resolves the cross-environment drift
  concern from review.
- Production DB image derives cron.database_name from POSTGRES_DB via a
  start-postgres.sh entrypoint wrapper, so pg_cron tracks the same
  database the migration connects to even when POSTGRES_DB is overridden.
- services/materialized_views.py is now the CLI's curated default only;
  docs updated to match.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Refresh every materialized view nightly, including transducer_daily_data.

- Migration helper drops the ogc_* filter and refreshes all public-schema
  materialized views discovered from the catalog at run time.
- Rename PYGEOAPI_MATERIALIZED_VIEWS -> MATERIALIZED_VIEWS and add
  transducer_daily_data to the CLI's default list.
- Update CLI refresh test expectations (8 views) and docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The refresh covers all materialized views, not just the ogc_* pygeoapi
views, so rename the pygeoapi-specific identifiers to generic ones.

- CLI command refresh-pygeoapi-materialized-views -> refresh-materialized-views
  (function refresh_pygeoapi_materialized_views -> refresh_materialized_views).
- SQL helper public.refresh_pygeoapi_materialized_views() ->
  public.refresh_materialized_views(); rename it in d5e6f7a8b9c0 too, and
  have the pg_cron migration drop the legacy function on databases that
  already created it.
- Cron job name refresh-pygeoapi-materialized-views -> refresh-materialized-views.
- Update CD workflows (staging/testing/production), tests, and docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The docker/db image is used only for development, so it should not carry
the production pg_cron setup. Revert it to the stock postgis image, remove
the start-postgres.sh wrapper, and document pg_cron as Cloud SQL-only in
production.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The cron job was never deployed, so the migration does not need to drop a
previously created refresh_pygeoapi_materialized_views helper. Remove the
LEGACY_FUNCTION_NAME constant, the DROP in upgrade, and the related note.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jirhiker jirhiker merged commit 59d7eae into staging Jun 17, 2026
9 checks passed
@jirhiker jirhiker deleted the claude/trusting-khayyam-8b5f6e branch June 17, 2026 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant