perf(ci): select impacted tests via testmon on PR builds by phernandez · Pull Request #945 · basicmachines-co/basic-memory

phernandez · 2026-06-10T15:20:41Z

Summary

The testmon cache has existed in CI since #928 (branch-scoped keys with fallback to main's baseline, wired into all five job families), but BASIC_MEMORY_TESTMON_FLAGS was pinned to --testmon-noselect, so every PR build recorded testmon data and still ran the full suite. The cache never saved any wall-clock time.

This flips PR builds to --testmon --testmon-forceselect (impacted tests only, selected against the restored baseline), matching basic-memory-cloud's CI policy. Pushes to main keep --testmon-noselect, running the full suite and refreshing the baseline PR builds select from.

Expected effect: a repeat build of a PR re-runs only tests impacted by the new commits; even a PR's first build selects against main's baseline. Combined with #938 this should take typical PR rounds from ~15 min to a few minutes.

Trade-off (deliberate, same as cloud): required PR checks no longer execute the full matrix every push — main pushes still do, so regressions testmon misses surface on the merge commit.

Test plan

YAML validated; flag values mirror the justfile's TESTMON_SELECT/REFRESH defaults and cloud's test.yml
CI on this PR is itself the live test (first selective run)

🤖 Generated with Claude Code

Reviewed SHA: unknown
Verdict: invalid
Status: failure - BM Bossbot review output was invalid

Summary:
No summary provided.

Blocking findings:

None

Non-blocking findings:

None

The testmon cache (branch-scoped, falling back to main's baseline) has been in place since #928, but the flags pinned --testmon-noselect, so every PR build recorded data and still ran the full suite. Flip PR builds to --testmon --testmon-forceselect like basic-memory-cloud; pushes to main keep --testmon-noselect to refresh the baseline. Signed-off-by: phernandez <paul@basicmachines.co>

tests/scripts and tests/ci exercise bossbot scripts and workflow guards — pure CI tooling. They were running on all unit matrix legs (3 Pythons x 2 backends x 3 OSes). Move them to a single test-ci-tooling run inside the Static Checks job. Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99ea2514dd

CI tooling doesn't need product-suite tests burning time on every matrix leg. The bossbot status script is exercised by every PR run. Signed-off-by: phernandez <paul@basicmachines.co>

Unit legs intermittently hang mid-suite (FastMCP/asyncpg cleanup-hang family) and sit until the runner gives up, eating 20+ minutes per occurrence. pytest-timeout turns a hang into a fast failure with a stack dump naming the test. Signed-off-by: phernandez <paul@basicmachines.co>

The Postgres unit suite is the CI long pole. pytest-split divides the collection across a group matrix axis (3 shards x 3 Pythons), each shard a full job with its own Postgres service. Exit code 5 is treated as success in the recipe because a testmon-selected PR build can leave a shard empty. Testmon cache keys gain the shard group. Signed-off-by: phernandez <paul@basicmachines.co>
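The exit-code handling described above can be sketched in Python. This is a minimal illustration, not the recipe from the justfile: `normalize_exit_code` and `run_shard` are hypothetical names, and the only fact relied on is that pytest exits with code 5 when no tests were collected.

```python
import subprocess
import sys

# pytest's documented exit code for "no tests were collected".
NO_TESTS_COLLECTED = 5


def normalize_exit_code(returncode: int) -> int:
    """Treat an empty shard (exit code 5) as success; pass everything else through."""
    return 0 if returncode == NO_TESTS_COLLECTED else returncode


def run_shard(pytest_args: list) -> int:
    """Run pytest in a subprocess and return the normalized exit code (hypothetical helper)."""
    completed = subprocess.run([sys.executable, "-m", "pytest", *pytest_args])
    return normalize_exit_code(completed.returncode)
```

A shard would then be invoked with something like `run_shard(["--splits", "3", "--group", "1"])`, assuming pytest-split is installed. Real failures (exit code 1 and others) still propagate, so only the "nothing selected for this shard" case is forgiven.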

SQLite jobs carry the Python-version matrix; Postgres jobs carry backend coverage on 3.14 only. Postgres unit: 3 shards x 1 Python instead of 9 jobs; Postgres integration: 1 job instead of 3. Signed-off-by: phernandez <paul@basicmachines.co>

GitHub-hosted runners are free for public repos; Depot bills per minute. With testmon-selected PR builds, sharded Postgres units, and the semantic-search fixture fix, Depot's speed premium no longer justifies the spend. Signed-off-by: phernandez <paul@basicmachines.co>

…aseline A full-run .testmondata is a valid superset baseline for any shard: testmon selects impacted tests from it and pytest-split takes the shard's slice. Without this fallback every shard starts cold until the first post-merge main push records group-keyed baselines. Signed-off-by: phernandez <paul@basicmachines.co>
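The fallback order this commit describes might look like the following sketch. The key layout, the function name, and the exact ordering between a branch's full-run key and main's group key are all assumptions made for illustration; the actual keys live in the workflow and are not shown in this thread.

```python
from __future__ import annotations


def candidate_cache_keys(branch: str, job: str, group: int | None) -> list[str]:
    """Return cache keys to try, most specific first (key layout is hypothetical)."""
    keys = []
    # Try the branch's own cache first, then main's baseline.
    # dict.fromkeys removes the duplicate when the branch is main itself.
    for ref in dict.fromkeys([branch, "main"]):
        if group is not None:
            # Shard-specific baseline recorded by a previous sharded run.
            keys.append(f"testmon-{job}-{ref}-group{group}")
        # Full-run baseline: a superset that any shard can select from.
        keys.append(f"testmon-{job}-{ref}")
    return keys
```

With this ordering, a shard on a fresh branch falls through to main's full-run baseline instead of starting cold when no group-keyed baseline exists yet.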

A change-detection job gates every test job on code paths (src, tests, test-int, alembic, pyproject, uv.lock, justfile, the workflow itself). Docs-only rounds finish in under a minute with all jobs skipped, while the workflow still concludes successfully so the BM Bossbot gate keeps firing and the PR stays mergeable. Signed-off-by: phernandez <paul@basicmachines.co>
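The gating decision can be illustrated with a small Python sketch. The glob patterns below are assumptions modeled on the path list in the commit message, not the workflow's actual filter, and `touches_code` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Assumed patterns; the real filter is defined in .github/workflows/test.yml.
CODE_PATTERNS = [
    "src/*",
    "tests/*",
    "test-int/*",
    "alembic/*",
    "pyproject.toml",
    "uv.lock",
    "justfile",
    ".github/workflows/test.yml",
]


def touches_code(changed_files: list) -> bool:
    """Return True if any changed file matches a code pattern.

    fnmatchcase's '*' also matches '/', so 'src/*' covers nested paths.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in CODE_PATTERNS
    )
```

Under these assumed patterns, `touches_code(["docs/guide.md"])` is false and the test jobs would be skipped, while any change under `src/` turns them back on. A change confined to a directory that is not listed, such as `scripts/`, would also be skipped.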

chatgpt-codex-connector · 2026-06-10T16:05:27Z

💡 Codex Review

basic-memory/.github/workflows/test.yml

Line 45 in 3ba50e5

- '.github/workflows/test.yml'

Include CI script changes in the test filter

When a PR branch only changes an executable CI helper such as scripts/testmon_cache.py, scripts/bm_bossbot_status.py, or scripts/generate_pr_infographic.py, this new paths filter leaves needs.changes.outputs.code false, so every static/test job is skipped even though the Tests workflow still succeeds and triggers the BM Bossbot workflow_run gate. These scripts are invoked by the justfile and BM Bossbot workflow, so script-only regressions can now merge without the Python checks that previously covered them; add the relevant scripts/** entries to this filter or route those scripts through another required check.

The Tests workflow only triggers on push; PR-branch rounds are push events, so the pull_request conditional never fired and selection was dead on arrival. Branch pushes now select; main pushes record. The paths-filter gets an explicit main base for branch pushes. Signed-off-by: phernandez <paul@basicmachines.co>
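Since every run is a push event, the select-versus-record decision has to key off the pushed ref rather than the event type. A minimal sketch of that rule follows; the function name is hypothetical, and reading `GITHUB_REF` is an assumption about how the ref would be obtained. The flag strings are the ones named in this PR.

```python
import os

# Flag values as described in the PR summary.
SELECT_FLAGS = "--testmon --testmon-forceselect"  # run impacted tests only
RECORD_FLAGS = "--testmon-noselect"               # run everything, refresh baseline


def flags_for_ref(ref: str) -> str:
    """Pushes to main record the baseline; every other ref selects against it."""
    return RECORD_FLAGS if ref == "refs/heads/main" else SELECT_FLAGS


if __name__ == "__main__":
    # GITHUB_REF is set by GitHub Actions; when unset this sketch falls back to selection.
    print(flags_for_ref(os.environ.get("GITHUB_REF", "")))
```

Keying on the ref avoids the dead `pull_request` conditional: a push to `refs/heads/perf/ci-testmon-select-on-prs` selects, and only `refs/heads/main` records.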

chatgpt-codex-connector · 2026-06-10T18:11:03Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

The LLM review gate burned API tokens, failed unrecoverably during the GitHub auth outage, and ended up deadlocking its own replacement PR. The workflow is disabled and its required check removed from the main ruleset; this deletes the workflow, the status/infographic scripts, and the review prompt/schema. Merge discipline (green tests + zero unresolved review threads) is enforced by the merge tooling. Signed-off-by: phernandez <paul@basicmachines.co>

The batch-indexing race has now flaked three CI rounds today. Skipped under CI only (still runs locally); #940 tracks the root cause. Signed-off-by: phernandez <paul@basicmachines.co> (cherry picked from commit 513fef7)

chatgpt-codex-connector · 2026-06-10T19:03:46Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

phernandez added 2 commits June 10, 2026 10:20

chatgpt-codex-connector Bot reviewed Jun 10, 2026

Comment thread .github/workflows/test.yml Outdated

phernandez added 7 commits June 10, 2026 10:21

chore(ci): delete bossbot script and workflow guard tests

859c945

phernandez and others added 2 commits June 10, 2026 12:08

Merge branch 'main' into perf/ci-testmon-select-on-prs

73abe4e

phernandez added 2 commits June 10, 2026 14:03

This was referenced Jun 10, 2026

feat(ci): make BM Bossbot a deterministic merge gate #942

Closed

docs(ci): explain bm bossbot pr flow #939

Closed

phernandez merged commit 2f7ef13 into main Jun 10, 2026
27 checks passed

phernandez deleted the perf/ci-testmon-select-on-prs branch June 10, 2026 19:21

phernandez merged 13 commits into main from perf/ci-testmon-select-on-prs
