feat(core): add observation category filter to search#908
Conversation
Add a `categories` filter that scopes observation results to exact category matches, mirroring the existing entity_types/note_types filter end-to-end. Previously `entity_types=["observation"]` only narrowed to the observation row type, so a text search for "requirement" still returned [decision] observations whose body merely mentions the word. The new `categories` parameter adds a parameterized `search_index.category IN (...)` predicate so only rows whose indexed category matches exactly survive (in FTS, vector, and hybrid modes). The search_notes docstrings advertised a 'category:observation' token that was never implemented (no token parsing in code); those misleading lines are removed and replaced with the real `categories=[...]` usage. Plumbing: - SearchQuery gains `categories` (included in no_criteria()). - SearchService threads it through _PreparedSearchQuery, has_criteria, _prepared_has_filters, _search_repository, and _count_repository. - SearchRepository protocol + base abstract search/count, plus _search_vector_only and _search_hybrid (filter re-scan path) carry the new param. - SQLite and Postgres _build_fts_query_parts add the parameterized IN condition. - search_notes MCP tool exposes `categories` (with `category` alias) and wires it into SearchQuery and the search_all_projects fan-out. Tests (integration-first, both backends): - tests/repository/test_search_repository.py + test_postgres_search_repository.py: exact-category match excludes a [decision] obs that mentions "requirement". - tests/services/test_search_service.py: categories propagates through _prepare_query/has_criteria and count agrees with search. - tests/mcp/test_tool_search.py: end-to-end via the MCP tool (FTS + count path). Closes #430 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 46s —— View job PR Review: feat(core): add observation category filter to searchTasks
Code Quality & Standards
Testing & Documentation
Basic Memory Architecture
Security & Performance
Compatibility
Findings🔴 Bug:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2df4e737fd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if categories: | ||
| search_query.categories = categories |
There was a problem hiding this comment.
Don't force entity results when filtering categories
When an MCP caller supplies categories without also supplying entity_types, this assignment makes SearchQuery.no_criteria() pass, but the later default still adds entity_types=[entity]. The repository then searches with both type = 'entity' and category IN (...); because entity rows have NULL categories, category-only searches such as search_notes(project="p", categories=["requirement"]) return no results even though the schema/service/repository now support categories as a standalone criterion and the no-criteria message advertises categories as sufficient. Infer observation or skip the entity default when a category filter is present.
Useful? React with 👍 / 👎.
search_notes defaults entity_types to 'entity' when unset (issue #31), but categories only exist on observations — so search_notes(categories=[...]) without entity_types ANDed the category against entity rows (NULL category) and returned nothing. When categories is supplied without an explicit entity_types, default to observations instead. Adds a regression test for the categories-only path (Codex review on #908). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 1m 29s —— View job PR Review: feat(core): add observation category filter to searchTasks
Code Quality & Standards
Testing & Documentation
Basic Memory Architecture
Security & Performance
Compatibility
Findings✅ Bug fixed:
|
… param Adding the categories param to search_notes changed its stable signature and added a category_filter_count telemetry attribute; update test_tool_contracts.py and test_tool_telemetry.py expectations accordingly. These suites weren't run in the original PR check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 1m 27s —— View job PR Review: feat(core): add observation category filter to searchTasks
Code Quality & Standards
Testing & Documentation
Basic Memory Architecture
Security & Performance
Findings✅ Commit
|
Summary
Adds a
categoriesfilter to search so observation results can be scoped to an exact category, e.g.search_notes(..., entity_types=["observation"], categories=["requirement"])returns only observations whose category is exactlyrequirement— not every row mentioning the word.Previously
entity_types=["observation"]only narrowed to the observation row type, so a text search for "requirement" still surfaced[decision]observations whose body merely mentions the word. This mirrors the existingentity_types/note_typesfilter end-to-end through schema → service → repository (FTS, vector, and hybrid modes).Also fixes the misleading
category:observationtoken that thesearch_notesdocstrings advertised but that was never implemented in code (there is no token parsing anywhere) — those lines now show the realcategories=[...]usage.What changed
SearchQuery.categories: Optional[List[str]], included inno_criteria()._PreparedSearchQuery,_prepare_query,has_criteria,_prepared_has_filters,_search_repository,_count_repository.search/count,_dispatch_retrieval_mode,_search_vector_only(incl. thefilter_requestedre-scan), and_search_hybrid.search_index.category IN (:category_0, ...)condition in_build_fts_query_parts, threaded throughsearch/count.categoriesparam (withcategoryalias,coerce_list), wired intoSearchQueryand thesearch_all_projectsfan-out; misleadingcategory:docstring lines fixed.categoriesincluded in thehas_filtersspan signal. No router signature change needed (it passesSearchQuery.model_dump()).Testing (commands + results)
Integration-first, both backends.
uv run pytest tests/repository/test_search_repository.py tests/repository/test_hybrid_fusion.py tests/repository/test_vector_threshold.py tests/repository/test_vector_pagination.py tests/repository/test_semantic_search_base.py tests/services/test_search_service.py tests/mcp/test_tool_search.py -q→ 189 passed (SQLite)BASIC_MEMORY_TEST_POSTGRES=1 uv run pytest tests/repository/test_postgres_search_repository.py tests/services/test_search_service.py tests/mcp/test_tool_search.py -q→ 127 passed (Postgres, testcontainers)uv run ruff check .→ clean on changed files;uv run ruff format --check→ cleanuv run ty check src tests test-int→ All checks passed!New tests:
test_search_categories_exact_match(SQLite) /test_postgres_search_categories_exact_match(Postgres): a[decision]observation that mentions "requirement" is excluded whencategories=["requirement"]; standalone filter +count()agree; multi-category union works.test_search_categories_filter/test_search_categories_only_is_not_no_criteria(service): propagation through_prepare_query/has_criteria,countagrees withsearch,no_criteria()correctness.test_search_with_categories_filter(MCP tool): end-to-end viasearch_notes(FTS + exact-count path), decision-scoped search does not leak requirement rows.Risk
Low. Purely additive optional parameter; existing callers (context_service, routers, clients) are unaffected since they use keyword args and
categoriesdefaults toNone. No schema/migration change — thecategorycolumn is already indexed.Closes #430
🤖 Generated with Claude Code