Skip to content

fix(mcp): tighten search_notes tags input and normalize for direct callers#941

Open
phernandez wants to merge 5 commits into
mainfrom
fix/910-tags-strict-normalization
Open

fix(mcp): tighten search_notes tags input and normalize for direct callers#941
phernandez wants to merge 5 commits into
mainfrom
fix/910-tags-strict-normalization

Conversation

@phernandez

@phernandez phernandez commented Jun 10, 2026

Copy link
Copy Markdown
Member

Follow-up to #932, which switched the search_notes tags parameter from BeforeValidator(coerce_list) to BeforeValidator(parse_tags) but merged with two unresolved P2 Codex review findings. Refs #910, follow-up to #932.

Finding 1 — Reject malformed tag values before parsing

parse_tags stringifies unsupported types: tags=42 became ["42"], tags={"a": 1} became junk tags. The old coerce_list path let Pydantic reject non-list/str values, so #932 weakened the tool boundary — caller type mistakes turned into silent no-result searches.

Fix: new strict_search_tags wrapper in utils.py used as the BeforeValidator. It only normalizes str / list / None (delegating to parse_tags) and passes any other type through unchanged so Pydantic fails validation with a clear error. parse_tags itself is untouched, so its frontmatter / write_note callers keep their existing behavior.

Finding 2 — Normalize tags for direct tool callers too

The comma splitting lived only in the BeforeValidator, which runs only through the MCP validation layer. src/basic_memory/cli/commands/tool.py imports search_notes and calls it directly with Typer's collected list, so bm tool search-notes --tag alpha,beta forwarded ["alpha,beta"] as one literal tag and silently matched nothing.

Fix: the search_notes function body now runs parse_tags(tags) alongside the other filter-param normalization (before the tag: shorthand merge and SearchQuery construction), so direct callers get the same comma-split/list semantics. The BeforeValidator is retained for the MCP boundary strictness of finding 1; parse_tags is idempotent, so MCP-validated input passes through unchanged.

Test plan

All new tests were verified to fail on unfixed code (src changes stashed) before the fix was applied.

  • tests/test_coerce.py::TestStrictSearchTags — unit coverage of the new coercer: None/comma-string/list/JSON-array normalization, plus int/dict passthrough for Pydantic rejection
  • tests/mcp/test_tool_search.py::test_search_notes_tags_annotation_rejects_non_string_typestags=42 and tags={"a": 1} raise ValidationError through the annotation's TypeAdapter
  • tests/mcp/test_tool_search.py::test_search_notes_tags_invalid_type_rejected_via_mcp — same inputs through a real fastmcp Client raise ToolError
  • tests/mcp/test_tool_search.py::test_search_notes_direct_call_splits_comma_tags — direct search_notes(tags=["alpha,beta"]) call (the CLI path) matches notes tagged alpha, with a negative control
  • Existing [BUG] search_notes(tags="a,b") ignores comma-separated tags (inconsistent with the tag: query shorthand) #910 tests still pass: tags="a,b", ["a","b"], '["a","b"]', "a", None
  • uv run pytest tests/mcp/test_tool_search.py tests/test_coerce.py tests/utils/ -x -q → 239 passed
  • ruff format / ruff check / ty check clean on all touched files

🤖 Generated with Claude Code

Reviewed SHA: cb4a4ad7c64a19ad84cf648e323dcce8d6cfc90c
Verdict: approve
Status: success - BM Bossbot approved this head SHA

Summary:
Reviewed the full provided diff for the latest head SHA. The prior JSON-array string validation gap is covered by the updated strict_search_tags logic and regression tests, and I did not find remaining merge-blocking correctness, boundary, or compatibility issues. I could not run the project tests locally because the environment lacks installed Python dependencies, so this approval is based on static review of the supplied diff.

Blocking findings:

  • None

Non-blocking findings:

  • None

BM Bossbot image for PR #941

BM Bossbot image choices
  • Pull request: #941
  • Generated asset: /home/runner/work/basic-memory/basic-memory/docs/assets/infographics/pr-941.webp
  • Image model: gpt-image-2
  • Size: 1536x1024
  • Quality: high
  • Image mode: editorial-image
  • Theme source: auto

Theme / visual direction:

brutalist civic planning: concrete signage, zoning blocks, transit diagrams, infrastructure maps, and stern public-service clarity

…llers

Follow-up to #932, addressing two P2 Codex review findings that merged
unresolved:

1. Reject malformed tag values at the tool boundary. The bare
   parse_tags BeforeValidator stringified unsupported types (tags=42
   became ["42"], dicts became junk tags), silently turning caller
   mistakes into no-result searches. New strict_search_tags wrapper in
   utils.py only normalizes str/list/None and passes other types
   through unchanged so Pydantic rejects them with a clear validation
   error. parse_tags itself is unchanged for its frontmatter and
   write_note callers.

2. Normalize tags for direct callers too. The comma splitting lived
   only in the BeforeValidator, which runs only through the MCP
   validation layer. cli/commands/tool.py calls search_notes directly
   with Typer's collected list, so `bm tool search-notes --tag a,b`
   forwarded ["a,b"] as one literal tag and matched nothing. The
   function body now runs parse_tags where the other filter params are
   normalized, so every path gets the same comma-split semantics.

Refs #910

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a62bac4db

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/basic_memory/utils.py
A JSON array with malformed entries like tags=[42] or tags=[{"a": 1}]
was treated as a valid list and parse_tags stringified each element
([42] -> ["42"]) before Pydantic validated List[str], silently turning
caller type mistakes into no-result searches.

strict_search_tags now only normalizes lists whose elements are all
strings; non-conforming lists pass through unchanged so Pydantic rejects
them with a clear validation error. str and None behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 72350e5bd4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/basic_memory/utils.py
strict_search_tags guarded native lists with non-string elements but still
delegated JSON-array strings directly to parse_tags, whose recursive JSON
handling stringified bad elements ('[42]' -> ["42"]) before Pydantic could
validate List[str] — preserving the silent no-result behavior for MCP
clients that serialize arrays as strings.

Mirror parse_tags' JSON-array detection at the boundary: if the parsed list
contains any non-string element, pass the original string through unchanged
so Pydantic rejects it with a clear validation error. Valid all-string JSON
arrays and plain comma strings keep working as before. parse_tags itself is
unchanged (frontmatter callers rely on its lenient behavior).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@phernandez

Copy link
Copy Markdown
Member Author

Addressed the BM Bossbot blocking finding on the JSON-array-string tags path in f0290fe.

strict_search_tags now mirrors parse_tags' JSON-array detection for string inputs: when a string parses as a JSON array containing any non-string element (e.g. '[42]', '[{"a": 1}]', '["ok", 42]'), the original value passes through unchanged so Pydantic rejects it with a clear List[str] validation error instead of parse_tags recursively stringifying it into junk tags and a silent no-result search. Valid all-string JSON arrays ('["a","b"]') and plain comma strings still normalize exactly as before, and parse_tags itself is untouched for lenient frontmatter callers.

Coverage: unit tests in tests/test_coerce.py (TestStrictSearchTags) plus TypeAdapter and fastmcp Client tests in tests/mcp/test_tool_search.py; all three malformed inputs verified rejected at the MCP layer (ToolError) and the new tests fail with the fix reverted.

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant