Skip to content

fix(core): split comma-separated tags in parse_tags#911

Merged
phernandez merged 2 commits into
mainfrom
fix/parse-tags-comma
Jun 8, 2026
Merged

fix(core): split comma-separated tags in parse_tags#911
phernandez merged 2 commits into
mainfrom
fix/parse-tags-comma

Conversation

@phernandez

Copy link
Copy Markdown
Member

Summary

bm tool write-note --tags "a,b" stored a single mangled tag a,b instead of splitting into ["a", "b"], diverging from the MCP write_note(tags="a,b") convention. Typer collects a single --tags "a,b" value into the one-element list ["a,b"], and parse_tags() only split a bare string on commas — list elements were returned verbatim, so the comma survived as part of a single literal tag.

parse_tags() (src/basic_memory/utils.py) now flattens by splitting each list element on commas before stripping #/whitespace, so "a,b", ["a,b"], and ["a", "b"] all converge to ["a", "b"]. This is the single source of truth, so CLI --tags "a,b", CLI --tags a --tags b, and the MCP comma/list/JSON forms now agree.

The search_notes tags coercer is intentionally left unchanged — its comma behavior is tracked separately as #910.

Bugs fixed

Testing

  • uv run pytest test-int/bughunt_fixes/test_parse_tags_comma_split_integration.py -q --no-cov — 2 passed (CLI/MCP convergence + MCP baseline; the bug-hunt integration test that failed on main now passes).
  • uv run pytest tests/utils/test_parse_tags.py -q --no-cov — 23 passed (added list-comma parametrize cases).
  • uv run pytest tests/mcp/test_tool_write_note.py test-int/cli/ tests/mcp/test_tool_contracts.py tests/mcp/test_tool_telemetry.py -q --no-cov — 116 passed (no regressions; tool signatures unchanged).
  • uv run ruff check . --fix, uv run ruff format ., uv run ty check src tests test-int — all clean for scoped files.

Risk

Low. Behavior change is confined to the parse_tags() list branch and only affects list elements that themselves contain commas — previously these produced an invalid single-literal tag. No existing test relied on commas being preserved inside a list element. The search-notes coercer is untouched.

🤖 Generated with Claude Code

Bug #7 (integration bug hunt): `bm tool write-note --tags "a,b"` stored one
mangled tag "a,b" instead of splitting into ["a","b"], diverging from the MCP
write_note convention. Typer collects a single `--tags "a,b"` into the
one-element list ['a,b'], and parse_tags() only split a bare string on commas,
leaving list elements verbatim.

parse_tags() now flattens by splitting each list element on commas, so "a,b",
["a,b"], and ["a","b"] all converge to ["a","b"]. The search_notes 'tags'
coercer is intentionally left unchanged (tracked separately as #910).

Tests come from the integration bug hunt: a CLI/MCP convergence test under
test-int/bughunt_fixes/ plus unit parametrize cases in tests/utils/test_parse_tags.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Claude finished @phernandez's task in 1m 35s —— View job


PR Review: fix(core): split comma-separated tags in parse_tags

Tasks:

  • Gather context and understand the request
  • Read changed files
  • Review code against team checklist
  • Post review findings

Summary

This is a clean, minimal bug fix. The root cause is well-diagnosed: Typer wraps a single --tags "a,b" argument into ["a,b"], and the old parse_tags() list branch returned list elements verbatim — skipping the comma split that the string branch performed. The fix (splitting each list element on commas before stripping) is the right single source of truth.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions
  • Python 3.12+ type annotations maintained
  • 100-character line limit respected
  • No commented-out code
  • Literate programming style followed — the new Trigger / Why / Outcome comment block in utils.py:529–533 is well-written and matches the project's documented convention

Testing & Documentation

  • Unit tests added — 2 new parametrize cases in tests/utils/test_parse_tags.py
  • Integration tests added — test-int/bughunt_fixes/test_parse_tags_comma_split_integration.py
  • Edge cases covered (single-element list with comma, mixed list, whitespace trimming)
  • 100% coverage goal maintained (no net-new uncovered lines)

Basic Memory Architecture

Security & Performance

  • No secrets or credentials
  • No sensitive data in logs
  • No Windows/unicode compatibility regressions introduced

Findings

No blocking issues. Three minor observations:

1. Defensive str(raw) cast is harmless but unnecessary (utils.py:537)

for tag in str(raw).split(",")

The type annotation is Union[List[str], str, None], so raw is already a str when we're inside the isinstance(tags, list) branch. The str() call is a no-op in all reachable paths. Not worth changing, but worth noting it's defensive.

2. Pre-existing duplicate import json inside parse_tags() (utils.py:545)

json is already imported at the module top (line 3). The inline import json inside the function was pre-existing and is not introduced by this PR — flagging only for awareness, not a blocker.

3. New bughunt_fixes/ test sub-directory (test-int/bughunt_fixes/)

This is a new directory without precedent in test-int/. That's fine for a deliberate regression-guard pattern, but the team may want to settle on a naming convention (e.g. test-int/regression/) if more bughunt tests accumulate. The __init__.py and module docstring are properly in place.


Conclusion

The fix is correct and well-scoped. The if tag and tag.strip() filter correctly handles empty segments from double-commas (a,,b) and whitespace-only splits. The integration test verifies end-to-end CLI ↔ MCP convergence, which is the most important behavioral guarantee. Ready to merge.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfef6f4464

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/basic_memory/utils.py
return [
tag.strip().lstrip("#")
for raw in tags
for tag in str(raw).split(",")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip null list entries before stringifying

When frontmatter or a JSON tag array contains a null entry, e.g. tags: [alpha, null], normalize_frontmatter_metadata() preserves that None and passes the list into parse_tags(). The old list branch skipped falsy entries, but this change stringifies each raw item before filtering, so None becomes the literal tag "None" and can be indexed or written back as a real tag. Filter raw/None before splitting so comma-separated strings are still flattened without reviving previously ignored null tag values.

Useful? React with 👍 / 👎.

Codex review: parse_tags must not revive a null list entry (YAML 'tags: [alpha, null]') as the literal tag 'None' — skip None before stringifying. Also fix the comma-split integration test's frontmatter assertion to use splitlines() so it passes on Windows CRLF.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Claude finished @phernandez's task in 1m 35s —— View job


PR Review: fix(core): split comma-separated tags in parse_tags

Tasks:

  • Gather context and understand the request
  • Read changed files (current state after 2nd commit)
  • Review code against team checklist
  • Post review findings

Summary

This is a well-scoped, clean bug fix. The root cause (Typer collecting --tags "a,b" into ["a,b"] which bypassed the comma-split path) is correctly diagnosed and fixed at the single source of truth. The second commit (2be0731) proactively addressed the Codex finding about None list entries being stringified to the literal tag "None" — that's a good sign of attentive iteration.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions — Trigger/Why/Outcome comment pattern used correctly (utils.py:529–535)
  • Python 3.12+ type annotations maintained — no regressions
  • 100-character line limit respected
  • No commented-out code blocks
  • SQLAlchemy / FastAPI / Typer conventions not touched (out of scope)

Testing & Documentation

  • Unit tests added — 3 new parametrize cases in tests/utils/test_parse_tags.py (list-comma split, mixed list+comma, None entry)
  • Integration tests added — test-int/bughunt_fixes/test_parse_tags_comma_split_integration.py with both CLI and MCP convergence assertions
  • Edge cases covered — double-comma ("a,,b"), trailing/leading whitespace, None entries, mixed lists
  • 100% coverage maintained — no net-new uncovered lines introduced

Basic Memory Architecture

Security & Performance

  • No hardcoded secrets or credentials
  • No sensitive data in logs
  • Input validation maintained (falsy-filter if tag and tag.strip() handles double-commas and whitespace-only splits)
  • No Windows/unicode compatibility regressions

Compatibility

  • No emoji/unicode in console output
  • Integration test uses content.splitlines() for line-ending agnosticism (CRLF-tolerant — addressed in 2nd commit)

Findings

No blocking issues. Three minor observations for awareness:

1. Type annotation doesn't reflect None list elements (utils.py:511)

def parse_tags(tags: Union[List[str], str, None]) -> List[str]:

The function now explicitly handles None inside the list (which the old code silently ignored via the falsy filter). The accurate annotation in Python 3.12 style would be list[str | None] | str | None. This is low-priority since static checkers won't flag callers passing None elements anyway (YAML frontmatter parsing produces these at runtime), but it's a documentation gap. Fix this →

2. Pre-existing inline import json inside parse_tags() (utils.py:547)

json is already imported at module top (line 3). The inline import is pre-existing and not introduced by this PR — noting only for completeness, not a blocker.

3. New test-int/bughunt_fixes/ directory convention

This is a new directory pattern without precedent in test-int/. It's properly structured (with __init__.py and a module docstring). If more regression guards accumulate, the team may want to settle on a canonical name (e.g., test-int/regression/ or test-int/bughunts/). Not a blocker for this PR.


Conclusion

The fix is correct and the None-entry guard (added in the 2nd commit in response to the Codex review) closes the only realistic edge case. The integration test verifies CLI ↔ MCP convergence end-to-end, which is the key behavioral guarantee. Ready to merge.

@phernandez phernandez merged commit 85a8e59 into main Jun 8, 2026
25 checks passed
@phernandez phernandez deleted the fix/parse-tags-comma branch June 8, 2026 03:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant