Calendar Day
Wednesday, June 11, 2026
Planned Effort
8 story points (one PR) — combines sprint items #4 + #6:
| Sprint item |
Points |
Topic |
| #4 |
5 |
Fuzz/adversarial testing for JSONL parser |
| #6 |
3 |
macOS CI in matrix |
Depends on:
- Monday #1 — ruff + pip-audit job must run on
macos-latest once matrix expands
- Tuesday #3 — narrowed TypedDicts define fuzz oracle (merge TypedDict PR before fuzz PR)
Out of scope: performance benchmarks (Thursday #7), frontend changes, parser API redesign.
Problem
- Verification gap: ~4,500 lines of tests cover known-good JSONL and fixtures, but no fuzz or adversarial testing targets
utils/jsonl_parser.py. Schema drift from upstream Claude Code is discovered via user reports, not CI.
- Portability gap: CI runs on
ubuntu-latest and windows-latest (Week 1) but not macos-latest, despite macOS setup docs and POSIX fcntl locking that differs from Windows msvcrt.
Goal
One merged PR that:
- Adds Hypothesis fuzz tests proving
parse_session does not crash on adversarial inputs.
- Extends the CI matrix to
macos-latest for all cross-platform test jobs (and Monday’s lint/audit job).
- Keeps fuzz CI runtime under ~60 seconds via bounded example counts.
Scope
A — Fuzz / adversarial tests (5 pt)
Touch points: requirements-dev.txt, tests/test_parser_fuzz.py (new), optional tests/conftest.py / pyproject.toml Hypothesis profile, utils/jsonl_parser.py (fixes only if fuzz exposes bugs)
- Add
hypothesis>=6.100.0 to dev dependencies.
- Create
tests/test_parser_fuzz.py targeting parse_session(filepath) via temp .jsonl files.
Required strategy coverage:
| Category |
Approach |
| Malformed JSON lines |
Random text lines, invalid JSON |
| Truncated lines |
Partial JSON (concurrent-write simulation) |
| Unknown record types |
type not in user/assistant/system/progress |
| Missing / extra fields |
st.fixed_dictionaries with optional keys removed or added |
| Deep nesting |
Recursive JSON dict/list strategies |
| Long lines |
10k–50k char payloads inside JSON |
| Empty lines |
Blank lines between records |
| Null bytes |
Binary-safe write + UTF-8 decode with errors="replace" |
Invariants:
parse_session must not raise unhandled exceptions for any fuzzed file input.
- Acceptable: normal
SessionDict return; skipped malformed lines (existing behavior).
- If
SessionValidationError or similar structured errors appear, either harden processors or document as allowed — prefer hardening so fuzz passes cleanly.
Required explicit adversarial test:
- Unknown
type value (e.g. totally-new-claude-record) plus one valid user line → session parses, entry_counts reflects unknown type, no crash.
CI budget:
@settings(max_examples=200) per strategy (tune down if needed).
- CI profile via
conftest.py when CI=true (e.g. max_examples=100).
- Full fuzz module should complete in < 60s on
ubuntu-latest.
Ground truth for structured strategies: tests/test_jsonl_parser.py, parametrized tool fixtures, Tuesday TypedDict unions in models/.
B — macOS CI matrix (3 pt)
Touch points: .github/workflows/ci.yml, README.md or CONTRIBUTING.md
Add macos-latest to strategy.matrix.os for:
| Job |
macOS? |
pytest |
yes |
integration-tests |
yes |
js-tests |
yes |
lint-and-audit (Monday) |
yes |
mypy |
no (ubuntu-only) |
prod-install-smoke |
no (bash heredoc) |
Requirements:
- All existing tests pass on macOS without new platform skips (unless genuine difference documented).
fcntl.flock export lock path exercised on macOS (existing tests/test_export_state_store.py should suffice).
- Monday ruff + pip-audit steps pass on macOS runner.
- macOS job time ≤ 2× Ubuntu job (monitor first run; optimize example count if needed).
Docs: Note three-platform CI in README or CONTRIBUTING.
Acceptance Criteria
Fuzz (#4)
macOS CI (#6)
General
Calendar Day
Wednesday, June 11, 2026
Planned Effort
8 story points (one PR) — combines sprint items #4 + #6:
Depends on:
macos-latestonce matrix expandsOut of scope: performance benchmarks (Thursday #7), frontend changes, parser API redesign.
Problem
utils/jsonl_parser.py. Schema drift from upstream Claude Code is discovered via user reports, not CI.ubuntu-latestandwindows-latest(Week 1) but notmacos-latest, despite macOS setup docs and POSIXfcntllocking that differs from Windowsmsvcrt.Goal
One merged PR that:
parse_sessiondoes not crash on adversarial inputs.macos-latestfor all cross-platform test jobs (and Monday’s lint/audit job).Scope
A — Fuzz / adversarial tests (5 pt)
Touch points:
requirements-dev.txt,tests/test_parser_fuzz.py(new), optionaltests/conftest.py/pyproject.tomlHypothesis profile,utils/jsonl_parser.py(fixes only if fuzz exposes bugs)hypothesis>=6.100.0to dev dependencies.tests/test_parser_fuzz.pytargetingparse_session(filepath)via temp.jsonlfiles.Required strategy coverage:
typenot inuser/assistant/system/progressst.fixed_dictionarieswith optional keys removed or addederrors="replace"Invariants:
parse_sessionmust not raise unhandled exceptions for any fuzzed file input.SessionDictreturn; skipped malformed lines (existing behavior).SessionValidationErroror similar structured errors appear, either harden processors or document as allowed — prefer hardening so fuzz passes cleanly.Required explicit adversarial test:
typevalue (e.g.totally-new-claude-record) plus one validuserline → session parses,entry_countsreflects unknown type, no crash.CI budget:
@settings(max_examples=200)per strategy (tune down if needed).conftest.pywhenCI=true(e.g.max_examples=100).ubuntu-latest.Ground truth for structured strategies:
tests/test_jsonl_parser.py, parametrized tool fixtures, Tuesday TypedDict unions inmodels/.B — macOS CI matrix (3 pt)
Touch points:
.github/workflows/ci.yml,README.mdorCONTRIBUTING.mdAdd
macos-latesttostrategy.matrix.osfor:pytestintegration-testsjs-testslint-and-audit(Monday)mypyprod-install-smokeRequirements:
fcntl.flockexport lock path exercised on macOS (existingtests/test_export_state_store.pyshould suffice).Docs: Note three-platform CI in README or CONTRIBUTING.
Acceptance Criteria
Fuzz (#4)
tests/test_parser_fuzz.pyexists using Hypothesistypewith graceful degradationpytest -qpasses locally and in CImacOS CI (#6)
macos-latestin CI matrix for pytest, integration, js-tests, lint-and-auditfcntllocking path verified on macOS CIGeneral
mypyandruff check .pass