Skip to content

test: add Hypothesis fuzz coverage and macos-latest CI matrix#75

Open
clean6378-max-it wants to merge 6 commits into
masterfrom
test/jsonl-fuzz-macos-ci
Open

test: add Hypothesis fuzz coverage and macos-latest CI matrix#75
clean6378-max-it wants to merge 6 commits into
masterfrom
test/jsonl-fuzz-macos-ci

Conversation

@clean6378-max-it

@clean6378-max-it clean6378-max-it commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

closes #71

Summary

  • Add Hypothesis fuzz tests for parse_session (malformed/truncated JSONL, unknown types, adversarial fields)
  • Harden utils/jsonl_parser.py so fuzzed inputs do not raise unhandled exceptions
  • Extend CI matrix to macos-latest (pytest, integration, vitest, lint-and-audit)
  • Document Ubuntu + Windows + macOS CI in CONTRIBUTING

Sprint items

Closes Chen W2 #4, #6 (single PR)

Test plan

  • pytest tests/test_parser_fuzz.py -v
  • pytest -q
  • mypy -p api -p utils -p models
  • ruff check .
  • npm test
  • GitHub Actions: macos-latest matrix legs green for pytest, integration-tests, js-tests, lint-and-audit

Summary by CodeRabbit

  • Bug Fixes

    • Improved JSONL parser robustness to gracefully handle malformed data, including non-numeric token values, unknown record types, and invalid nested fields.
    • Enhanced timestamp parsing to strictly validate ISO format entries.
  • Tests

    • Added comprehensive fuzz testing suite for parser validation across edge cases and adversarial inputs.
  • Chores

    • Expanded CI test matrix to include macOS alongside Ubuntu and Windows.
    • Removed optional dependencies from build configuration.

Add tests/test_parser_fuzz.py with strategies for malformed/truncated JSONL, unknown record types, missing/extra fields, deep nesting, long lines, empty lines, and null bytes. Harden parse_session against adversarial inputs found by fuzz (non-dict JSON values, unhashable type keys, non-str metadata fields). Extend CI matrix to macos-latest for pytest, integration-tests, js-tests, and lint-and-audit. Document Ubuntu + Windows + macOS CI in CONTRIBUTING.
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR hardens the JSONL parser against malformed inputs with defensive type coercion and validation, adds comprehensive Hypothesis fuzz testing to verify crash resilience, extends CI matrix coverage to macOS, and updates deployment tooling for cross-platform Node compatibility.

Changes

Parser Robustness and Cross-Platform CI

Layer / File(s) Summary
CI Matrix Expansion and Hypothesis Configuration
requirements-dev.txt, tests/conftest.py, .github/workflows/ci.yml, package.json, CONTRIBUTING.md
Adds hypothesis>=6.100.0 to dev dependencies, registers Hypothesis dev and ci profiles in conftest with configurable max_examples, expands CI matrix to macos-latest for lint-and-audit, pytest, integration-tests, and js-tests jobs, introduces platform detection and explicit Rollup native binary installation in js-tests to address npm optional-dependencies issues, removes package.json optionalDependencies section, and updates CONTRIBUTING documentation to reflect three-platform CI coverage while preserving Ubuntu-only constraints for mypy and prod-install-smoke.
Defensive Parsing and Type Validation
utils/jsonl_parser.py
Implements _safe_int helper for defensive coercion of token and accounting fields with NaN/Infinity handling, skips non-dict JSONL entries, counts entry types as stringified keys in metadata, validates ISO timestamp strings before parsing wall-clock duration, restricts assistant model inclusion to non-empty non-synthetic strings, applies safe integer conversion throughout token/cache metrics, derives tool names with string guards and "unknown" fallback, and validates file-activity fields (path, bash command, web URL) to accept only non-empty strings.
Hypothesis Fuzz Testing Framework
tests/test_parser_fuzz.py
Introduces centralized FUZZ_SETTINGS configuration with function-scoped-fixture suppression, helper functions for crash-resilient parsing and UTF-8 JSONL writing with newline normalization, parameterized strategy building blocks for adversarial JSON values including NaN/Infinity support, a composite structured_entry strategy that conditionally generates record shapes with optional/missing/extra fields, and comprehensive Hypothesis-driven tests covering raw malformed lines, truncated JSON fragments, fuzzed structured records, deeply nested message content, long payloads, blank lines, embedded null bytes, plus deterministic tests for unknown record types, non-numeric token values, and non-finite JSON numerals ensuring all coerce gracefully to zero without exceptions.

Sequence Diagram(s)

sequenceDiagram
  participant GHA as GitHub Actions
  participant Matrix as os matrix<br/>(ubuntu/windows/macos)
  participant NPMInstall as npm ci
  participant RollupDetect as Rollup platform<br/>detection
  participant RollupInstall as native package<br/>install
  GHA->>Matrix: expand to macos-latest
  Matrix->>NPMInstall: run on each platform
  NPMInstall->>RollupDetect: invoke post-install step
  RollupDetect->>RollupInstall: select matching native binary
  RollupInstall-->>Matrix: complete with platform binary
Loading
sequenceDiagram
  participant Hypothesis as Hypothesis engine
  participant Strategy as fuzz strategies
  participant TempFile as temp JSONL
  participant Parser as parse_session
  participant Result as SessionDict
  Hypothesis->>Strategy: generate adversarial inputs
  Strategy->>TempFile: write malformed/fuzzed JSONL
  TempFile->>Parser: invoke with temp path
  Parser->>Result: return without unhandled exception
  Result-->>Hypothesis: validate session structure
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • cppalliance/claude-code-chat-browser#52: Hardens parse_session and _parse_tool_result with defensive field validation and fuzz robustness; overlaps with real-session fixture tests validating parse behavior and malformed-line handling.

Suggested reviewers

  • timon0305
  • wpak-ai

Poem

🐰 Hops through macOS with glee,
Fuzz tests ensure parser's spree,
No crashes from NaN,
Safe integers scan,
Cross-platform stability! 🎉

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.59% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Linked Issues check ❓ Inconclusive The PR addresses core fuzz testing (#4) and macOS CI matrix (#6) requirements. However, concerns exist about incomplete token arithmetic hardening and potential Hypothesis settings conflicts noted in review comments. Verify that _safe_int coercion is applied consistently to all token fields and that Hypothesis decorator settings do not override CI profile settings from conftest.py.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: adding Hypothesis fuzz coverage for the parser and expanding CI matrix to macos-latest.
Out of Scope Changes check ✅ Passed Changes align with stated objectives: fuzz tests, parser hardening, CI matrix expansion, docs updates, and dependencies. The removal of package.json optionalDependencies and rollup installation steps in CI support macOS platform support.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch test/jsonl-fuzz-macos-ci

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
requirements-dev.txt (1)

8-8: Update: Hypothesis>=6.100.0 is compatible with current releases and shows no known advisories

  • PyPI latest Hypothesis is 6.155.2; the hypothesis>=6.100.0 lower bound will allow installing current versions unless pinned elsewhere.
  • No security advisories were returned for Hypothesis from the queried GitHub security feed (PIP ecosystem).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements-dev.txt` at line 8, Update the dev requirements to allow modern
Hypothesis releases by ensuring the dependency line for hypothesis uses the
relaxed lower bound "hypothesis>=6.100.0" (package name: hypothesis) so current
PyPI versions (e.g., 6.155.2) can be installed; modify requirements-dev.txt to
replace any stricter pin for hypothesis with this >=6.100.0 bound and commit the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_parser_fuzz.py`:
- Around line 22-23: The helper _fuzz_jsonl_path uses tempfile.mkdtemp() which
leaks directories; replace it by using pytest's tmp_path fixture: remove or
change _fuzz_jsonl_path to accept a Path parameter (e.g., tmp_path) and return
tmp_path / name, then update each test that currently calls _fuzz_jsonl_path to
accept tmp_path in its signature and call the helper with tmp_path (or inline
use tmp_path / name) so temporary directories are managed and cleaned up by
pytest.

---

Nitpick comments:
In `@requirements-dev.txt`:
- Line 8: Update the dev requirements to allow modern Hypothesis releases by
ensuring the dependency line for hypothesis uses the relaxed lower bound
"hypothesis>=6.100.0" (package name: hypothesis) so current PyPI versions (e.g.,
6.155.2) can be installed; modify requirements-dev.txt to replace any stricter
pin for hypothesis with this >=6.100.0 bound and commit the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ce05ebaf-529d-411f-b105-1071ec83a6c6

📥 Commits

Reviewing files that changed from the base of the PR and between 8083f3e and 52db5dc.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • pyproject.toml
  • requirements-dev.txt
  • tests/conftest.py
  • tests/test_parser_fuzz.py
  • utils/jsonl_parser.py

Comment thread tests/test_parser_fuzz.py Outdated
Install platform Rollup binary after npm ci; drop Linux-only optionalDep.
Use pytest tmp_path in test_parser_fuzz.py instead of leaking mkdtemp dirs.
@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

@CodeRabbit. please review this PR in more depth

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -e

echo "=== tests/test_parser_fuzz.py ==="
cat tests/test_parser_fuzz.py

echo ""
echo "=== utils/jsonl_parser.py ==="
cat utils/jsonl_parser.py

echo ""
echo "=== tests/conftest.py ==="
cat tests/conftest.py

echo ""
echo "=== .github/workflows/ci.yml ==="
cat .github/workflows/ci.yml

echo ""
echo "=== pyproject.toml ==="
cat pyproject.toml

echo ""
echo "=== requirements-dev.txt ==="
cat requirements-dev.txt

echo ""
echo "=== package.json ==="
cat package.json

echo ""
echo "=== CONTRIBUTING.md ==="
cat CONTRIBUTING.md

Length of output: 40248



@clean6378-max-it — here is the detailed review.


Deep Review: PR #75 — Hypothesis Fuzz Tests + macOS CI

🔴 Critical / High

1. Three-way max_examples conflict — CI profile is effectively dead for fuzz tests

Hypothesis resolves settings in this priority order: per-test decorator > loaded profile > pyproject.toml default profile.

Source max_examples deadline Effective in CI?
pyproject.toml [tool.hypothesis] 200 default ❌ overridden
conftest.py CI profile (max_examples=100, deadline=None) 100 None ❌ overridden by decorator
FUZZ_SETTINGS decorator on every test 200 5000 ms ✅ wins

The conftest.py CI profile is completely overridden by @FUZZ_SETTINGS applied to every test. Each fuzz test will always run 200 examples, even on CI. The effort put into conftest.py and the CI-profile setup is wasted.

Fix: Either remove @FUZZ_SETTINGS from tests and rely on the profile + pyproject.toml, or use settings(parent=settings.get_profile("ci"), ...) in CI-aware settings, or don't apply deadline / max_examples in the decorator and let them come from the profile:

# Option A: remove the decorator entirely and rely on conftest.py profile
`@given`(st.lists(st.text(min_size=0, max_size=500), min_size=0, max_size=30))
def test_raw_line_soup_does_not_crash(tmp_path: Path, lines: list[str]) -> None: ...

# Option B: only suppress health checks (not override counts/deadlines)
FUZZ_SETTINGS = settings(suppress_health_check=[HealthCheck.function_scoped_fixture])

2. Token arithmetic TypeError — unhandled exception under fuzz

In utils/jsonl_parser.py, _process_assistant uses or 0 for token counting:

metadata["total_input_tokens"] += usage.get("input_tokens") or 0

If usage["input_tokens"] is a truthy non-numeric value (e.g., the string "five", or a non-empty list), then:

"five" or 0  →  "five"
int += "five"  →  TypeError: unsupported operand type(s) for +=: 'int' and 'str'

This is reachable via test_structured_entries_with_fuzzed_fields: the strategy can produce msg["usage"] = {"input_tokens": "non-empty-string"} (via st.dictionaries(st.text(), _json_value)). The same issue affects output_tokens, cache_read_input_tokens, cache_creation_input_tokens, and the two ephemeral cache fields in cache_creation.

Fix — coerce with int():

def _safe_int(val: Any) -> int:
    try:
        return int(val) if isinstance(val, (int, float)) else 0
    except (TypeError, ValueError):
        return 0

metadata["total_input_tokens"] += _safe_int(usage.get("input_tokens"))
metadata["total_output_tokens"] += _safe_int(usage.get("output_tokens"))
# ... and the four other token fields

🟡 Medium

3. ALLOWED_EXCEPTIONS — dead code branch

ALLOWED_EXCEPTIONS: tuple[type[BaseException], ...] = ()

def _parse_file_without_crash(path: str) -> None:
    try:
        parse_session(path)
    except Exception as exc:
        if ALLOWED_EXCEPTIONS and isinstance(exc, ALLOWED_EXCEPTIONS):  # always False
            return
        raise AssertionError(...) from exc

if ALLOWED_EXCEPTIONS on an empty tuple short-circuits to False immediately — the inner branch is never reachable. Either populate the tuple with actually-allowed exceptions or remove the conditional entirely to simplify the helper.


4. sys.path.insert anti-pattern in the test file

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

No other test file in the project does this (pytest resolves the path via testpaths = ["tests"] in pyproject.toml). This can shadow installed packages and makes the intent unclear. If conftest.py already sets up the path correctly for all other tests, this line is unnecessary and should be removed.


5. test_unknown_record_type_is_graceful — imprecise assertion

assert len(session["messages"]) >= 1

The JSONL has exactly 2 lines. The "totally-new-claude-record" type falls through all elif branches in parse_session and produces no message entry; only the "user" entry generates one. The count should be exactly 1. >= 1 would pass even if, say, the unknown-type path accidentally started inserting extra messages.

# More precise:
assert len(session["messages"]) == 1

6. CI profile deadline=None vs decorator deadline=5000 inconsistency

Even if the CI profile were effective, it sets deadline=None (disable per-example timeout) while FUZZ_SETTINGS sets deadline=5000 ms. The intent appears contradictory — CI should be less strict on timing, not more.


🟢 Low / Suggestions

7. _fuzz_jsonl_path is a trivial one-liner

def _fuzz_jsonl_path(tmp_path: Path, name: str) -> Path:
    return tmp_path / name

This adds no logic. Inline it (or keep it only if you plan to add validation logic later).


8. test_truncated_json_line — truncation logic is off-by-meaning

half = json.dumps(prefix)[: max(1, len(prefix) // 2)]

This truncates the JSON-encoded representation of prefix. For short or empty prefix, len(prefix) // 2 = 0max(1, 0) = 1, yielding just " (one double-quote). While this does exercise malformed input, it doesn't closely model real-world truncated writes (partial JSON objects mid-key or mid-value). A closer model would be directly slicing a serialized JSON object:

full_line = json.dumps({"type": "user", "message": {"content": prefix}})
truncated = full_line[: max(1, len(full_line) // 2)]

9. Rollup case fallback exits with exit 1 for unknown platforms

If a linux-arm64 or other non-listed runner is ever added to the matrix, the CI job will fail at the Rollup installation step rather than continuing. Consider a softer fallback:

*) echo "Warning: no pre-built Rollup binary for $(node -p "process.platform + '-' + process.arch"), skipping."; exit 0 ;;

10. Three coverage artifacts without a merge step

integration-tests now uploads coverage-report-ubuntu-latest, coverage-report-windows-latest, and coverage-report-macos-latest. If a Codecov upload or coverage comparison is added later, merging them will require extra steps. Consider whether macOS/Windows coverage needs to be uploaded at all (the Ubuntu artifact is sufficient for line-coverage reporting).


Summary Table

# Severity File Issue
1 🔴 High conftest.py, test_parser_fuzz.py, pyproject.toml Per-test @FUZZ_SETTINGS overrides CI profile — max_examples=200 always runs in CI
2 🔴 High utils/jsonl_parser.py Token += usage.get(key) or 0 crashes on truthy non-int values (e.g., strings from fuzz)
3 🟡 Medium tests/test_parser_fuzz.py ALLOWED_EXCEPTIONS = () makes inner exception branch dead code
4 🟡 Medium tests/test_parser_fuzz.py sys.path.insert anti-pattern — unnecessary in this project
5 🟡 Medium tests/test_parser_fuzz.py >= 1 assertion too loose in test_unknown_record_type_is_graceful
6 🟡 Medium conftest.py, test_parser_fuzz.py CI profile deadline=None vs. decorator deadline=5000 are contradictory
7 🟢 Low tests/test_parser_fuzz.py _fuzz_jsonl_path is a trivial wrapper — inline it
8 🟢 Low tests/test_parser_fuzz.py Truncation in test_truncated_json_line cuts JSON-encoded bytes, not a JSON object
9 🟢 Low .github/workflows/ci.yml Rollup case exits 1 for unknown platforms instead of graceful skip
10 🟢 Low .github/workflows/ci.yml 3 OS coverage artifacts uploaded with no consumer to merge them

The hardening changes in utils/jsonl_parser.py (skipping non-dict lines, stringifying type keys, guarding model/tier/stop-reason to non-empty strings) are solid improvements. Issue #2 above is the most important remaining gap — it's exactly the class of crash the fuzz suite is meant to prevent, and it's reachable with the existing strategies.

@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

CR #4 (sys.path anti-pattern): factually wrong — tests/test_jsonl_parser.py:8 does the identical sys.path.insert, so it's the existing project convention. Removing it risks import breakage.
CR #9 (Rollup exit 1 fallback): intentional fail-fast; every OS in the matrix is covered, so the branch is unreachable today and a hard failure is the correct signal if an uncovered runner is ever added.
CR #10 (three coverage artifacts): pre-existing per-OS artifact naming (coverage-report-${{ matrix.os }}), not introduced by this PR; no merge step is needed unless Codecov is added later.

@CodeRabbit

Add _safe_int() so non-numeric usage fields coerce to 0 instead of raising
TypeError during token accumulation; apply to all token fields. Drop
max_examples/deadline from the test decorator so the conftest ci/dev profile
actually governs fuzz runtime. Inline _fuzz_jsonl_path, tighten truncation
model and unknown-type assertion, simplify ALLOWED_EXCEPTIONS, and note macOS
in the CONTRIBUTING PR checklist.
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the user's claim about sys.path.insert in test_jsonl_parser.py
head -15 tests/test_jsonl_parser.py

echo ""
echo "=== Check pre-existing coverage artifact naming in ci.yml ==="
grep -n "coverage-report" .github/workflows/ci.yml

Length of output: 752



@clean6378-max-it — confirmed on all three points:

  • CR #4: Verified — tests/test_jsonl_parser.py line 8 has the exact same sys.path.insert call. It's established project convention; withdrawing that comment.
  • CR #9: Accepted — intentional fail-fast is the right signal for an unsupported runner. Withdrawing.
  • CR #10: Verified — coverage-report-${{ matrix.os }} is already at line 168 in the existing CI file; the pattern predates this PR. Withdrawing.

Apologies for the noise on those three.

🐇 (´• ω •`)


✏️ Learnings added
Learnt from: clean6378-max-it
URL: https://github.com/cppalliance/claude-code-chat-browser/pull/75

Timestamp: 2026-06-11T09:42:43.163Z
Learning: In the cppalliance/claude-code-chat-browser repository, using `sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))` at the top of test files is the established project convention (see `tests/test_jsonl_parser.py` line 8). Do not flag this as an anti-pattern in this repo.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@clean6378-max-it clean6378-max-it self-assigned this Jun 11, 2026
Comment thread utils/jsonl_parser.py Outdated
Comment thread tests/test_parser_fuzz.py Outdated
Comment thread pyproject.toml Outdated
Comment thread utils/jsonl_parser.py Outdated
Comment thread utils/jsonl_parser.py Outdated
Comment thread .github/workflows/ci.yml
Comment thread package.json
Comment thread tests/conftest.py
Guard _safe_int against NaN/Infinity (json.loads accepts these literals;
int(nan)/int(inf) raise), broaden fuzz floats to cover them, and add an
explicit non-finite regression test. Revert entry_counts to truthiness so
empty-string types stay skipped, wrap service_tier in the message payload,
and drop the no-op [tool.hypothesis] block from pyproject.toml.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utils/jsonl_parser.py`:
- Around line 50-53: The helper _safe_int currently returns negative ints or
converts floats to ints without clamping, so update _safe_int to ensure any
returned numeric value is non-negative by clamping negatives to 0 (for
isinstance(val, int) and after converting floats with math.isfinite). Locate
_safe_int in utils/jsonl_parser.py and change both the int branch and the float
branch to return max(0, int(value)) (or 0 for non-finite floats); also search
the file for the same int/float handling patterns (the similar blocks noted in
the review) and apply the same non-negative clamping to those functions/branches
so token/accounting counters can never go below zero.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a9cae128-7856-4b1b-a828-9c9efb71d22b

📥 Commits

Reviewing files that changed from the base of the PR and between 6e3ae48 and e963fcf.

📒 Files selected for processing (2)
  • tests/test_parser_fuzz.py
  • utils/jsonl_parser.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_parser_fuzz.py

Comment thread utils/jsonl_parser.py Outdated
Negative usage token values from adversarial JSONL must not reduce session
metadata totals; apply max(0, ...) in _safe_int and add regression test.
@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

@CodeRabbit. do you think my last commits since @timon0305 's comments addressed all?

Currently they are not resolved yet so I am asking you to check thosee again and if they are addressed ,I will resolve t hose off clearly.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== utils/jsonl_parser.py (relevant sections) ==="
grep -n "_safe_int\|or 0\|input_tokens\|output_tokens\|cache_read\|cache_creation" utils/jsonl_parser.py

echo ""
echo "=== tests/test_parser_fuzz.py (full) ==="
cat tests/test_parser_fuzz.py

echo ""
echo "=== tests/conftest.py ==="
cat tests/conftest.py

echo ""
echo "=== pyproject.toml ==="
cat pyproject.toml

Length of output: 16679


🏁 Script executed:

#!/bin/bash
echo "=== _safe_int implementation ==="
sed -n '40,60p' utils/jsonl_parser.py

Length of output: 901



@clean6378-max-it — here is a complete tally of every outstanding item against the current branch.


Status of open review items

# Severity Issue Status
1 🔴 High @FUZZ_SETTINGS overrides CI profile — max_examples=200 always ran in CI AddressedFUZZ_SETTINGS now only suppresses function_scoped_fixture; max_examples and deadline are no longer set in the decorator, so the conftest.py CI/dev profiles govern them correctly.
2 🔴 High usage.get(key) or 0 crashes on truthy non-int values (string, list…) Addressed_safe_int is defined (lines 44–55) and applied to all six token fields (input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens, and the two ephemeral fields). It also handles bools, non-finite floats, and negatives → 0.
3 🟡 Medium ALLOWED_EXCEPTIONS = () made the inner exception branch dead code ⚠️ Improved, still technically dead — the helper is now restructured as except ALLOWED_EXCEPTIONS: return which is a cleaner, extensible pattern. However with an empty tuple that except block is still unreachable. Not blocking, but worth a comment or removal.
5 🟡 Medium >= 1 assertion too loose in test_unknown_record_type_is_graceful Addressed — assertion is now == 1.
6 🟡 Medium CI profile deadline=None vs. decorator deadline=5000 contradictory Addresseddeadline=5000 removed from FUZZ_SETTINGS; conftest.py CI profile's deadline=None now applies.
7 🟢 Low _fuzz_jsonl_path is a trivial wrapper Addressed — removed from the file.
8 🟢 Low Truncation sliced the JSON-encoded prefix, not a JSON object Addressedtest_truncated_json_line now serialises a full object and slices the result.

One remaining minor note (issue #3)

ALLOWED_EXCEPTIONS: tuple[type[BaseException], ...] = ()

def _parse_file_without_crash(path: str) -> None:
    try:
        parse_session(path)
    except ALLOWED_EXCEPTIONS:   # ← never matches when tuple is empty
        return
    except Exception as exc:
        raise AssertionError(...)

except (): is valid Python and won't error at runtime, but the return is unreachable until something is added to the tuple. The structure is clearly forward-looking, so a short inline comment would make the intent explicit:

# Empty for now — add exception types here if graceful parser errors are introduced.
ALLOWED_EXCEPTIONS: tuple[type[BaseException], ...] = ()

That's the only remaining item. All high and medium issues from the deep review are resolved — you can safely mark those comments as resolved.

🐇 ─=≡Σ((( つ•̀ω•́)つ

@timon0305 timon0305 requested a review from wpak-ai June 11, 2026 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

claude-code-chat-browser: Add JSONL parser fuzz tests and macos-latest CI matrix

2 participants