chore(ci): add grammar version parity check between npm devDeps and Cargo.toml#1270
Conversation
Two divergences from the full build were causing watch-mode to insert more `calls` edges than a clean full build: 1. `buildCallEdges` had no `seenCallEdges` dedup set, so the same (caller, target) pair could be inserted multiple times within a single file's call list. 2. The global name fallback in `resolveCallTargets` returned ALL nodes matching the call name across the entire DB, while the full build's `resolveByMethodOrGlobal` filters to `computeConfidence >= 0.5`. Low-confidence cross-module matches were creating spurious edges that don't appear in the full build. Closes #1259 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep origin/main's resolveByMethodOrGlobal helper in resolveCallTargets instead of the PR's inline confidence filter; the helper is equivalent but also handles the type-aware receiver branch. Fix the duplicate seenCallEdges declaration that resulted from the merge.
…remental builds
In classifyNodeRolesIncremental, two full GROUP BY edge-table scans ran on
every incremental rebuild to recompute global fan-in/fan-out medians — even
for a 1-file change. On a large graph these dominated the ~20 ms roles phase.
Fix: write {fanIn, fanOut, edgeCount} to build_meta after each full
classification and read from cache in the incremental path. The cache is
invalidated when the edge count drifts by more than max(500, 10%),
which covers large structural changes but never a 1-file edit.
In classifyNodeRolesFull, medians are now computed from the already-loaded
classifierInput rows and passed as medianOverrides to classifyRoles, avoiding
a second in-memory recomputation and priming the cache in one pass.
Expected: incremental roles phase drops from ~20 ms to ~2-3 ms on a cache
hit (the common case after any full build).
|
Claude finished @carlos-alm's task —— View job PR Review: perf(roles): scope incremental role classification with median cacheTasks
Root Cause AnalysisProblem Being Solved: Why This Approach: The median doesn't meaningfully change for a single-file edit, so caching with a staleness check ( Risk Assessment: Low. The cache degrades gracefully — a miss falls back to the original full scan and primes the cache. The invalidation threshold is conservative and uses absolute+relative guards. Phase 2: Configuration ChangesNo linting rules changed, no TypeScript strict flags disabled, no test timeouts or coverage thresholds modified. ✓ Clean. Phase 3: Test IntegrityThe test file change ( All existing assertions remain identical or stricter. No skipped tests. ✓ Clean. Phase 4: Code QualityConcerns: 1. Stale median risk on incremental-only sessions ( The cache is populated by More importantly:
This asymmetry means that on the very first incremental build after a fresh full build, the cached median may differ from what 2. Double
3. Called on every incremental build — this is O(1) but involves a full-table count on the 4. The fix in The Backlog Compliance
Critical ConcernsNone that block merging. The asymmetry between The PR conflates two concerns (bug fix in call resolver + performance optimization in role classifier) in a single commit/PR, which the CLAUDE.md explicitly flags: "One PR = one concern." However, both changes target the same code path (incremental watch rebuild) and the bug fix was likely surfaced while writing the performance test, so splitting retroactively may not be practical. Final Recommendation
|
Greptile SummaryThis PR adds a lightweight CI check (
Confidence Score: 5/5Safe to merge — the new CI check is read-only and gated correctly, and the structure.ts changes address previously flagged median-caching concerns without touching any hot paths beyond what was already reviewed. All three changed files are additive or fixing prior review feedback. The grammar version check cannot break any build artifact and only gates CI. The structure.ts changes correctly scope the staleness check to the calls+imports-type edge subset and eliminate the redundant COUNT query. The single robustness note on the TOML parser is a future-proofing concern that does not affect the current 29 passing grammars. No files require special attention, though Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Read package.json devDeps\nRead Cargo.toml] --> B[For each grammar in GRAMMAR_NPM_PACKAGES]
B --> C{npm spec present?}
C -- No --> F1[FAIL: absent from devDeps]
C -- Yes --> D{Cargo crate present?}
D -- No --> F2[FAIL: crate absent from Cargo.toml]
D -- Yes --> E{extractNpmVersion}
E -- null\ngithub:/git+ ref --> S1[SKIP: no semver]
E -- version string --> G{parseMajor both sides}
G -- parse fails --> S2[SKIP: unparseable]
G -- both parsed --> H{npmMajor === cargoMajor?}
H -- Yes --> OK[checked++]
H -- No --> I{In KNOWN_EXCEPTIONS?}
I -- Yes --> S3[SKIP: excused mismatch]
I -- No --> F3[FAIL: major-version mismatch]
F1 & F2 & F3 --> EXIT1[exit 1 + remediation message]
OK & S1 & S2 & S3 --> DONE[Grammar version parity OK\nN checked, M skipped]
Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/watch-calls..." | Re-trigger Greptile |
| const currentCount = (db.prepare('SELECT COUNT(*) AS cnt FROM edges').get() as { cnt: number }) | ||
| .cnt; | ||
| if ( | ||
| Math.abs(currentCount - cached.edgeCount) > | ||
| Math.max(MEDIAN_INVALIDATION_DELTA, cached.edgeCount * 0.1) | ||
| ) | ||
| return null; | ||
| return { fanIn: cached.fanIn, fanOut: cached.fanOut }; | ||
| } catch { | ||
| return null; | ||
| } |
There was a problem hiding this comment.
Edge-count staleness check uses total edges, medians use a subset
readCachedMedians compares SELECT COUNT(*) FROM edges (all kinds) against the cached count, but the medians were computed only from edges with kind IN ('calls', 'imports-type') (fan-in) and kind = 'calls' (fan-out). A restructuring that adds or removes many imports, reexports, or dynamic-imports edges — without touching calls — can push the total past the 500 / 10 % threshold and evict a still-valid cache entry. The inverse is also possible: a change that significantly shifts the calls distribution while keeping the total edge count stable would leave a stale cache in place. Aligning the invalidation count with SELECT COUNT(*) FROM edges WHERE kind IN ('calls', 'imports-type') would make the staleness signal track what it is actually protecting.
There was a problem hiding this comment.
Fixed — updated readCachedMedians to count only kind IN ('calls', 'imports-type') edges when checking staleness, matching the subset the medians are derived from. This ensures only relevant edge changes can evict the cache.
| function computeGlobalMediansFromEdges(db: BetterSqlite3Database): { | ||
| fanIn: number; | ||
| fanOut: number; | ||
| } { | ||
| const fanInDist = ( | ||
| db | ||
| .prepare( | ||
| `SELECT COUNT(*) AS cnt FROM edges WHERE kind IN ('calls', 'imports-type') GROUP BY target_id`, | ||
| ) | ||
| .all() as { cnt: number }[] | ||
| ) | ||
| .map((r) => r.cnt) | ||
| .sort((a, b) => a - b); | ||
| const fanOutDist = ( | ||
| db | ||
| .prepare(`SELECT COUNT(*) AS cnt FROM edges WHERE kind = 'calls' GROUP BY source_id`) | ||
| .all() as { cnt: number }[] | ||
| ) | ||
| .map((r) => r.cnt) | ||
| .sort((a, b) => a - b); | ||
| return { fanIn: median(fanInDist), fanOut: median(fanOutDist) }; |
There was a problem hiding this comment.
computeGlobalMediansFromEdges and the full-build median computation are subtly different
classifyNodeRolesFull derives medians from classifierInput, which only contains nodes whose kind is NOT file, directory, parameter, or property. computeGlobalMediansFromEdges computes from raw GROUP BY target_id without any node-kind filter, so if any such excluded-kind nodes appear as a target_id in calls/imports-type edges, the distribution will differ. On a cold cache (first incremental run, no prior full build, or after a large drift triggers invalidation) the fallback medians and the full-build medians diverge. Adding a join to nodes with the same kind filter would keep the two paths consistent.
There was a problem hiding this comment.
Fixed — computeGlobalMediansFromEdges now joins nodes and filters out the same excluded kinds (file, directory, parameter, property) that classifyNodeRolesFull excludes when computing medians from in-memory rows. Cold-cache and warm-cache paths now produce consistent distributions.
| /** Persist global role medians + current edge count to build_meta. */ | ||
| function writeMedianCache( | ||
| db: BetterSqlite3Database, | ||
| medians: { fanIn: number; fanOut: number }, | ||
| ): void { | ||
| const edgeCount = (db.prepare('SELECT COUNT(*) AS cnt FROM edges').get() as { cnt: number }).cnt; | ||
| setBuildMeta(db, { [ROLES_MEDIANS_KEY]: JSON.stringify({ ...medians, edgeCount }) }); | ||
| } |
There was a problem hiding this comment.
writeMedianCache runs a redundant SELECT COUNT(*) FROM edges on cache miss
On an incremental cache miss the call sequence is: readCachedMedians (runs SELECT COUNT(*) FROM edges), then computeGlobalMediansFromEdges (two GROUP BY queries), then writeMedianCache (runs SELECT COUNT(*) FROM edges again). The same COUNT is executed twice. Passing the count obtained in readCachedMedians through to the write path — or accepting it as an optional argument to writeMedianCache — would eliminate the duplicate query.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Fixed — writeMedianCache now accepts an optional edgeCount parameter. Both call sites pass the already-computed count: the incremental path passes the count from computeGlobalMediansFromEdges (which now returns it alongside the medians), and the full-build path derives it from in-memory rows (sum of fan_in values equals total calls+imports-type edges). The fallback COUNT query in writeMedianCache only fires when neither caller provides the count.
Codegraph Impact Analysis8 functions changed → 4 callers affected across 2 files
|
…argo.toml Adds scripts/check-grammar-versions.mjs, which compares the major version of every tree-sitter grammar in package.json devDependencies against its paired Cargo crate in crates/codegraph-core/Cargo.toml. A major-version divergence means the two engines ship grammars at incompatible API levels; minor drifts are allowed and already caught by the Rust `all_grammars_have_compatible_abi` test at build time. The script handles npm-to-Cargo name differences (kotlin → kotlin-sg, etc.), skips unversioned github:/tarball refs, and has a KNOWN_EXCEPTIONS table for tree-sitter-dart whose v1.x npm release has no matching crates.io publish yet. Wires the check into CI as a lightweight grammar-version-parity job (no npm install required) and adds it to the ci-pipeline gate. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three issues with the median-cache helpers introduced in the prior commit: 1. readCachedMedians used SELECT COUNT(*) FROM edges (all kinds) to check staleness, but medians are derived from only calls+imports-type edges. Changes to imports/reexports/dynamic-imports edges could falsely evict a valid cache (or fail to evict a stale one). Now counts only the same subset used to compute the medians. 2. computeGlobalMediansFromEdges did not join nodes to filter node kinds, so on a cold-cache fallback it could include file/directory/parameter/ property nodes as edge targets — producing a different distribution than classifyNodeRolesFull which excludes those kinds. Added the same node-kind filter as the full-build path for consistency. 3. writeMedianCache ran a redundant SELECT COUNT(*) FROM edges on cache-miss paths (readCachedMedians had already counted, or the count was just computed by computeGlobalMediansFromEdges). Now accepts an optional edgeCount argument; both call sites pass the already-computed count to eliminate the duplicate query. The full-build path derives it from the in-memory rows (sum of fan_in values).
|
Addressed review feedback from Claude's review:
|
|
Claude finished @carlos-alm's task —— View job Re-review: verifying addressed concernsTasks
All three concerns from the previous review are correctly addressed. Here's what was verified: Concern #1 — median asymmetry on cold cache ( Concern #2 — double COUNT in Concern #3 — staleness scope ( Greptile concern — One minor note: The Rating: ⭐⭐⭐⭐⭐ (5/5) — LGTM All three previously flagged concerns are resolved correctly, with no regressions introduced. The implementation is clean, the fixes are minimal and targeted, and the code comments accurately document the invariants. |
Summary
scripts/check-grammar-versions.mjsthat compares the major version of every tree-sitter grammar inpackage.jsondevDependencies against its paired Cargo crate incrates/codegraph-core/Cargo.tomlgrammar-version-parityCI job (checkout + Node only, nonpm install) into the pipelineDesign notes
all_grammars_have_compatible_abitest at build timegithub:,git+) are skipped with a notice (clojure, erlang, gleam)KNOWN_EXCEPTIONStable handlestree-sitter-dart: the npm package is at v1.0.0 buttree-sitter-dartv1.x hasn't been published to crates.io yet (latest: 0.2.0). Remove the exception once the crate is updatedCurrent output
Closes #1269
Test plan
node scripts/check-grammar-versions.mjspasses locallygrammar-version-parityCI job passes in the PR checksci-pipelinegate