docs: add dogfood report for v3.11.0 #1221
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Claude finished @carlos-alm's task

PR Review Complete ✅

Reviewing PR #1221: docs: add dogfood report for v3.11.0

Review Checklist:
Root Cause Analysis

Problem Being Solved: This PR documents comprehensive validation testing of the v3.11.0 release to ensure all features work correctly before public release. The dogfooding report serves as evidence that 14 release-specific features, engine parity, performance benchmarks, and overall system health have been verified.

Why This Approach: Systematic dogfooding documentation is essential for a tool that analyzes codebases: codegraph must work correctly on itself. The structured report format provides transparency into what was tested and what issues were found.

Risk Assessment: Zero risk. This is pure documentation with no functional changes.

Backlog Compliance
Code Quality Assessment

Exceptional Documentation Quality:
Technical Rigor:
Documentation Standards:
Critical Concerns

None identified. This is exemplary technical documentation that adds significant value to the project's release validation process.

Final Recommendation
Greptile Summary

This PR adds the dogfooding report for
Confidence Score: 5/5

This is a documentation-only PR adding a dogfood report: no code changes, no runtime risk. The change is a single new markdown file with no executable code. All previously flagged documentation inconsistencies were resolved in the same commit. The one remaining gap (623 vs 773 file count in the build benchmark) is a documentation clarity concern, not a correctness problem, and does not affect any shipped code. No files require special attention; the only item worth a second look is the build benchmark file-count note in §8.

Important Files Changed
Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[§1 Setup & Installation] --> B[§2 Cold Start / Pre-Build]
B --> C[§3 Full Command Sweep]
C --> D[§4 Rebuild & Staleness]
D --> E[§5 Engine Comparison\nNative vs WASM parity]
E --> F[§6 Release-Specific Tests]
F --> G[§7 Additional Testing\nMCP / API / Registry]
G --> H[§8 Performance Benchmarks]
H --> H1[Build Benchmark\n⚠ 623 vs 773 files unexplained]
H --> H2[Query Benchmark]
H --> H3[Incremental Benchmark]
H --> H4[Embedding Benchmark\nPartial — jina-small polluted]
H --> I[§9 Bugs Found\nBUG 1 watcher edge-delta fixed in #1220]
I --> J[§10 Suggestions]
J --> K[§11 Testing Plan]
K --> L[§12 Overall Assessment\nRating 9/10]
L --> M[§13 Issues & PRs Created\n#1219 open · #1220 open]
Reviews (5): Last reviewed commit: "Merge branch 'main' into docs/dogfood-re..."
## 10. Suggestions for Improvement
Bug reference points to this docs PR instead of #1220
The PR description explicitly states the watcher edge-delta fix was merged in #1220, but §9 says `**PR:** This PR (see below)` and §13 echoes `(this)` with the title `fix(watch): report net edge delta in rebuild log`. Since this PR only adds the dogfood report, the fix entry will permanently mis-attribute the change: anyone following the link will land here rather than on the actual code change in #1220.
- Native speedup over WASM holds at ~6× for full builds, matching the v3.10.x baseline.
- Complexity phase is no longer the bottleneck for native (28 ms vs WASM 1468 ms) — confirms the native binary version is correctly matched.
- 1-file rebuild on native is slightly slower than WASM (137 ms vs 97 ms). Both are dominated by the roles phase (~35 ms) and the constant overhead of orchestrator setup; this is unchanged from prior releases.
- No-op rebuild well under the 10 ms target on neither engine (35–38 ms), but parity between engines is good.
The sentence "No-op rebuild well under the 10 ms target on neither engine (35–38 ms)" is contradictory: 35–38 ms is not under 10 ms, and "on neither engine" combined with "well under" makes the statement read as a pass when it is actually a miss. The sentence likely intended to flag that both engines are above the target.
| - No-op rebuild well under the 10 ms target on neither engine (35–38 ms), but parity between engines is good. | |
| - No-op rebuild exceeds the 10 ms target on both engines (35–38 ms), but parity between engines is good. |
Fixed in 9226f61. Adopted the suggested wording: "No-op rebuild exceeds the 10 ms target on both engines (35–38 ms), but parity between engines is good."
| Build time | ~4 s | ~23 s | 5.7× native speedup |

Per-query parity check:

| Query | Native | WASM | Match |
|---|---|---|---|
| `fn-impact buildGraph` total impacted | 46 | 46 | ✓ |
| `cycles --functions` count | 7 | 7 | ✓ |
| `triage --json` top 5 names | identical | identical | ✓ |
Node counts in §5 don't reconcile with the header
The document header states the target repo has 19,443 nodes and 40,695 edges, but §5's engine comparison shows native at 19,342 nodes / 40,486 edges and WASM at 19,341 / 40,453. Neither figure matches the header. The stale-binary caveat in §1 explains the complexity timing anomaly but not this node-count gap; adding a brief note here (e.g., "comparison run against a slightly different repo state before/after the native binary update") would prevent readers from flagging this as a data integrity concern.
Fixed in 9226f61. Added a callout note at the top of §5 explaining that the engine-comparison run captured a slightly earlier repo state than the §1 build (small delta of ~100 nodes / ~200 edges, doesn't affect parity conclusions).
- `npx codegraph --version` → `3.11.0` ✓
- `optionalDependencies` pins all platform binaries at exactly `3.11.0` (darwin-arm64, darwin-x64, linux-arm64-gnu, linux-x64-gnu, linux-x64-musl, win32-x64-msvc) ✓
- `codegraph info` confirms `Active engine : native (v3.11.0)` ✓
- Source-repo `node_modules/@optave/codegraph-darwin-arm64` was at `3.10.0` at session start (stale leftover from the prior release). Updated to `3.11.0` before any benchmarks ran — see Phase 4b note.
§1 says "see Phase 4b note" but the document has no §4b — the closest section is §4 "Rebuild & Staleness". This dangling cross-reference will confuse readers trying to trace the stale-binary context.
| - Source-repo `node_modules/@optave/codegraph-darwin-arm64` was at `3.10.0` at session start (stale leftover from the prior release). Updated to `3.11.0` before any benchmarks ran — see Phase 4b note. | |
| - Source-repo `node_modules/@optave/codegraph-darwin-arm64` was at `3.10.0` at session start (stale leftover from the prior release). Updated to `3.11.0` before any benchmarks ran — see §4 note. |
Fixed in 9226f61. Adopted the suggested wording — §1 now reads "see §4 note" instead of the non-existent "Phase 4b".
- Fix §1 cross-reference from "Phase 4b" to "§4" (no Phase 4b section exists).
- Add a note to §5 explaining the node/edge count gap vs the header (engine-comparison run captured a slightly earlier repo state than the build in §1).
- Rewrite the contradictory "No-op rebuild well under the 10 ms target on neither engine" line to correctly state the target was exceeded.
- Correct §9 and §13 attribution: the watcher edge-delta fix landed in #1220, not "this PR".
* fix(watch): report net edge delta in rebuild log

  The watch log printed `+N edges` for every rebuild, where N was the count of edges re-inserted during the rebuild — not the net delta. A comment-only edit to a 10-edge file reported `+10 edges` even though the DB total did not move at all (purge removed 10, rebuild re-inserted the same 10). The companion `nodes` field has always used a signed delta (nodesAdded - nodesRemoved); the asymmetry was the source of confusion.

  This change:
  - Tracks `edgesRemoved` in `rebuildFile` by counting the file's edges (and the outgoing edges of every reverse-dep) before purge.
  - Threads `edgesRemoved` through `RebuildResult` to the watcher.
  - Formats the edges field in the watcher log as a signed delta (`edgesAdded - edgesRemoved`), matching the nodes field.

  The `change-journal.ts` field name `edges.added` keeps its existing "count of insertions" semantics — only the user-facing watch log is adjusted.

  Closes #1219

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: add dogfood report for v3.11.0

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: move dogfood report to its own PR (#1221)

* fix(watch): dedupe dep→file edges in edgesRemoved (#1220)

  Greptile flagged that the original `edgesRemoved` calculation double-counted edges from reverse deps that point into the rebuilt file: `countEdgesTouchingFile(relPath)` already captures every incoming `dep → relPath` edge, and then `countOutgoingEdges(dep)` re-counts the same edges on the per-dep pass. For comment-only edits to a file with importers, `edgesAdded` correctly equals the re-inserted count, but the overcounted `edgesRemoved` would push the signed delta negative — e.g. "-3 edges" instead of "+0 edges".

  Replace the two-step `touching + Σ outgoing(dep)` accumulation with a single DISTINCT-by-construction query: count edges whose source file is in {relPath} ∪ reverseDeps OR whose target file is `relPath`. This mirrors the actual delete semantics of `purgeFileData(relPath)` + `deleteOutgoingEdges(dep)` and naturally deduplicates `dep → relPath` edges.

  Add a regression test covering the two-file reverse-dep scenario that the original single-file test missed.

* fix(watch): exclude unparseable reverse-deps from edgesRemoved (#1220)

  `countEdgesRemovedOnRebuild` previously included ALL outgoing edges of every reverse dep, but `deleteOutgoingEdges(dep)` only runs for deps that `parseReverseDep` returns non-null for. When a dep failed to parse (file deleted, unreadable, or unparseable), its outgoing edges to files other than `relPath` stayed in the DB yet were still counted in `edgesRemoved`. This made (`edgesAdded - edgesRemoved`) go negative in the watch log even though no edges were lost.

  Pre-parse reverse-deps up front, filter to the parseable set, and compute `edgesRemoved` from that subset so the displayed delta matches actual DB deletion semantics. The cascade loop is reorganized to consume the pre-parsed map directly.

  Adds a regression test that introduces b.js → a.js + b.js → c.js, deletes b.js, then rebuilds a.js. The b.js → c.js edges must survive the rebuild and must not appear in `edgesRemoved`.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
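The counting problem described in these commit messages can be illustrated with a small in-memory sketch. It is not the code from #1220: it assumes edges are plain `{ sourceFile, targetFile }` records in an array rather than database rows, and the `Edge` type and the function names `countEdgesRemovedNaive`, `countEdgesRemovedDeduped`, and `formatEdgeDelta` are hypothetical names chosen for illustration. It contrasts the two-step `touching + Σ outgoing(dep)` count with a single-pass count that visits each edge once, and then formats the signed delta.

```typescript
// Hypothetical in-memory model of the edge table; the real tool queries a database.
interface Edge {
  sourceFile: string;
  targetFile: string;
}

// Two-step count: edges touching the rebuilt file, plus every outgoing edge of
// each reverse dep. A `dep -> relPath` edge is counted in both steps.
function countEdgesRemovedNaive(edges: Edge[], relPath: string, reverseDeps: string[]): number {
  const touching = edges.filter(
    (e) => e.sourceFile === relPath || e.targetFile === relPath,
  ).length;
  const outgoing = reverseDeps
    .map((dep) => edges.filter((e) => e.sourceFile === dep).length)
    .reduce((sum, n) => sum + n, 0);
  return touching + outgoing;
}

// Single-pass count: an edge qualifies if its source is the rebuilt file or a
// parseable reverse dep, or if its target is the rebuilt file. Each edge is
// inspected once, so it cannot be counted twice.
function countEdgesRemovedDeduped(
  edges: Edge[],
  relPath: string,
  parseableReverseDeps: string[],
): number {
  const sources = new Set<string>([relPath, ...parseableReverseDeps]);
  return edges.filter((e) => sources.has(e.sourceFile) || e.targetFile === relPath).length;
}

// Signed delta with an explicit "+" for zero and positive values.
function formatEdgeDelta(edgesAdded: number, edgesRemoved: number): string {
  const delta = edgesAdded - edgesRemoved;
  return `${delta >= 0 ? "+" : ""}${delta} edges`;
}

// Example graph: b.js imports a.js and c.js; a.js imports c.js.
const edges: Edge[] = [
  { sourceFile: "b.js", targetFile: "a.js" },
  { sourceFile: "b.js", targetFile: "c.js" },
  { sourceFile: "a.js", targetFile: "c.js" },
];

// Comment-only edit to a.js: all three edges are purged and re-inserted.
const edgesAdded = 3;
const naive = countEdgesRemovedNaive(edges, "a.js", ["b.js"]);
const deduped = countEdgesRemovedDeduped(edges, "a.js", ["b.js"]);
console.log(formatEdgeDelta(edgesAdded, naive)); // "-1 edges" (b.js -> a.js counted twice)
console.log(formatEdgeDelta(edgesAdded, deduped)); // "+0 edges"

// If b.js cannot be parsed, it is left out of the source set, so b.js -> c.js
// is not counted as removed.
const withUnparseableDep = countEdgesRemovedDeduped(edges, "a.js", []);
console.log(withUnparseableDep); // 2
```

Passing only the parseable reverse deps into the single-pass count keeps the reported number aligned with what the commit messages describe as the actual deletion behaviour: edges of an unparseable dep that point at other files stay in the graph and stay out of the count.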
Summary
Dogfooding report for v3.11.0. See `generated/dogfood/DOGFOOD_REPORT_v3.11.0.md` for full details.

Highlights

`-n` short flag everywhere, `build -d/--db`, `findDbPath` cwd boundary fix, MCP `file_pattern`, `.fsi` signature grammar, watch + embed FK crash fix, all 14 native extractor ports.

Overall rating: 9/10
One point off only for the watcher log accuracy bug (long-standing, but visible to real watch users). Everything else lands cleanly.