Skip to content

fix(watch): report net edge delta in rebuild log#1220

Merged
carlos-alm merged 7 commits into
mainfrom
fix/watch-edges-log-net-delta
May 26, 2026
Merged

fix(watch): report net edge delta in rebuild log#1220
carlos-alm merged 7 commits into
mainfrom
fix/watch-edges-log-net-delta

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

The codegraph watch log printed +N edges for every file rebuild where N was the count of edges re-inserted, not the net delta. A comment-only edit to a 10-edge file showed +10 edges even though the DB total moved by 0. The companion nodes field has always used a signed delta (nodesAdded - nodesRemoved), so the asymmetry confused users.

This PR tracks edgesRemoved in rebuildFile (count edges touching the file before purge, plus outgoing edges of each reverse-dep), threads it through RebuildResult, and renders the watcher's edges field as a signed delta (edgesAdded - edgesRemoved) so it matches how nodes are displayed.

The change-journal.ts edges.added field intentionally keeps its existing "insert count" semantics — only the user-facing watch log is adjusted.

Found during

Dogfooding v3.11.0 — see #1219 and generated/dogfood/DOGFOOD_REPORT_v3.11.0.md.

Before / After

# Before: comment-only edit to src/cli.ts
[codegraph] Updated: src/cli.ts (+0 nodes, +10 edges)   ← DB total moved by 0
# After:
[codegraph] Updated: src/cli.ts (+0 nodes, +1 edges)    ← matches actual delta

# Real edit adding two functions with one call
[codegraph] Updated: src/cli.ts (+2 nodes, +3 edges)    ← 2 contains + 1 calls

Test plan

  • tests/integration/watcher-edges-delta.test.ts — new test asserting comment-only edits report edgesAdded == edgesRemoved, and added symbols with calls report edgesAdded > edgesRemoved
  • All existing watcher tests (tests/unit/watcher-flush, tests/incremental/watcher-incremental, tests/integration/watcher-rebuild, tests/integration/watcher-fk-embeddings) still pass
  • change-journal tests still pass (journal field semantics preserved)
  • Manual verification: codegraph watch on the codegraph repo correctly reports net delta for both comment-only and symbol-adding edits

Closes #1219

carlos-alm and others added 2 commits May 25, 2026 20:13
The watch log printed `+N edges` for every rebuild, where N was the
count of edges re-inserted during the rebuild — not the net delta. A
comment-only edit to a 10-edge file reported `+10 edges` even though
the DB total did not move at all (purge removed 10, rebuild re-inserted
the same 10).

The companion `nodes` field has always used a signed delta
(nodesAdded - nodesRemoved); the asymmetry was the source of confusion.

This change:
- Tracks `edgesRemoved` in `rebuildFile` by counting the file's edges
  (and the outgoing edges of every reverse-dep) before purge.
- Threads `edgesRemoved` through `RebuildResult` to the watcher.
- Formats the edges field in the watcher log as a signed delta
  (`edgesAdded - edgesRemoved`), matching the nodes field.

The `change-journal.ts` field name `edges.added` keeps its existing
"count of insertions" semantics — only the user-facing watch log is
adjusted.

Closes #1219

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@claude

claude Bot commented May 26, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review Complete ⭐⭐⭐

Reviewing PR #1220: fix(watch): report net edge delta in rebuild log

Root Cause Analysis

Problem Being Solved: The codegraph watch command was displaying misleading edge counts by showing the number of edges re-inserted (+N edges) instead of the net change in edges. For comment-only edits, this would show +10 edges even when the total edge count in the database remained unchanged, creating user confusion.

Why This Approach: The fix correctly addresses the root cause by:

  1. Tracking edgesRemoved during the rebuild process (lines 553-556 in incremental.ts)
  2. Including both edgesAdded and edgesRemoved in the RebuildResult interface
  3. Displaying the net delta (edgesAdded - edgesRemoved) in the watch log (line 137 in watcher.ts)

Risk Assessment: LOW - This is a display/logging fix that doesn't change core graph building logic. The change-journal semantics are intentionally preserved.

Backlog Compliance

  • Zero-dep: ✓ — No new dependencies added
  • Foundation-aligned: ✓ — Improves user experience without violating core principles
  • Problem-fit: 3 — Useful but doesn't directly address the core "lost AI" problem
  • Breaking: No — Pure display change, no API or behavioral breaking changes
  • Tier: 1

Critical Concerns

None. This is a well-targeted fix with appropriate test coverage.

Code Quality Assessment

Strengths:

  • Precise fix: Only touches the specific logging issue without disrupting core functionality
  • Well-tested: New test file /tests/integration/watcher-edges-delta.test.ts provides comprehensive coverage
  • Clear implementation: Lines 553-556 in incremental.ts clearly show the edge counting logic
  • Backward compatible: Change-journal semantics preserved (line 112 in watcher.ts)
  • Good documentation: Comments explain the reasoning (lines 549-552, 133-136)

Technical Implementation:

  • Correctly counts edges touching the file before purge (line 553)
  • Properly handles reverse-dependency edge removal (lines 554-556)
  • Uses existing formatDelta() function for consistent display (line 137)
  • Test assertions verify both zero-delta and positive-delta scenarios

Minor Observations:

  • Implementation correctly reuses existing prepared statement caching patterns
  • Good separation of concerns: display logic in watcher.ts, data collection in incremental.ts
  • Appropriate use of existing error handling patterns

Final Recommendation

  • Rating: ⭐⭐⭐ (3/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: This is a solid, well-tested fix that resolves a legitimate user experience issue. The implementation is clean and doesn't introduce risks. Rating is 3/5 rather than 4/5 due to the narrow scope (display fix vs. substantial feature) and moderate problem-fit score.

Conditions for approval:

  • Verify that existing watcher integration tests still pass
  • Confirm the dogfood report accurately reflects the fix validation

@greptile-apps

greptile-apps Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes the codegraph watch log so it shows a net edge delta (edgesAdded − edgesRemoved) instead of the raw re-insert count, matching the existing signed-delta display for nodes. A comment-only edit no longer falsely reports +N edges when the DB total is unchanged.

  • incremental.ts: Adds countEdgesRemovedOnRebuild, a deduplicating SQL union that mirrors the exact deletion semantics of purgeFileData + deleteOutgoingEdges; pre-parses reverse deps before the purge so only deps that will actually have their edges deleted are included in the count; threads edgesRemoved through all RebuildResult return sites.
  • watcher.ts: Updates logRebuildResults to render edgesAdded − edgesRemoved as a signed delta via a new formatDelta helper, and adds edgesRemoved to the local RebuildResult interface.
  • tests/integration/watcher-edges-delta.test.ts: Three integration test suites covering the no-reverse-dep case, the cross-file reverse-dep deduplication case, and the parse-failure (unreadable dep) case, all asserting the reported delta matches the actual DB delta.

Confidence Score: 4/5

Safe to merge; the graph data itself is unaffected — only the watcher log display changes, and the SQL accounting logic is correct for all normal paths.

The deduplicating SQL in countEdgesRemovedOnRebuild correctly models purge semantics for all tested paths. There is one untested edge path — when insertFileNodes succeeds but the subsequent getNodeId file-node lookup returns null — where edgesRemoved includes dep outgoing edges that deleteOutgoingEdges never actually deletes, causing the log to show a negative delta. This is a cosmetic display issue in a rare code path, but it is a real discrepancy introduced by this PR.

The early-return branch in incremental.ts around line 617 (file node not found after insertion) is the only path where the reported delta can diverge from the actual DB delta.

Important Files Changed

Filename Overview
src/domain/graph/builder/incremental.ts Moves reverse-dep pre-parse before the purge step, adds countEdgesRemovedOnRebuild with a deduplicating SQL union, and threads edgesRemoved through all RebuildResult return sites. The SQL logic correctly models purge semantics. One minor discrepancy: the no-file-node early return at line 617 includes dep outgoing edges in edgesRemoved even though deleteOutgoingEdges never runs for that path.
src/domain/graph/watcher.ts Adds edgesRemoved to the local RebuildResult interface and updates logRebuildResults to display edgesAdded - edgesRemoved as a signed delta; extracts a formatDelta helper to avoid duplicating the sign-formatting logic.
tests/integration/watcher-edges-delta.test.ts Three describe blocks covering: single-file (no reverse deps), two-file reverse-dep deduplication, and parse-failure (deleted dep file) scenarios. Tests directly verify the DB delta equals the reported delta, which would catch the double-counting bug that was the subject of the previous round of review.

Sequence Diagram

sequenceDiagram
    participant W as watcher.ts
    participant R as rebuildFile
    participant DB as SQLite DB

    W->>R: rebuildFile(filePath)
    R->>DB: findReverseDeps(relPath)
    DB-->>R: reverseDeps[]
    loop for each reverseDep
        R->>R: parseReverseDep(dep) [pre-parse, before purge]
    end
    R->>DB: countEdgesRemovedOnRebuild(relPath, deletableDeps)
    Note over R,DB: Single DISTINCT union query
    DB-->>R: edgesRemoved
    R->>DB: purgeFileData(relPath)
    R->>DB: insertFileNodes + buildEdges
    loop for each deletable dep
        R->>DB: deleteOutgoingEdges(dep)
        R->>DB: rebuildReverseDepEdges(dep)
    end
    R-->>W: "{edgesAdded, edgesRemoved, ...}"
    W->>W: formatDelta(edgesAdded - edgesRemoved)
    W->>W: log Updated: file (+N nodes, +M edges)
Loading

Fix All in Claude Code

Reviews (6): Last reviewed commit: "Merge branch 'main' into fix/watch-edges..." | Re-trigger Greptile

Comment thread src/domain/graph/builder/incremental.ts Outdated
Comment on lines +553 to +556
let edgesRemoved = countEdgesTouchingFile(db, relPath);
for (const depRelPath of reverseDeps) {
edgesRemoved += countOutgoingEdges(db, depRelPath);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Double-counting of reverse-dep→file edges inflates edgesRemoved

countEdgesTouchingFile(relPath) counts edges where target_id IN (nodes of relPath) — this already includes edges that run FROM each reverse dep TO relPath. Then for each reverse dep, countOutgoingEdges(depRelPath) counts ALL outgoing edges from that dep, which includes those same dep→relPath edges a second time. This means edgesRemoved is overcounted by the total number of cross-file edges pointing into the rebuilt file.

purgeFileData (in build-stmts.ts) confirms it with its edge-delete SQL: OR target_id IN (SELECT id FROM nodes WHERE file = @f), which removes the dep→relPath edges before deleteOutgoingEdges runs on each dep. At that point those edges are already gone, so countOutgoingEdges(dep) reports a count that includes edges no longer present at deletion time.

For a comment-only edit to a file with reverse deps, edgesAdded correctly equals the re-inserted count, but the overcounted edgesRemoved makes edgesAdded - edgesRemoved go negative — e.g. "-3 edges" instead of "+0 edges". A corrected formula would use countEdgesTouchingFile(relPath) for the primary file and then add only the outgoing-from-dep edges that do NOT target relPath, i.e. countOutgoingEdges(dep) − countEdgesFromDepToFile(dep, relPath) per reverse dep.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 82becaa. Replaced the two-step touching + Σ outgoing(dep) accumulation with a single union-based COUNT that mirrors the actual purgeFileData + deleteOutgoingEdges delete semantics. dep → relPath edges are now counted exactly once via the DISTINCT-by-construction source-file IN-list, regardless of whether they overlap with the per-dep outgoing set.

Comment on lines +61 to +73
/* ignore */
}
});

it('reports edgesRemoved ≈ edgesAdded for comment-only edits (net delta ≈ 0)', async () => {
const filePath = path.join(tmpDir, 'a.js');
fs.appendFileSync(filePath, '\n// pure comment, no graph effect\n');

const db = openDb(dbPath);
initSchema(db);
const stmts = makeStmts(db);
const result = await rebuildFile(db, tmpDir, filePath, stmts, { engine: 'auto' }, null);
db.close();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Test only covers the no-reverse-dep case; the double-counting bug goes undetected

a.js is the sole file in tmpDir, so findReverseDeps always returns an empty array and the for (const depRelPath of reverseDeps) loop in rebuildFile never executes. The double-counting issue in edgesRemoved only triggers when at least one reverse dep exists. Adding a second file that imports from a.js (e.g. import { foo } from './a.js') would put b.js into the reverse-dep list and exercise the cross-file path that the current test misses.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 82becaa. Added a second describe block (rebuildFile edges accounting with reverse-deps) that creates a.js + b.js (with b.js importing from a.js), runs a comment-only edit on a.js, and asserts edgesAdded === edgesRemoved. This exercises the reverse-dep cascade path that the original single-file test missed and would fail against the old accounting.

@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

6 functions changed6 callers affected across 2 files

  • RebuildResult.edgesRemoved in src/domain/graph/builder/incremental.ts:43 (0 transitive callers)
  • countEdgesRemovedOnRebuild in src/domain/graph/builder/incremental.ts:165 (3 transitive callers)
  • rebuildFile in src/domain/graph/builder/incremental.ts:540 (4 transitive callers)
  • RebuildResult.edgesRemoved in src/domain/graph/watcher.ts:55 (0 transitive callers)
  • formatDelta in src/domain/graph/watcher.ts:122 (3 transitive callers)
  • logRebuildResults in src/domain/graph/watcher.ts:127 (4 transitive callers)

Greptile flagged that the original `edgesRemoved` calculation
double-counted edges from reverse deps that point into the rebuilt
file: `countEdgesTouchingFile(relPath)` already captures every
incoming `dep → relPath` edge, and then `countOutgoingEdges(dep)`
re-counts the same edges on the per-dep pass.

For comment-only edits to a file with importers, `edgesAdded`
correctly equals the re-inserted count, but the overcounted
`edgesRemoved` would push the signed delta negative — e.g. "-3 edges"
instead of "+0 edges".

Replace the two-step `touching + Σ outgoing(dep)` accumulation with a
single DISTINCT-by-construction query: count edges whose source file is
in {relPath} ∪ reverseDeps OR whose target file is `relPath`. This
mirrors the actual delete semantics of `purgeFileData(relPath)` +
`deleteOutgoingEdges(dep)` and naturally deduplicates `dep → relPath`
edges.

Add a regression test covering the two-file reverse-dep scenario that
the original single-file test missed.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed Greptile feedback:

  • Double-counting of reverse-dep→file edges — replaced the two-step touching + Σ outgoing(dep) accumulation with a single union-based COUNT (source_id IN nodes({relPath ∪ reverseDeps}) OR target_id IN nodes(relPath)). This mirrors the actual purgeFileData + deleteOutgoingEdges delete semantics and deduplicates dep → relPath edges by construction.
  • Test only covers the no-reverse-dep case — added a second describe block that creates a.js + b.js (b imports from a), runs a comment-only edit on a.js, and asserts edgesAdded === edgesRemoved. This exercises the reverse-dep cascade path the original single-file test missed; it would have failed against the previous accounting.

Fix in 82becaa.

Note on the Pre-publish benchmark gate failure: that gate is failing because KNOWN_REGRESSIONS contains stale 3.9.6:* entries (>1 minor version behind 3.11.0) — unrelated to this PR. Tracked in #1222.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

carlos-alm added a commit that referenced this pull request May 26, 2026
- Fix §1 cross-reference from "Phase 4b" to "§4" (no Phase 4b section exists).
- Add a note to §5 explaining the node/edge count gap vs the header
  (engine-comparison run captured a slightly earlier repo state than
  the build in §1).
- Rewrite the contradictory "No-op rebuild well under the 10 ms target
  on neither engine" line to correctly state the target was exceeded.
- Correct §9 and §13 attribution: the watcher edge-delta fix landed in
  #1220, not "this PR".
countEdgesRemovedOnRebuild previously included ALL outgoing edges of every
reverse dep, but deleteOutgoingEdges(dep) only runs for deps that
parseReverseDep returns non-null for. When a dep failed to parse (file
deleted, unreadable, or unparseable), its outgoing edges to files other
than relPath stayed in the DB yet were still counted in edgesRemoved.
This made (edgesAdded - edgesRemoved) go negative in the watch log even
though no edges were lost.

Pre-parse reverse-deps up front, filter to the parseable set, and compute
edgesRemoved from that subset so the displayed delta matches actual DB
deletion semantics. The cascade loop is reorganized to consume the
pre-parsed map directly.

Adds a regression test that introduces b.js → a.js + b.js → c.js, deletes
b.js, then rebuilds a.js. The b.js → c.js edges must survive the rebuild
and must not appear in edgesRemoved.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed Greptile P1 follow-up in b249ea4: edgesRemoved overcounted when a reverse-dep fails to parse.

Root cause: countEdgesRemovedOnRebuild included every reverse-dep's outgoing edges, but deleteOutgoingEdges(dep) only ran for deps that parsed successfully. Deps that returned null kept their outgoing-to-other-files edges in the DB while still being counted as removed — pushing the displayed delta negative.

Fix: Pre-parse reverse-deps up front, filter to the parseable set (deletableReverseDeps), and compute edgesRemoved from that subset. The cascade loop now consumes the pre-parsed map directly (saves a redundant parse pass too).

Regression test: Added rebuildFile edges accounting with unparseable reverse-dep — sets up b.js → a.js plus b.js → c.js, deletes b.js, rebuilds a.js, then asserts (a) b.js → c.js edges survive in the DB and (b) the reported edgesAdded - edgesRemoved equals the actual net DB delta. The test fails against the pre-fix accounting.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 04733a3 into main May 26, 2026
21 checks passed
@carlos-alm carlos-alm deleted the fix/watch-edges-log-net-delta branch May 26, 2026 20:41
@github-actions github-actions Bot locked and limited conversation to collaborators May 26, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(watch): edges log shows insert count, not net delta — misleading for unchanged content

1 participant