Skip to content

refactor(parity): render orchestrator-drop summary as a per-extension table#1240

Merged
carlos-alm merged 30 commits into
mainfrom
refactor/orchestrator-drop-summary-tabular
May 28, 2026
Merged

refactor(parity): render orchestrator-drop summary as a per-extension table#1240
carlos-alm merged 30 commits into
mainfrom
refactor/orchestrator-drop-summary-tabular

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

  • Replaces the single wall-of-text WARN line (semicolon-joined ext (N: a, b, c) slices) with a multi-line tabular breakdown — each extension gets its own indented row with a right-aligned count column and up to 3 sample filenames
  • Header line now reports both the file count and the extension count (across N extension(s)) so the scale is visible before reading the table
  • Three call sites updated in the native orchestrator: stale-WASM-file purge, unsupported-by-native backfill, and native-extractor-failure warn

Before:

[codegraph WARN] Native orchestrator dropped 45 file(s) in natively-supported languages — likely a Rust extractor bug. Backfilling via WASM: .ts (3: a.ts, b.ts, c.ts, +28 more); .py (8: a.py, b.py, c.py, +5 more); +31 more extension(s)

After:

[codegraph WARN] Native orchestrator dropped 45 file(s) across 3 extension(s) in natively-supported languages — likely a Rust extractor bug. Backfilling via WASM:
  .ts    31  a.ts, b.ts, c.ts (+28 more)
  .py     8  a.py, b.py, c.py (+5 more)
  .go     6  a.go, b.go, c.go (+3 more)

Test plan

  • npx vitest run tests/parsers/native-drop-classification.test.ts — 17 tests covering empty input, sample capping, column alignment (right-pad ext, right-align count), sort order, extension cap, +N more suffix
  • Trigger a journal-vs-fresh-build collision locally and confirm the WARN output is tabular

carlos-alm and others added 26 commits May 25, 2026 22:59
… table

The native-orchestrator drop warning lived in a single wall-of-text WARN
line that grew unreadable when 30+ extensions were dropped at once
(easy to trigger via journal-vs-fresh-build collisions). Make the
per-extension breakdown scan like a table: header line keeps the count
and now also reports the extension total; each extension occupies its
own indented row with a right-aligned count column.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… collection

Adds shared utilities to src/extractors/helpers.ts in preparation for
adoption across language extractors (phase 2):

- nodeStartLine: companion to nodeEndLine for the ~108 hand-rolled
  startPosition.row + 1 literals scattered across extractors
- findFirstChildOfTypes: find first child matching any of N types
  (useful for grammar variants like string vs string_literal)
- iterChildren / PUNCTUATION_TOKENS: generator-based child iteration
  with punctuation skipping, used in elixir/gleam destructuring walks
- pushCall / pushImport: centralise Call/Import construction so line
  derivation stays consistent across extractors
- extractSimpleParameters / resolveParamName: uniform parameter
  extraction with optional type-map sink — collapses boilerplate in
  the ~16 per-language extractParams helpers

Phase 1 of the TS extractor refactor plan (sync.json clusters 1).
Additive only — no consumer adoption yet; existing helpers and
extractor behaviour unchanged. Consumers updated in phase 2.

docs check acknowledged: internal refactor, no doc updates needed.
Phase 2 of the TS extractor refactor plan (sync.json cluster 1). Adopts
the helpers extended in 9c8be55 (nodeStartLine, findFirstChildOfTypes,
pushCall, pushImport, extractSimpleParameters, stripQuotes) across six
language extractors:

- r.ts: drop local stripQuotes; use shared stripQuotes/pushCall/
  pushImport/findFirstChildOfTypes/nodeStartLine
- gleam.ts: use pushCall/pushImport/findFirstChildOfTypes/nodeStartLine;
  extract pushConstructor helper for the dual-branch data-constructor
  walk
- julia.ts: use pushCall/pushImport/nodeStartLine; collapse Julia param
  wrapper-type branches via JULIA_PARAM_WRAPPER_TYPES set
- java.ts: use pushCall/pushImport/nodeStartLine; collapse
  extractJavaParameters via extractSimpleParameters with typeMap sink;
  extract resolveJavaTypeText for the generic_type unwrap pattern
- gleam.ts and solidity.ts: extract qualifyWithParent helper in solidity
  to collapse 6 duplicated `parent ? \`\${parent}.\${name}\` : name`
  blocks
- solidity.ts: use pushCall/pushImport/findFirstChildOfTypes/
  nodeStartLine; collapse extractSolParams via extractSimpleParameters
- javascript.ts: bulk-replace 43 inline `XXX.startPosition.row + 1`
  literals with nodeStartLine() calls; replace one stray endPosition
  literal with nodeEndLine

Net -65 lines. No behaviour changes — only call-site collapsing onto the
shared helpers (semantics verified by careful inspection of each
replacement; pushImport's empty-names fallback matches the previous
ad-hoc defaults in each extractor).

docs check acknowledged: internal refactor, no doc updates needed.
Convert collectElixirParamIdentifiers from mutual-recursion with
collectElixirMapBindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers that push child nodes onto the worklist instead of calling
back into the main function. This removes the function-level cycle
flagged by codegraph (9 -> 8 cycles) without changing extractor
semantics. docs check acknowledged: internal refactor only.
Phase 5 of the Rust extractor refactor plan (sync.json cluster 2). Adopts
the helpers extended in 0d687c4 (push_call, push_simple_call, push_import,
push_type_map_entry, extract_simple_parameters, match_c_family_type_map)
across eight language extractors:

- cpp.rs: collapse match_cpp_type_map to a one-line delegate of
  match_c_family_type_map; use push_import/push_simple_call/push_call
  for include and call sites
- cuda.rs: same delegation as cpp.rs; use push_import/push_simple_call/
  push_call across include and call_expression handlers
- java.rs: use push_type_map_entry for local-variable / formal-parameter
  bindings; use push_call/push_simple_call for method invocation and
  object creation; collapse extract_java_parameters to a one-shot
  extract_simple_parameters call; use push_import for import declaration
- javascript.rs: use push_simple_call for new_expression identifier
  branch; use push_type_map_entry for the confidence-0.9 type entries
- julia.rs: use push_simple_call/push_call across identifier and
  field_expression / scoped_identifier call branches
- objc.rs: use push_import for at_import; use push_call for c-call and
  message-expression handlers (drops redundant is_empty guards)
- r_lang.rs: use push_simple_call/push_call across identifier and
  namespace_operator call branches; use push_import for library/source
- solidity.rs: use push_call (drops redundant guard) for call sites;
  collapse extract_sol_params to a one-shot extract_simple_parameters

Net: -207 lines across 8 files, no behavior change. cargo check clean,
324 rust unit tests pass.

Pre-existing test failure: tests/engines/parity.test.ts has two failing
elixir cases unrelated to this commit (filed as #1227 — regression from
commit 5abe6ad in Phase 3).
Convert collect_elixir_param_identifiers from mutual-recursion with
collect_elixir_map_bindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers (push_elixir_sequence_items, push_elixir_map_values,
push_elixir_binary_operator_operands) that push child nodes onto the
worklist instead of calling back into the main function. This removes
the function-level cycle flagged by codegraph (8 -> 7 cycles) and
mirrors the TS refactor in 5abe6ad without changing extractor
semantics. docs check acknowledged: internal refactor only.
…lection strategy

Extract the native-orchestrator path out of pipeline.ts into two new stage
modules:

  - stages/native-orchestrator.ts — tryNativeOrchestrator + post-native
    structure/analysis fallback + dropped-language detection/backfill.
  - stages/native-db-lifecycle.ts — shared rusqlite connection helpers
    (closeNativeDb, reopenNativeDb, suspendNativeDb, refreshJsDb).

This breaks the function-level cycle 'buildGraph <-> tryNativeOrchestrator'
caused by codegraph's name-based resolver conflating the local buildGraph
function with the ctx.nativeDb.buildGraph() method call. Once the
orchestrator lives in its own file, there is no longer a local buildGraph
in scope to collide with the method invocation.

Function-level cycles: 9 -> 5. No file-level cycle introduced (still 1,
unchanged — pre-existing MCP cycle). pipeline.ts shrinks from 1404 to 465
lines and now reads as a thin top-level controller: detect changes, try
native, fall back to JS stages.

computeWasmOnlyStaleFiles is re-exported from pipeline.ts so existing
unit tests (tests/builder/wasm-only-stale-files.test.ts) keep working
without changes.
docs check acknowledged — no doc-relevant changes (internal helper extraction).
docs check acknowledged - Rust internal helper extraction, no user-facing changes
…impact and dependencies

Split high-cognitive-complexity functions in the analysis domain into focused
helpers. Worst functions per gauntlet (cog/cyc/maxNesting/halstead) are now
below thresholds.

module-map.ts (statsData cog=31 -> below threshold):
- Extract buildStatsFromNative and buildStatsFromJs branches
- Share false-positive query and quality-score helpers between paths
- aggregateRolesFromNative pulls duplicated role-aggregation code out

fn-impact.ts (bfsTransitiveCallers cog=37 -> below threshold,
              impactAnalysisData cog=27 -> below threshold):
- Extract recordCaller, processFrontierNode, seedInterfaceImplementors
- Extract bfsImportDependents and groupDependentsByLevel

dependencies.ts (bfsShortestPath cog=29, bfsFilePath cog=30,
                 buildTransitiveCallers cog=24 -> all below threshold):
- Extract buildNextCallerFrontier from buildTransitiveCallers
- Extract buildNeighborStmt + visitNeighbor; state collected in struct
- Extract visitFileNeighbor + reconstructFilePath

docs check acknowledged - internal helper extraction, no user-facing changes
…, structure-query, and owners

Internal refactor — no public API or behaviour change, so docs check acknowledged.

- complexity.ts: split collectNativeBulkRows (cog=70) into classify/build/collect-file helpers;
  extract classifyHalsteadToken + summarizeHalsteadCounts from computeHalsteadMetrics.
- structure.ts: merge classifyNodeRolesFull/Incremental DRY via shared buildActiveFilesSet
  + buildClassifierInput helpers.
- graph-enrichment.ts: decompose prepareFileLevelData (cog=32, cyc=26) into loadFileLevelEdges,
  computeFileFanCounts, detectFileCommunities, buildFileVisNode, selectFileSeedNodes.
- structure-query.ts: split hotspotsData (cog=34, sloc=102) using a strategy pattern
  (HOTSPOT_ORDER_BY) and mapNative/JsHotspotRow helpers.
- owners.ts: split ownersData (sloc=158, bugs=1.55) into loadFilteredFiles, buildOwnerIndex,
  loadSymbolsForFiles, computeOwnerBoundaries, buildOwnersSummary.
Internal refactor — public APIs unchanged. docs check acknowledged.
…ir pushElixirSequenceItems

Replaces the inline childCount loop with the shared iterChildren generator
configured with PUNCTUATION_TOKENS, completing phase 1 of the TS extractor
refactor plan (sync.json cluster 1). Behaviour preserved — same nodes are
pushed onto the worklist, just via the shared helper.

docs check acknowledged: internal refactor, no doc updates needed.
Wire forge phase 4 helpers into their consumers:

- find_first_child_of_types: collapse find_child(x, A).or_else(|| find_child(x, B))
  in fsharp.rs handle_application
- iter_children + PUNCTUATION_TOKENS: replace inline punctuation-skip loop in
  javascript.rs first_arg_is_string_literal

Closes 3 dead-ffi helpers extracted by forge phase 4. Semantically identical.
@greptile-apps

greptile-apps Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR replaces the single-line semicolon-joined ext (N: a, b, c) drop-summary format with a multi-line per-extension table that right-aligns counts and right-pads extension names, and adds the extension count to each header line. The bulk of the diff is a large-scale function-extraction refactor across the builder pipeline (helpers, incremental, build-edges, build-structure, detect-changes, finalize, insert-nodes, cycles, journal) with no behavior changes intended.

  • Core feature: formatDropExtensionSummary in parser.ts rewritten to return \n-prefixed tabular rows; three call sites in the new native-orchestrator.ts updated to embed extension counts in their header messages.
  • New modules: native-orchestrator.ts and native-db-lifecycle.ts extracted from pipeline.ts to break a false-positive name-collision cycle in the graph resolver; pipeline.ts re-exports the public symbols for backward compatibility.
  • Refactoring: Monolithic functions across journal.ts, cycles.ts, helpers.ts, incremental.ts, and the five builder stages split into focused helpers — each extracted unit is a pure structural change with no logic alterations.

Confidence Score: 5/5

Safe to merge. The feature change is a pure log-output improvement with no DB reads/writes, and every refactored function was extracted unchanged from its original context.

The formatDropExtensionSummary change is isolated to log rendering with 17 unit tests covering all edge cases. The remaining ~1000 lines of diff are mechanical extractions: each helper function is identical to what was in the monolith, call sites are updated consistently, and the backward-compat re-exports in pipeline.ts preserve external contracts.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/parser.ts Core feature change: formatDropExtensionSummary replaced with tabular multi-line renderer. New empty-map guard prevents Math.max(...[]) producing -Infinity for padEnd/padStart. Logic is correct and well-tested.
src/domain/graph/builder/stages/native-orchestrator.ts Extracted from pipeline.ts; three call sites updated to use the new tabular formatDropExtensionSummary. Logic is identical to the removed pipeline.ts code.
src/domain/graph/journal.ts Extracted tryFreshAcquire, isLockStale, parseJournalHeader, parseJournalBody. parseJournalBody correctly starts iteration at i=1 to skip the header line. Behavior is preserved.
src/domain/graph/builder/pipeline.ts ~900 lines removed: native orchestrator logic moved to native-orchestrator.ts, lifecycle helpers to native-db-lifecycle.ts. Re-exports public symbols for backward compatibility.
tests/parsers/native-drop-classification.test.ts Tests updated for tabular format; new alignment test added for right-pad/right-align columns.

Reviews (5): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile

Comment on lines +112 to +113
['.kt', ['a.kt']], // 100 files later — wider count column
['.tsx', new Array(100).fill('x.tsx')],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The inline comment on the .kt entry reads "100 files later — wider count column", but .kt has 1 file; it's the .tsx entry below it (100 files) that drives the wider count column. As written the comment reads as if .kt itself will accumulate 100 files, which is the opposite of what the test exercises. The comment should be on the .tsx line or reworded to make the causal direction clear.

Suggested change
['.kt', ['a.kt']], // 100 files later — wider count column
['.tsx', new Array(100).fill('x.tsx')],
['.kt', ['a.kt']],
['.tsx', new Array(100).fill('x.tsx')], // 100 files — sets wider count column

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved the comment from the .kt entry (1 file) to the .tsx entry that actually holds the 100 files, matching the suggestion exactly.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

…ap helpers

pushElixirSequenceItems and pushElixirMapValues were pushing items in
forward order onto the LIFO worklist stack, causing tuple/list/map
parameters to be emitted in reverse source order (e.g. {x, _y} → ['_y',
'x'] instead of ['x', '_y']). The fix collects items then pushes them in
reverse so the LIFO pop restores source order, matching the native engine.
@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

127 functions changed79 callers affected across 27 files

  • CollectContext.rootDir in src/domain/graph/builder/helpers.ts:81 (0 transitive callers)
  • CollectContext.includeRegexes in src/domain/graph/builder/helpers.ts:82 (0 transitive callers)
  • CollectContext.excludeRegexes in src/domain/graph/builder/helpers.ts:83 (0 transitive callers)
  • CollectContext.hasGlobFilters in src/domain/graph/builder/helpers.ts:84 (0 transitive callers)
  • CollectContext.extraIgnore in src/domain/graph/builder/helpers.ts:85 (0 transitive callers)
  • CollectContext.visited in src/domain/graph/builder/helpers.ts:86 (0 transitive callers)
  • isSymlinkLoop in src/domain/graph/builder/helpers.ts:90 (2 transitive callers)
  • readDirSafe in src/domain/graph/builder/helpers.ts:106 (2 transitive callers)
  • isCollectableSourceFile in src/domain/graph/builder/helpers.ts:116 (2 transitive callers)
  • walkCollect in src/domain/graph/builder/helpers.ts:123 (1 transitive callers)
  • collectFiles in src/domain/graph/builder/helpers.ts:170 (0 transitive callers)
  • emitTypeOnlySymbolEdges in src/domain/graph/builder/incremental.ts:311 (4 transitive callers)
  • emitEdgesForImport in src/domain/graph/builder/incremental.ts:341 (5 transitive callers)
  • buildImportEdges in src/domain/graph/builder/incremental.ts:367 (5 transitive callers)
  • buildDeletionResult in src/domain/graph/builder/incremental.ts:515 (3 transitive callers)
  • rebuildEdgesForTargetFile in src/domain/graph/builder/incremental.ts:536 (3 transitive callers)
  • parseReverseDeps in src/domain/graph/builder/incremental.ts:557 (3 transitive callers)
  • emitBarrelImportEdgesForReverseDeps in src/domain/graph/builder/incremental.ts:580 (3 transitive callers)
  • runReverseDepCascade in src/domain/graph/builder/incremental.ts:610 (3 transitive callers)
  • rebuildFile in src/domain/graph/builder/incremental.ts:633 (4 transitive callers)

@carlos-alm carlos-alm force-pushed the refactor/orchestrator-drop-summary-tabular branch from 22ba78f to 7cc64d7 Compare May 28, 2026 19:49
@carlos-alm carlos-alm merged commit 42e3b33 into main May 28, 2026
21 checks passed
@carlos-alm carlos-alm deleted the refactor/orchestrator-drop-summary-tabular branch May 28, 2026 20:34
@github-actions github-actions Bot locked and limited conversation to collaborators May 28, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant