refactor(extractors): shared helpers across language extractors (TS+Rust)#1230
Conversation
… collection Adds shared utilities to src/extractors/helpers.ts in preparation for adoption across language extractors (phase 2): - nodeStartLine: companion to nodeEndLine for the ~108 hand-rolled startPosition.row + 1 literals scattered across extractors - findFirstChildOfTypes: find first child matching any of N types (useful for grammar variants like string vs string_literal) - iterChildren / PUNCTUATION_TOKENS: generator-based child iteration with punctuation skipping, used in elixir/gleam destructuring walks - pushCall / pushImport: centralise Call/Import construction so line derivation stays consistent across extractors - extractSimpleParameters / resolveParamName: uniform parameter extraction with optional type-map sink — collapses boilerplate in the ~16 per-language extractParams helpers Phase 1 of the TS extractor refactor plan (sync.json clusters 1). Additive only — no consumer adoption yet; existing helpers and extractor behaviour unchanged. Consumers updated in phase 2. docs check acknowledged: internal refactor, no doc updates needed.
Phase 2 of the TS extractor refactor plan (sync.json cluster 1). Adopts the helpers extended in 9c8be55 (nodeStartLine, findFirstChildOfTypes, pushCall, pushImport, extractSimpleParameters, stripQuotes) across six language extractors: - r.ts: drop local stripQuotes; use shared stripQuotes/pushCall/ pushImport/findFirstChildOfTypes/nodeStartLine - gleam.ts: use pushCall/pushImport/findFirstChildOfTypes/nodeStartLine; extract pushConstructor helper for the dual-branch data-constructor walk - julia.ts: use pushCall/pushImport/nodeStartLine; collapse Julia param wrapper-type branches via JULIA_PARAM_WRAPPER_TYPES set - java.ts: use pushCall/pushImport/nodeStartLine; collapse extractJavaParameters via extractSimpleParameters with typeMap sink; extract resolveJavaTypeText for the generic_type unwrap pattern - gleam.ts and solidity.ts: extract qualifyWithParent helper in solidity to collapse 6 duplicated `parent ? \`\${parent}.\${name}\` : name` blocks - solidity.ts: use pushCall/pushImport/findFirstChildOfTypes/ nodeStartLine; collapse extractSolParams via extractSimpleParameters - javascript.ts: bulk-replace 43 inline `XXX.startPosition.row + 1` literals with nodeStartLine() calls; replace one stray endPosition literal with nodeEndLine Net -65 lines. No behaviour changes — only call-site collapsing onto the shared helpers (semantics verified by careful inspection of each replacement; pushImport's empty-names fallback matches the previous ad-hoc defaults in each extractor). docs check acknowledged: internal refactor, no doc updates needed.
Convert collectElixirParamIdentifiers from mutual-recursion with collectElixirMapBindings into a single iterative worklist traversal. Map/list/tuple/binary-operator dispatch is now done via three leaf helpers that push child nodes onto the worklist instead of calling back into the main function. This removes the function-level cycle flagged by codegraph (9 -> 8 cycles) without changing extractor semantics. docs check acknowledged: internal refactor only.
Phase 5 of the Rust extractor refactor plan (sync.json cluster 2). Adopts the helpers extended in 0d687c4 (push_call, push_simple_call, push_import, push_type_map_entry, extract_simple_parameters, match_c_family_type_map) across eight language extractors: - cpp.rs: collapse match_cpp_type_map to a one-line delegate of match_c_family_type_map; use push_import/push_simple_call/push_call for include and call sites - cuda.rs: same delegation as cpp.rs; use push_import/push_simple_call/ push_call across include and call_expression handlers - java.rs: use push_type_map_entry for local-variable / formal-parameter bindings; use push_call/push_simple_call for method invocation and object creation; collapse extract_java_parameters to a one-shot extract_simple_parameters call; use push_import for import declaration - javascript.rs: use push_simple_call for new_expression identifier branch; use push_type_map_entry for the confidence-0.9 type entries - julia.rs: use push_simple_call/push_call across identifier and field_expression / scoped_identifier call branches - objc.rs: use push_import for at_import; use push_call for c-call and message-expression handlers (drops redundant is_empty guards) - r_lang.rs: use push_simple_call/push_call across identifier and namespace_operator call branches; use push_import for library/source - solidity.rs: use push_call (drops redundant guard) for call sites; collapse extract_sol_params to a one-shot extract_simple_parameters Net: -207 lines across 8 files, no behavior change. cargo check clean, 324 rust unit tests pass. Pre-existing test failure: tests/engines/parity.test.ts has two failing elixir cases unrelated to this commit (filed as #1227 — regression from commit 5abe6ad in Phase 3).
Convert collect_elixir_param_identifiers from mutual-recursion with collect_elixir_map_bindings into a single iterative worklist traversal. Map/list/tuple/binary-operator dispatch is now done via three leaf helpers (push_elixir_sequence_items, push_elixir_map_values, push_elixir_binary_operator_operands) that push child nodes onto the worklist instead of calling back into the main function. This removes the function-level cycle flagged by codegraph (8 -> 7 cycles) and mirrors the TS refactor in 5abe6ad without changing extractor semantics. docs check acknowledged: internal refactor only.
|
Claude finished @carlos-alm's task —— View job PR Review Complete ✅Final AssessmentRoot Cause AnalysisProblem Being Solved: Code deduplication - extractors had 108 hand-rolled Why This Approach: Creates shared helper modules ( Risk Assessment: Low risk - pure internal refactoring with no behavior changes. Systematic cleanup of technical debt. Backlog Compliance
Critical ConcernsNone identified. This is a well-executed refactoring that: ✅ Passes all safety checks: No test modifications, no configuration changes, no masking of real issues Final Recommendation
Minor Note: Low problem-fit score (2/5) reflects that this is infrastructure work rather than user-facing features. That's appropriate for position #1 in a 10-PR cleanup series. |
Codegraph Impact Analysis141 functions changed → 153 callers affected across 18 files
|
Greptile SummaryThis PR introduces two new shared helper modules (
Confidence Score: 5/5Safe to merge; the refactoring is mechanical and the behavioural edge cases are correctly handled in the final commit. Every output-push helper preserves existing null/empty guards. The Elixir iterative worklist correctly pushes children in reverse order so document order is maintained on pop. Java's handleJavaLocalVarDecl uses Map.set directly rather than setTypeMapEntry, intentionally keeping last-wins semantics for local variable declarations. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph TS["TypeScript extractors"]
ETS[elixir.ts] --> HTS
GTS[gleam.ts] --> HTS
JTS[java.ts] --> HTS
JSTS[javascript.ts] --> HTS
JULTS[julia.ts] --> HTS
RTS[r.ts] --> HTS
SOLTS[solidity.ts] --> HTS
HTS[helpers.ts\npushCall / pushImport\nextractSimpleParameters\nsetTypeMapEntry\niterChildren]
end
subgraph RS["Rust extractors"]
ERS[elixir.rs] --> HRS
CPP[cpp.rs] --> HRS
CU[cuda.rs] --> HRS
JRS[java.rs] --> HRS
JSRS[javascript.rs] --> HRS
JULRS[julia.rs] --> HRS
ORS[objc.rs] --> HRS
RRS[r_lang.rs] --> HRS
SOLRS[solidity.rs] --> HRS
HRS[helpers.rs\npush_call / push_import\nextract_simple_parameters\npush_type_map_entry\nmatch_c_family_type_map]
end
HTS -. "mirrors" .-> HRS
Reviews (8): Last reviewed commit: "fix(extractors/java): use last-wins for ..." | Re-trigger Greptile |
…er patterns
The iterative-worklist refactor in both the TS and Rust engines pushed
list/tuple children onto the LIFO stack in forward (left-to-right) order,
causing pop() to emit them in reversed document order. For example, the
pattern {x, _y} was yielding ['_y', 'x'] instead of ['x', '_y'].
Fix: iterate children in reverse when pushing onto the stack so that
pop() restores the original document order. The same reverse-push
pattern is applied to pushElixirMapValues / push_elixir_map_values for
multi-binding map patterns.
Also remove the unused `bool` return from `match_c_family_type_map`
(both current callers discard the value).
Fixes the CI failure introduced by the cycle-break refactor.
|
All three Greptile concerns addressed in commit 4fc7913:
|
|
Greptile feedback addressed in two commits:
|
| function extractJavaParameters( | ||
| paramListNode: TreeSitterNode | null, | ||
| typeMap?: Map<string, TypeMapEntry>, | ||
| ): SubDeclaration[] { | ||
| const params: SubDeclaration[] = []; | ||
| if (!paramListNode) return params; | ||
| for (let i = 0; i < paramListNode.childCount; i++) { | ||
| const param = paramListNode.child(i); | ||
| if (!param) continue; | ||
| if (param.type === 'formal_parameter' || param.type === 'spread_parameter') { | ||
| const nameNode = param.childForFieldName('name'); | ||
| if (nameNode) { | ||
| params.push({ name: nameNode.text, kind: 'parameter', line: param.startPosition.row + 1 }); | ||
| if (typeMap) { | ||
| const typeNode = param.childForFieldName('type'); | ||
| if (typeNode) { | ||
| const typeName = | ||
| typeNode.type === 'generic_type' ? typeNode.child(0)?.text : typeNode.text; | ||
| if (typeName) typeMap.set(nameNode.text, { type: typeName, confidence: 0.9 }); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| return params; | ||
| return extractSimpleParameters(paramListNode, { | ||
| paramTypes: JAVA_PARAM_TYPES, | ||
| typeMap, | ||
| resolveType: resolveJavaTypeText, | ||
| }); | ||
| } |
There was a problem hiding this comment.
typeMap write semantics silently changed from last-wins to first-wins
extractJavaParameters now delegates to extractSimpleParameters → recordParamType → setTypeMapEntry, which only writes when confidence > existing.confidence. At the uniform confidence of 0.9, the first method processed wins and all later methods with a colliding parameter name are silently dropped. The old code used typeMap.set(nameNode.text, ...) which always overwrote, so the last method won.
handleJavaLocalVarDecl (line 277) was also changed from ctx.typeMap?.set(…) to setTypeMapEntry(…, 0.9), giving the same first-wins behaviour. If a parameter x: Foo is seen before a local x: Bar, the local variable type is now suppressed — the opposite of the previous behaviour. The PR claims no behavioural change, but this is a real semantic difference that can produce wrong type-resolution results in files with same-named parameters across methods.
There was a problem hiding this comment.
Fixed in commit abd318f. The handleJavaLocalVarDecl path now uses Map.set (last-wins) rather than setTypeMapEntry (first-wins at equal confidence), restoring the original pre-refactor behavior. Root cause: setTypeMapEntry at confidence 0.9 let the constructor parameter 'repo: UserRepository' block the local variable 'InMemoryUserRepository repo' in createDefault() from updating the typeMap. This caused repo.findById() etc. to incorrectly resolve to the interface method instead of the concrete implementation — 3 false-positive call edges, dropping Java precision from 70% to 66.7% on CI (all platforms). Local variable declarations are method-scoped and should override same-named entries from earlier scopes, so last-wins is the correct semantics here.
…LocalVarDecl setTypeMapEntry (first-wins on tie) blocked local variable types from overriding same-named constructor parameters at equal confidence (0.9). In UserService, the constructor parameter 'repo: UserRepository' was set first, preventing the local variable 'InMemoryUserRepository repo' in createDefault() from updating the typeMap, which caused repo.findById() etc. to resolve to the interface method (false positive) instead of the concrete implementation. Local variable declarations are method-scoped and should use last-wins semantics so a concrete type assigned in one method doesn't get blocked by an earlier abstract type from a different scope.
|
Fixed CI failure (Java precision 66.7% < 70% threshold) in commit abd318f. The root cause was the typeMap write-semantics change in handleJavaLocalVarDecl (commit 3daa2a1): switching from Map.set (last-wins) to setTypeMapEntry (first-wins at equal confidence) let the constructor parameter 'repo: UserRepository' block the local variable 'InMemoryUserRepository repo' in createDefault() from updating the typeMap. This caused repo.findById(), repo.save(), and repo.delete() in other methods to resolve to the interface methods (false positives) instead of the concrete implementation — 3 false-positive edges, dropping precision from 70% to 66.7% on all CI platforms. Fix: reverted handleJavaLocalVarDecl to use direct Map.set (last-wins). Local variable declarations are method-scoped and must be able to override same-named entries from other scopes. extractJavaParameters (via extractSimpleParameters/setTypeMapEntry) is left unchanged since parameter-to-parameter collisions across different methods don't cause this particular failure. |
Summary
helpers.ts,helpers.rs); no behavioural changeCommits
Context
Part of the Titan Paradigm cleanup pass (see
.codegraph/titan/TITAN_REPORT.md). Merge order: this PR is #1 of 10 (mergeOrder position: 1).Caveats
Test plan
codegraph stats)