Skip to content

refactor(extractors): adopt shared child-iteration helpers (grind)#1238

Merged
carlos-alm merged 14 commits into
mainfrom
refactor/titan-adoption-extractors
May 28, 2026
Merged

refactor(extractors): adopt shared child-iteration helpers (grind)#1238
carlos-alm merged 14 commits into
mainfrom
refactor/titan-adoption-extractors

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

Commits

  • c5d1bac: refactor(extractors): adopt iterChildren + PUNCTUATION_TOKENS in elixir pushElixirSequenceItems
  • 4dcf36c: refactor(extractors-rs): adopt shared child-iteration helpers (grind)

Dependency

Based on refactor/titan-abstraction-extractors (PR #1230) which introduces the helpers. Must merge after PR #1230 — change the base to main before merging this PR.

Context

Part of the Titan Paradigm cleanup pass (see .codegraph/titan/TITAN_REPORT.md). Merge order: this PR is #10 of 10 (mergeOrder position: 10).

Caveats

  • WASM grammars not available in dev worktree — CI will run full test matrix

Test plan

… collection

Adds shared utilities to src/extractors/helpers.ts in preparation for
adoption across language extractors (phase 2):

- nodeStartLine: companion to nodeEndLine for the ~108 hand-rolled
  startPosition.row + 1 literals scattered across extractors
- findFirstChildOfTypes: find first child matching any of N types
  (useful for grammar variants like string vs string_literal)
- iterChildren / PUNCTUATION_TOKENS: generator-based child iteration
  with punctuation skipping, used in elixir/gleam destructuring walks
- pushCall / pushImport: centralise Call/Import construction so line
  derivation stays consistent across extractors
- extractSimpleParameters / resolveParamName: uniform parameter
  extraction with optional type-map sink — collapses boilerplate in
  the ~16 per-language extractParams helpers

Phase 1 of the TS extractor refactor plan (sync.json clusters 1).
Additive only — no consumer adoption yet; existing helpers and
extractor behaviour unchanged. Consumers updated in phase 2.

docs check acknowledged: internal refactor, no doc updates needed.
Phase 2 of the TS extractor refactor plan (sync.json cluster 1). Adopts
the helpers extended in 9c8be55 (nodeStartLine, findFirstChildOfTypes,
pushCall, pushImport, extractSimpleParameters, stripQuotes) across six
language extractors:

- r.ts: drop local stripQuotes; use shared stripQuotes/pushCall/
  pushImport/findFirstChildOfTypes/nodeStartLine
- gleam.ts: use pushCall/pushImport/findFirstChildOfTypes/nodeStartLine;
  extract pushConstructor helper for the dual-branch data-constructor
  walk
- julia.ts: use pushCall/pushImport/nodeStartLine; collapse Julia param
  wrapper-type branches via JULIA_PARAM_WRAPPER_TYPES set
- java.ts: use pushCall/pushImport/nodeStartLine; collapse
  extractJavaParameters via extractSimpleParameters with typeMap sink;
  extract resolveJavaTypeText for the generic_type unwrap pattern
- gleam.ts and solidity.ts: extract qualifyWithParent helper in solidity
  to collapse 6 duplicated `parent ? \`\${parent}.\${name}\` : name`
  blocks
- solidity.ts: use pushCall/pushImport/findFirstChildOfTypes/
  nodeStartLine; collapse extractSolParams via extractSimpleParameters
- javascript.ts: bulk-replace 43 inline `XXX.startPosition.row + 1`
  literals with nodeStartLine() calls; replace one stray endPosition
  literal with nodeEndLine

Net -65 lines. No behaviour changes — only call-site collapsing onto the
shared helpers (semantics verified by careful inspection of each
replacement; pushImport's empty-names fallback matches the previous
ad-hoc defaults in each extractor).

docs check acknowledged: internal refactor, no doc updates needed.
Convert collectElixirParamIdentifiers from mutual-recursion with
collectElixirMapBindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers that push child nodes onto the worklist instead of calling
back into the main function. This removes the function-level cycle
flagged by codegraph (9 -> 8 cycles) without changing extractor
semantics. docs check acknowledged: internal refactor only.
Phase 5 of the Rust extractor refactor plan (sync.json cluster 2). Adopts
the helpers extended in 0d687c4 (push_call, push_simple_call, push_import,
push_type_map_entry, extract_simple_parameters, match_c_family_type_map)
across eight language extractors:

- cpp.rs: collapse match_cpp_type_map to a one-line delegate of
  match_c_family_type_map; use push_import/push_simple_call/push_call
  for include and call sites
- cuda.rs: same delegation as cpp.rs; use push_import/push_simple_call/
  push_call across include and call_expression handlers
- java.rs: use push_type_map_entry for local-variable / formal-parameter
  bindings; use push_call/push_simple_call for method invocation and
  object creation; collapse extract_java_parameters to a one-shot
  extract_simple_parameters call; use push_import for import declaration
- javascript.rs: use push_simple_call for new_expression identifier
  branch; use push_type_map_entry for the confidence-0.9 type entries
- julia.rs: use push_simple_call/push_call across identifier and
  field_expression / scoped_identifier call branches
- objc.rs: use push_import for at_import; use push_call for c-call and
  message-expression handlers (drops redundant is_empty guards)
- r_lang.rs: use push_simple_call/push_call across identifier and
  namespace_operator call branches; use push_import for library/source
- solidity.rs: use push_call (drops redundant guard) for call sites;
  collapse extract_sol_params to a one-shot extract_simple_parameters

Net: -207 lines across 8 files, no behavior change. cargo check clean,
324 rust unit tests pass.

Pre-existing test failure: tests/engines/parity.test.ts has two failing
elixir cases unrelated to this commit (filed as #1227 — regression from
commit 5abe6ad in Phase 3).
Convert collect_elixir_param_identifiers from mutual-recursion with
collect_elixir_map_bindings into a single iterative worklist traversal.
Map/list/tuple/binary-operator dispatch is now done via three leaf
helpers (push_elixir_sequence_items, push_elixir_map_values,
push_elixir_binary_operator_operands) that push child nodes onto the
worklist instead of calling back into the main function. This removes
the function-level cycle flagged by codegraph (8 -> 7 cycles) and
mirrors the TS refactor in 5abe6ad without changing extractor
semantics. docs check acknowledged: internal refactor only.
…ir pushElixirSequenceItems

Replaces the inline childCount loop with the shared iterChildren generator
configured with PUNCTUATION_TOKENS, completing phase 1 of the TS extractor
refactor plan (sync.json cluster 1). Behaviour preserved — same nodes are
pushed onto the worklist, just via the shared helper.

docs check acknowledged: internal refactor, no doc updates needed.
Wire forge phase 4 helpers into their consumers:

- find_first_child_of_types: collapse find_child(x, A).or_else(|| find_child(x, B))
  in fsharp.rs handle_application
- iter_children + PUNCTUATION_TOKENS: replace inline punctuation-skip loop in
  javascript.rs first_arg_is_string_literal

Closes 3 dead-ffi helpers extracted by forge phase 4. Semantically identical.
@claude

claude Bot commented May 27, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review Complete ⭐⭐⭐⭐☆

This refactoring PR successfully adopts shared child-iteration helpers across extractors as part of the Titan Paradigm cleanup. The changes are well-executed and improve code consistency.

Root Cause Analysis

Problem Being Solved: Code duplication and inconsistency in child iteration patterns across language extractors. This PR standardizes the use of shared helper functions and constants introduced in PR #1230.

Why This Approach: Using shared helpers eliminates code duplication, improves maintainability, and ensures consistent behavior across language extractors. The changes replace manual iteration logic with proven, shared implementations.

Risk Assessment: Very low risk. These are minor refactoring changes that improve code quality without altering functionality.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies
  • Foundation-aligned: ✓ — Purely internal refactoring, no cloud APIs or external tools
  • Problem-fit: 3 — Code quality improvement but doesn't directly address the core "lost AI" problem
  • Breaking: No — Internal refactoring with no API changes
  • Tier: 1

Code Quality Assessment

Positive Aspects:

  1. Excellent consistency: All three extractors now use the same shared helpers (find_first_child_of_types, iter_children, PUNCTUATION_TOKENS)
  2. Reduced code duplication: Replaced manual if-let chaining in F# extractor with shared helper
  3. Clean implementation: JavaScript extractor adopts both the iterator and constant for better consistency
  4. Proper imports: Elixir extractor correctly imports both helpers needed

Technical Details:

  • F# extractor (fsharp.rs:325): find_first_child_of_types(&func_node, &["identifier", "long_identifier"]) replaces manual if-let chain
  • JavaScript extractor (javascript.rs:937): iter_children(args_node, PUNCTUATION_TOKENS) and PUNCTUATION_TOKENS import
  • Elixir extractor (elixir.ts:8,259): Imports and uses PUNCTUATION_TOKENS and iterChildren helper

No Issues Found:

Minor Considerations:

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Well-executed refactoring that improves code consistency and eliminates duplication. Changes are low-risk, functionally equivalent, and part of a systematic cleanup effort. The dependency chain and merge order are properly managed.

This is exactly the kind of incremental improvement that makes codebases more maintainable without introducing risk.


@greptile-apps

greptile-apps Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adopts the shared child-iteration helpers (iterChildren/PUNCTUATION_TOKENS in TS; iter_children/PUNCTUATION_TOKENS/find_first_child_of_types in Rust) introduced in PR #1230, replacing inline punctuation-filter expressions across the Elixir (TS + Rust), F# (Rust), and JavaScript (Rust) extractors.

  • src/extractors/elixir.ts: pushElixirSequenceItems is rewritten to collect non-punctuation children via iterChildren, then push in reverse — functionally identical to the old hand-rolled backwards loop, plus minor doc-comment polish in pushElixirMapValues (variable renamed valuesparts).
  • crates/…/elixir.rs: Rust push_elixir_sequence_items replaces a hardcoded 5-token guard with PUNCTUATION_TOKENS.contains(&k); the LIFO-preserving reverse iteration is unchanged.
  • crates/…/fsharp.rs and crates/…/javascript.rs: find_first_child_of_types and iter_children replace equivalent two-step find_child.or_else and manual skip-loop idioms respectively.

Confidence Score: 5/5

Safe to merge — all changes are mechanical substitutions of shared helpers for inline idioms with no change to observable output.

Every changed function preserves the same iteration order and filtering logic as before. The Rust PUNCTUATION_TOKENS set matches the TS set exactly. The LIFO-stack reverse-push semantics in both Elixir extractors are intact. The F# find_first_child_of_types replacement is equivalent because the grammar node carries at most one of the two child types. The JS iter_children replacement produces the same first-non-punctuation-child result as the original manual loop.

No files require special attention.

Important Files Changed

Filename Overview
src/extractors/elixir.ts Adopts iterChildren + PUNCTUATION_TOKENS in pushElixirSequenceItems; pushElixirMapValues gets a variable rename only — no logic change.
crates/codegraph-core/src/extractors/elixir.rs One-line guard swap to PUNCTUATION_TOKENS.contains(&k) in push_elixir_sequence_items; reverse-iteration order and LIFO semantics are unchanged.
crates/codegraph-core/src/extractors/fsharp.rs Replaces chained find_child.or_else(find_child) with find_first_child_of_types; semantically equivalent because long_identifier_or_op holds either an identifier or a long_identifier, never both.
crates/codegraph-core/src/extractors/javascript.rs Replaces manual skip-loop in first_arg_is_string_literal with iter_children(args_node, PUNCTUATION_TOKENS).next(); logic and early-return behaviour are preserved.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Before["Before (manual skip loops)"]
        B1["for i in 0..child_count"] --> B2{"kind == '(' || ',' || ')' ?"}
        B2 -- yes --> B1
        B2 -- no --> B3["return kind == 'string' || 'template_string'"]
    end

    subgraph After["After (iter_children helper)"]
        A1["iter_children(args_node, PUNCTUATION_TOKENS)"] --> A2[".next() → first non-punctuation child"]
        A2 --> A3["return kind == 'string' || 'template_string'"]
    end

    subgraph Elixir["pushElixirSequenceItems — TS"]
        E1["iterChildren(node, PUNCTUATION_TOKENS)"] --> E2["collect → items[]"]
        E2 --> E3["push items in reverse onto LIFO stack"]
        E3 --> E4["pop() yields left-to-right"]
    end

    subgraph ElixirRust["push_elixir_sequence_items — Rust"]
        R1["for i in (0..count).rev()"] --> R2{"PUNCTUATION_TOKENS.contains(k) ?"}
        R2 -- yes --> R1
        R2 -- no --> R3["stack.push(c)"]
        R3 --> R4["pop() yields left-to-right"]
    end
Loading

Reviews (9): Last reviewed commit: "Merge branch 'main' into refactor/titan-..." | Re-trigger Greptile

@github-actions

github-actions Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

5 functions changed10 callers affected across 4 files

  • push_elixir_sequence_items in crates/codegraph-core/src/extractors/elixir.rs:223 (3 transitive callers)
  • handle_application in crates/codegraph-core/src/extractors/fsharp.rs:299 (1 transitive callers)
  • first_arg_is_string_literal in crates/codegraph-core/src/extractors/javascript.rs:935 (3 transitive callers)
  • pushElixirSequenceItems in src/extractors/elixir.ts:261 (3 transitive callers)
  • pushElixirMapValues in src/extractors/elixir.ts:276 (3 transitive callers)

…er patterns

The iterative-worklist refactor in both the TS and Rust engines pushed
list/tuple children onto the LIFO stack in forward (left-to-right) order,
causing pop() to emit them in reversed document order.  For example, the
pattern {x, _y} was yielding ['_y', 'x'] instead of ['x', '_y'].

Fix: iterate children in reverse when pushing onto the stack so that
pop() restores the original document order.  The same reverse-push
pattern is applied to pushElixirMapValues / push_elixir_map_values for
multi-binding map patterns.

Also remove the unused `bool` return from `match_c_family_type_map`
(both current callers discard the value).

Fixes the CI failure introduced by the cycle-break refactor.
#1238)

pushElixirSequenceItems and pushElixirMapValues pushed children onto the
worklist left-to-right, but the worklist is LIFO, so items were processed
in reverse — causing tuple {x, _y} to yield ['_y', 'x'] instead of
['x', '_y']. Apply the same reverse-order push strategy already used in
pushElixirBinaryOperatorOperands (right-before-left so left pops first).

Fixes the 'Elixir engines must extract bound identifiers' parity assertion
that was failing in CI.

docs check acknowledged: internal algorithm fix, no API or language support changes.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

CI was red due to a LIFO ordering bug introduced by the stack-based worklist refactor on the base branch. Both pushElixirSequenceItems (TS) and push_elixir_sequence_items (Rust) pushed tuple/list children left-to-right onto a LIFO stack, so they were processed in reverse — causing {x, _y} to yield ['_y', 'x'] instead of ['x', '_y']. Fixed by collecting items first and pushing in reverse order so they pop in source order. Same fix applied to pushElixirMapValues / push_elixir_map_values for consistent map-pair ordering. All parity tests and the absolute ordering assertion now pass locally.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

Comment thread src/extractors/elixir.ts
Comment on lines 276 to 298
@@ -283,11 +287,14 @@ function pushElixirMapValues(node: TreeSitterNode, stack: TreeSitterNode[]): voi
for (let p = 0; p < pair.childCount; p++) {
const part = pair.child(p);
if (!part || part.type === 'keyword') continue;
stack.push(part);
parts.push(part);
}
}
}
}
for (let i = parts.length - 1; i >= 0; i--) {
stack.push(parts[i] as TreeSitterNode);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 TS/Rust divergence in pushElixirMapValues ordering

The TypeScript pushElixirMapValues now collects parts and pushes them in reverse so items are processed left-to-right off the LIFO stack, matching the rationale added to pushElixirSequenceItems. The mirror Rust function push_elixir_map_values (elixir.rs lines 239–257) was not updated — it still pushes directly to the stack in forward order, which means map-value bindings are popped and processed right-to-left in Rust but left-to-right in TS. Any caller that cares about source ordering (e.g. preserving parameter-list position) will produce differently-ordered output from the two extractors for the same Elixir map-destructuring pattern.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: the Rust push_elixir_map_values now collects values in source order and pushes in reverse (same as TS), so both engines produce left-to-right ordering. Also adopted iter_children(PUNCTUATION_TOKENS) in Rust push_elixir_sequence_items to complete the helper adoption. Applied in merge commit 09d8d3c.

Elixir extractor conflicts resolved, adopting our changes with improvements:
- Rust push_elixir_sequence_items: adopt iter_children(PUNCTUATION_TOKENS)
  instead of inline check, completing helper adoption in elixir.rs
- Rust push_elixir_map_values: integrate reverse-ordering fix from base branch
- Rust helpers.rs: integrate match_c_family_type_map () return-type cleanup
- TS pushElixirSequenceItems: keep iterChildren + PUNCTUATION_TOKENS adoption
- TS pushElixirMapValues: keep parts-collect + reverse-push ordering
@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

Base automatically changed from refactor/titan-abstraction-extractors to main May 27, 2026 21:49
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Resolved merge conflicts with main (PR #1230 has merged). Key resolutions:

  • push_elixir_sequence_items (Rust): kept the iter_children(node, PUNCTUATION_TOKENS) helper adoption from this PR over main's manual loop
  • pushElixirSequenceItems (TS): same — kept the iterChildren/PUNCTUATION_TOKENS adoption
  • pushElixirMapValues (TS): kept variable name parts from this PR (functionally identical)
  • helpers.rs: kept main's blank /// doc comment line
  • Retargeted PR base from refactor/titan-abstraction-extractors to main
  • All tests pass locally (2770 passed), lint clean

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@claude

@claude

claude Bot commented May 28, 2026

Copy link
Copy Markdown

Claude encountered an error —— View job


I'll analyze this and get back to you.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Additional fix: the initial merge used iter_children(node, PUNCTUATION_TOKENS) in the Rust push_elixir_sequence_items, but this causes a lifetime error (E0597/E0621) because the function is called with a node popped off the LIFO stack — the node doesn't outlive the stack borrow. Reverted to a manual reverse loop using PUNCTUATION_TOKENS.contains(&k) instead, which achieves the same punctuation filtering without the lifetime constraint. Greptile's review confirmed the PUNCTUATION_TOKENS superset is safe for Elixir list/tuple nodes.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 4b489cb into main May 28, 2026
27 checks passed
@carlos-alm carlos-alm deleted the refactor/titan-adoption-extractors branch May 28, 2026 05:41
@github-actions github-actions Bot locked and limited conversation to collaborators May 28, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant