refactor(domain): decompose parser, analysis, and search modules#1236

Merged
carlos-alm merged 6 commits into main from refactor/titan-decomposition-domain
May 28, 2026
Conversation

@carlos-alm

Contributor

Summary

  • Parser: extracts LANGUAGE_REGISTRY iteration + worker boundary helpers
  • Analysis: decomposes module-map; reduces complexity in fn-impact and dependencies
  • Search: decomposes generator; reduces complexity in semantic and hybrid search

Commits

  • d2eab30: refactor(parser): extract LANGUAGE_REGISTRY iteration and worker boundary helpers
  • 6819cd6: refactor(analysis): decompose module-map and reduce complexity in fn-impact and dependencies
  • 4f34404: refactor(search): decompose generator and reduce complexity in semantic and hybrid search

Context

Part of the Titan Paradigm cleanup pass (see .codegraph/titan/TITAN_REPORT.md). Merge order: this PR is #8 of 10 (mergeOrder position: 8).

Note: Plan listed PR #1 (extractors) as a dependency because the helper signatures landed there. Cherry-pick applied cleanly on top of main since the parser changes are independent of the extractor helpers — review separately, but merge order can match plan if reviewers prefer.

Caveats

  • WASM grammars not available in dev worktree — CI will run full test matrix

Test plan

  • CI passes (lint, build, full test matrix)
  • Verify no new cycles introduced (codegraph stats)

…impact and dependencies

Split high-cognitive-complexity functions in the analysis domain into focused
helpers. Worst functions per gauntlet (cog/cyc/maxNesting/halstead) are now
below thresholds.

module-map.ts (statsData cog=31 -> below threshold):
- Extract buildStatsFromNative and buildStatsFromJs branches
- Share false-positive query and quality-score helpers between paths
- aggregateRolesFromNative pulls duplicated role-aggregation code out

fn-impact.ts (bfsTransitiveCallers cog=37 -> below threshold,
              impactAnalysisData cog=27 -> below threshold):
- Extract recordCaller, processFrontierNode, seedInterfaceImplementors
- Extract bfsImportDependents and groupDependentsByLevel

dependencies.ts (bfsShortestPath cog=29, bfsFilePath cog=30,
                 buildTransitiveCallers cog=24 -> all below threshold):
- Extract buildNextCallerFrontier from buildTransitiveCallers
- Extract buildNeighborStmt + visitNeighbor; state collected in struct
- Extract visitFileNeighbor + reconstructFilePath

docs check acknowledged - internal helper extraction, no user-facing changes
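
The dependencies.ts extraction described above (a per-neighbor visit helper, BFS state collected in a struct, and a separate path-reconstruction step) might look roughly like the sketch below. This is an illustration only: the `BfsState` shape, the in-memory adjacency map, and the `reconstructPath` name are assumptions, and the real helpers query the graph database rather than a `Map`.

```typescript
// Hypothetical sketch; types and the adjacency map are assumptions.
type NodeId = number;

// BFS bookkeeping gathered into one struct so helpers take a single argument.
interface BfsState {
  visited: Set<NodeId>;
  parent: Map<NodeId, NodeId>;
  nextFrontier: NodeId[];
}

// Records one neighbor. Returns true when the target has been reached so the
// caller can return early instead of juggling continue/break inline.
function visitNeighbor(state: BfsState, from: NodeId, to: NodeId, target: NodeId): boolean {
  if (state.visited.has(to)) return false;
  state.visited.add(to);
  state.parent.set(to, from);
  if (to === target) return true;
  state.nextFrontier.push(to);
  return false;
}

// Walks parent links back from the target to rebuild the path.
function reconstructPath(parent: Map<NodeId, NodeId>, source: NodeId, target: NodeId): NodeId[] {
  const path: NodeId[] = [target];
  let cur = target;
  while (cur !== source) {
    const prev = parent.get(cur);
    if (prev === undefined) return [];
    path.push(prev);
    cur = prev;
  }
  return path.reverse();
}

// Level-by-level BFS; returns [] when the target is unreachable.
function bfsShortestPath(edges: Map<NodeId, NodeId[]>, source: NodeId, target: NodeId): NodeId[] {
  if (source === target) return [source];
  const state: BfsState = {
    visited: new Set<NodeId>([source]),
    parent: new Map<NodeId, NodeId>(),
    nextFrontier: [],
  };
  let frontier: NodeId[] = [source];
  while (frontier.length > 0) {
    state.nextFrontier = [];
    for (const node of frontier) {
      for (const next of edges.get(node) ?? []) {
        if (visitNeighbor(state, node, next, target)) {
          return reconstructPath(state.parent, source, target);
        }
      }
    }
    frontier = state.nextFrontier;
  }
  return [];
}
```

Moving the loop body into a helper that returns a boolean keeps the nesting in the outer function shallow, which is the kind of change that lowers a cognitive-complexity score without touching traversal order.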
@claude

claude Bot commented May 27, 2026


Claude finished @carlos-alm's task


Root Cause Analysis

Problem Being Solved: Titan Paradigm cleanup pass that decomposes complex functions to improve maintainability and reduce cognitive load across the parser, analysis, and search modules.

Why This Approach: The PR implements function extraction and complexity reduction as part of a systematic refactoring initiative. The changes decompose large modules by extracting helper functions and reducing cognitive complexity in hot-path functions.

Risk Assessment: Low risk refactoring - pure extraction with no semantic changes to existing functionality.


Phase 1: Root Cause Analysis ✅

This PR addresses code complexity as part of the "Titan Paradigm cleanup pass" - a systematic refactoring to reduce cognitive complexity in the codebase. Based on the Titan Paradigm documentation, this targets functions with complexity > 7 as failures that need decomposition.

The changes are legitimate refactoring - extracting helper functions to reduce complexity without changing behavior.


Phase 2: Configuration Change Detection ✅

No concerning configuration changes detected:

  • ✅ No biome/linting rule modifications
  • ✅ No test timeout or coverage threshold changes
  • ✅ No build configuration modifications
  • ✅ No TypeScript strict mode changes
  • ✅ No module resolution changes

Phase 3: Test Integrity ✅

No test integrity concerns:

  • ✅ No test files modified
  • ✅ No assertions weakened or removed
  • ✅ No test expectations lowered
  • ✅ No eslint-disable comments added

The PR mentions "WASM grammars not available in dev worktree — CI will run full test matrix" which is expected for worktree environments.


Phase 4: Code Quality Assessment ✅

Positive refactoring patterns:

  1. Parser module (parser.ts):

    • ✅ Extracts collectBackfillPending, mergeAnalysisData helper functions
    • ✅ Reduces complexity in ensureWasmTrees by decomposing into smaller functions
    • ✅ Maintains same API contracts
  2. Analysis modules:

    • dependencies.ts: Extracts BFS helpers (buildNextCallerFrontier from buildTransitiveCallers, plus visitNeighbor and visitFileNeighbor)
    • fn-impact.ts: Decomposes bfsTransitiveCallers with helper functions
    • module-map.ts: Extracts stats aggregation helpers for native fast-path
  3. Search modules:

    • generator.ts: Extracts embedding text preparation helpers
    • semantic.ts: Better separation of concerns for multi-search RRF
    • hybrid.ts: Cleaner fusion logic organization
  4. Worker entry: Better visitor setup organization

No concerning patterns found:

  • ✅ No new dependencies added
  • ✅ No security issues introduced
  • ✅ No performance regressions expected
  • ✅ Maintains backward compatibility

Phase 5: Backlog Compliance

Zero-dep: ✅ - No new runtime dependencies added. Pure refactoring using existing tree-sitter/SQLite/AST infrastructure.

Foundation-aligned: ✅ - Fully compliant with all principles:

  • P1: No cloud API calls, maintains always-current analysis
  • P4: Zero-cost core maintained
  • P5: Programmatic API contract preserved
  • P8: No code generation, pure analysis tool

Problem-fit: 4/5 - This directly addresses the core problem by reducing cognitive complexity that causes AI agents to waste tokens re-orienting in complex functions. Decomposed functions are easier for AI to understand and modify correctly.

Breaking: No - Pure internal refactoring with no API signature changes, DB schema changes, or MCP tool contract modifications.

Tier: 1 (zero-dep ✓ + foundation-aligned ✓)


Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: High-quality refactoring that reduces complexity without changing behavior. Part of systematic Titan Paradigm cleanup. Minor concern about verifying no functional regressions in complex extraction logic.

Conditions for approval:

  1. ✅ CI passes with full test matrix
  2. ✅ No new cycles introduced (mentioned in test plan)

This is exactly the type of systematic complexity reduction that makes codebases more maintainable for both humans and AI agents.

@greptile-apps

greptile-apps Bot commented May 27, 2026

Contributor

Greptile Summary

This PR decomposes three large domain modules into focused private helper functions — parser, analysis (module-map, fn-impact, dependencies), and search (generator, semantic, hybrid) — as part of the Titan Paradigm cleanup pass. The changes are pure refactors: no public API signatures change and no new functionality is introduced.

  • Parser: mergeAnalysisData is split into four single-purpose merge helpers; parseAndExtract and several visitor-builder functions are extracted from the monolithic handleParse; ingestNativeResults and backfillNativeDrops are pulled out of parseFilesAuto.
  • Analysis: BFS loop bodies in dependencies.ts become visitNeighbor/visitFileNeighbor/buildNextCallerFrontier; impact-analysis BFS is extracted into bfsImportDependents/groupDependentsByLevel; statsData in module-map.ts delegates to buildStatsFromNative/buildStatsFromJs.
  • Search: buildEmbeddings is decomposed into resolveRoot, loadNodesByFile, prepareEmbeddingTexts, and persistEmbeddings; fuseResults and multiSearchData in hybrid/semantic search receive similarly extracted helpers.
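
As a rough illustration of the fusion decomposition named above, the sketch below splits a rank-fusion loop into `createFusionEntry` and `mergeRankedItem`. It assumes reciprocal rank fusion (the review mentions "multi-search RRF") as the scoring rule; the `RankedItem` and `FusionEntry` shapes and the constant `k = 60` are assumptions, not taken from hybrid.ts or semantic.ts.

```typescript
// Hypothetical sketch; item shapes and the RRF constant are assumptions.
interface RankedItem {
  id: string;
  rank: number; // 1-based position within one result list
}

interface FusionEntry {
  id: string;
  score: number;
  sources: number; // how many lists contributed to this entry
}

const RRF_K = 60; // commonly used RRF constant; assumed, not read from the repo

function createFusionEntry(id: string): FusionEntry {
  return { id, score: 0, sources: 0 };
}

// Adds one ranked hit to the accumulator using the RRF term 1 / (k + rank).
function mergeRankedItem(acc: Map<string, FusionEntry>, item: RankedItem, k: number): void {
  let entry = acc.get(item.id);
  if (!entry) {
    entry = createFusionEntry(item.id);
    acc.set(item.id, entry);
  }
  entry.score += 1 / (k + item.rank);
  entry.sources += 1;
}

// Fuses several ranked lists into one list sorted by descending fused score.
function fuseResults(lists: RankedItem[][], k: number = RRF_K): FusionEntry[] {
  const acc = new Map<string, FusionEntry>();
  for (const list of lists) {
    for (const item of list) mergeRankedItem(acc, item, k);
  }
  return Array.from(acc.values()).sort((a, b) => b.score - a.score);
}
```

With this split, the outer function only iterates and sorts, while each helper has a single branch at most.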

Confidence Score: 5/5

Safe to merge — all extracted helpers are pure decompositions with no observable behavioral changes.

Each extracted function is a mechanical lift of existing inline code with identical logic. Tree disposal in wasm-worker-entry correctly propagates through both the parseAndExtract early-return paths and the handleParse finally block. BFS traversal semantics in dependencies and fn-impact are preserved exactly. The native/JS stat paths in module-map produce the same output shape. No public signatures changed.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/domain/analysis/dependencies.ts | BFS helpers extracted (buildNextCallerFrontier, visitNeighbor, visitFileNeighbor, reconstructFilePath, buildNeighborStmt); behavior is preserved: target-found early-return semantics match the old continue/break pattern exactly. |
| src/domain/analysis/fn-impact.ts | recordCaller, processFrontierNode, seedInterfaceImplementors, bfsImportDependents, groupDependentsByLevel extracted; resolveImplementors guard on expandImplementors call preserved correctly in processFrontierNode. |
| src/domain/analysis/module-map.ts | statsData body split into buildStatsFromNative and buildStatsFromJs; queryFalsePositiveRows/buildFalsePositiveWarnings/computeQualityScore deduplicate the formerly duplicated FP logic across both paths; NativeDatabase type imported. |
| src/domain/parser.ts | mergeAnalysisData split into mergeScalarMetadata/mergeAnalysisArrays/mergeTypeMap/mergeDefinitionAnalysis; ingestNativeResults and backfillNativeDrops extracted from parseFilesAuto; IMPORT_FIELD_RENAMES data table replaces 15 if-chains in patchImports. |
| src/domain/search/generator.ts | buildEmbeddings decomposed into resolveRoot, loadNodesByFile, prepareEmbeddingTexts, persistEmbeddings; node count log now computed from byFile map instead of raw nodes array, which produces the same value. |
| src/domain/search/search/hybrid.ts | fuseResults decomposed with createFusionEntry, mergeRankedItem, toHybridResult; type narrowing cast dropped in favour of direct property access on the existing RankedItem union. |
| src/domain/search/search/semantic.ts | rowVector, checkDimensionMismatch, warnOnSimilarQueries, rankRowsForQuery, fuseRankedHits extracted from multiSearchData; similarityWarnThreshold now read once into a variable before the loop; logic unchanged. |
| src/domain/wasm-worker-entry.ts | handleParse decomposed into parseAndExtract, runVisitorWalk, serializeExtractorOutput, disposeTree, and four visitor-builder functions; disposeTree correctly placed in both parseAndExtract early-return paths and handleParse finally block. |
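
The parser.ts entry mentions a data table replacing a chain of `if` statements in patchImports. A minimal sketch of that pattern follows. The field names in the table are invented for illustration, since the real IMPORT_FIELD_RENAMES entries are not shown on this page, and `patchImport` here is a simplified stand-in for the real function.

```typescript
// Hypothetical sketch; the rename pairs below are made up for illustration.
const IMPORT_FIELD_RENAMES: ReadonlyArray<readonly [string, string]> = [
  ["type_only", "typeOnly"],
  ["wildcard_reexport", "wildcardReexport"],
  ["dynamic_import", "dynamicImport"],
];

// Copies each legacy field to its new name unless the new name is already set.
// One loop over the table replaces one `if` block per field.
function patchImport(imp: Record<string, unknown>): Record<string, unknown> {
  for (const [from, to] of IMPORT_FIELD_RENAMES) {
    if (imp[from] !== undefined && imp[to] === undefined) {
      imp[to] = imp[from];
    }
  }
  return imp;
}
```

Adding a rename then becomes a one-line change to the table, and the function's branch count stays constant however many fields are listed.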

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph parser["parser.ts — parseFilesAuto"]
        PA[parseFilesAuto] --> IR[ingestNativeResults]
        PA --> BND[backfillNativeDrops]
        IR --> BTM[backfillTypeMapBatch]
    end

    subgraph wasm["wasm-worker-entry.ts — handleParse"]
        HP[handleParse] --> PAE[parseAndExtract]
        PAE -->|"null: extractor fails"| DT1[disposeTree]
        PAE -->|"success: tree + symbols"| HP2[handleParse try/finally]
        HP2 --> SV[setupVisitorsLocal]
        HP2 --> RVW[runVisitorWalk]
        HP2 --> SEO[serializeExtractorOutput]
        HP2 -->|"finally"| DT2[disposeTree]
    end

    subgraph analysis["analysis — statsData"]
        SD[statsData] --> jsSections["jsSections"]
        SD -->|"nativeDb present"| BSN[buildStatsFromNative]
        SD -->|"fallback"| BSJ[buildStatsFromJs]
        BSN --> QFP[queryFalsePositiveRows]
        BSN --> CQS[computeQualityScore]
        BSJ --> CMQ[computeQualityMetrics]
        BSJ --> CQS
    end

    subgraph search["search/generator.ts — buildEmbeddings"]
        BE[buildEmbeddings] --> RR[resolveRoot]
        BE --> LNF[loadNodesByFile]
        BE --> PET[prepareEmbeddingTexts]
        BE --> PE[persistEmbeddings]
    end

Reviews (4): Last reviewed commit: "Merge branch 'main' into refactor/titan-..."

Comment on lines +711 to +724
  let symbols: ExtractorOutput | null;
  try {
    const query = _queries.get(entry.id);
    // tree-sitter's Tree/Query are structurally compatible with
    // TreeSitterTree/TreeSitterQuery at runtime; same cast style as
    // parser.ts::wasmExtractSymbols (parser.ts:789).
    symbols = entry.extractor(tree as any, filePath, query as any) ?? null;
  } catch {
    return null;
  }
  if (!symbols) {
    return null;
  }
  return { tree, symbols };

Contributor


P1 WASM tree memory leak when extractor fails

The refactor moved tree creation into parseAndExtract, but when the extractor throws (line 718) or returns null (line 722), the function returns null without calling disposeTree. In handleParse, the finally { disposeTree(tree) } block is only reached when parseAndExtract succeeds — the if (!parsed) return null exit on line 830 bypasses it entirely. In the old code, tree was scoped to handleParse and the outer finally block covered all exit paths. Files with unsupported or crashing extractors will now silently accumulate leaked WASM linear memory in long-running workers.


Contributor Author


Fixed in ab0b7b3 — added disposeTree(tree) before each early-return null path in parseAndExtract (extractor-throws catch block and symbols-null guard). The tree is now always released when the extractor fails, preventing WASM linear memory accumulation in long-running workers.
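
A self-contained sketch of the fixed control flow is shown below, under stated assumptions. `FakeTree` stands in for the WASM tree-sitter tree, `Extractor` is a simplified signature, and the body of `handleParse` is reduced to returning the symbols; only the placement of `disposeTree` follows the description above.

```typescript
// Hypothetical sketch; FakeTree and Extractor are simplified stand-ins.
interface FakeTree {
  delete(): void; // releases the underlying WASM memory in the real tree
}

type Extractor = (tree: FakeTree, filePath: string) => object | null | undefined;

// Best-effort disposal that never throws.
function disposeTree(tree: FakeTree | null): void {
  try {
    tree?.delete();
  } catch {
    // ignore disposal errors
  }
}

// On failure the tree is disposed here, because the caller never sees it.
function parseAndExtract(
  tree: FakeTree,
  filePath: string,
  extractor: Extractor,
): { tree: FakeTree; symbols: object } | null {
  let symbols: object | null;
  try {
    symbols = extractor(tree, filePath) ?? null;
  } catch {
    disposeTree(tree); // extractor threw
    return null;
  }
  if (!symbols) {
    disposeTree(tree); // extractor returned null
    return null;
  }
  return { tree, symbols };
}

// On success, ownership passes to the caller, whose finally block disposes it.
function handleParse(tree: FakeTree, filePath: string, extractor: Extractor): object | null {
  const parsed = parseAndExtract(tree, filePath, extractor);
  if (!parsed) return null;
  try {
    return parsed.symbols;
  } finally {
    disposeTree(parsed.tree);
  }
}
```

The ownership rule this encodes is that whichever function last holds the tree is responsible for releasing it, so each of the three exit paths disposes exactly once.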

@github-actions

github-actions Bot commented May 27, 2026

Contributor

Codegraph Impact Analysis

75 functions changed, 75 callers affected across 32 files

  • buildNextCallerFrontier in src/domain/analysis/dependencies.ts:64 (3 transitive callers)
  • buildTransitiveCallers in src/domain/analysis/dependencies.ts:84 (3 transitive callers)
  • buildNeighborStmt in src/domain/analysis/dependencies.ts:292 (3 transitive callers)
  • visitNeighbor in src/domain/analysis/dependencies.ts:311 (3 transitive callers)
  • bfsShortestPath in src/domain/analysis/dependencies.ts:341 (4 transitive callers)
  • visitFileNeighbor in src/domain/analysis/dependencies.ts:534 (3 transitive callers)
  • reconstructFilePath in src/domain/analysis/dependencies.ts:558 (3 transitive callers)
  • bfsFilePath in src/domain/analysis/dependencies.ts:574 (3 transitive callers)
  • recordCaller in src/domain/analysis/fn-impact.ts:87 (7 transitive callers)
  • processFrontierNode in src/domain/analysis/fn-impact.ts:106 (11 transitive callers)
  • seedInterfaceImplementors in src/domain/analysis/fn-impact.ts:127 (11 transitive callers)
  • bfsTransitiveCallers in src/domain/analysis/fn-impact.ts:143 (16 transitive callers)
  • bfsImportDependents in src/domain/analysis/fn-impact.ts:195 (3 transitive callers)
  • groupDependentsByLevel in src/domain/analysis/fn-impact.ts:227 (3 transitive callers)
  • impactAnalysisData in src/domain/analysis/fn-impact.ts:241 (2 transitive callers)
  • computeQualityMetrics in src/domain/analysis/module-map.ts:166 (3 transitive callers)
  • queryFalsePositiveRows in src/domain/analysis/module-map.ts:336 (5 transitive callers)
  • buildFalsePositiveWarnings in src/domain/analysis/module-map.ts:354 (5 transitive callers)
  • computeQualityScore in src/domain/analysis/module-map.ts:363 (5 transitive callers)
  • aggregateRolesFromNative in src/domain/analysis/module-map.ts:372 (3 transitive callers)

…Extract (#1236)

When the extractor throws or returns null, the tree allocated by
parser.parse is now disposed before returning null, preventing WASM
linear memory accumulation in long-running workers.
@carlos-alm

Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 8ccafa0 into main May 28, 2026
21 checks passed
@carlos-alm carlos-alm deleted the refactor/titan-decomposition-domain branch May 28, 2026 03:52
@github-actions github-actions Bot locked and limited conversation to collaborators May 28, 2026