refactor(scripts): separate config from execution in benchmarking scripts by carlos-alm · Pull Request #1235 · optave/ops-codegraph-tool

carlos-alm · 2026-05-27T00:32:43Z

Summary

Separates config from execution in scripts/token-benchmark.ts
Extracts scripts/lib/bench-config.ts helper module
Addresses the worst scripts/ offenders from gauntlet

Commits

7f99ece: refactor(scripts): separate config from execution in benchmarking scripts

Context

Part of the Titan Paradigm cleanup pass (see .codegraph/titan/TITAN_REPORT.md). Merge order: this PR is #7 of 10 (mergeOrder position: 7).

Caveats

WASM grammars not available in dev worktree — CI will run full test matrix

Test plan

CI passes (lint, build, full test matrix)
Verify no new cycles introduced (codegraph stats)

…ipts

claude · 2026-05-27T00:32:58Z

Claude finished @carlos-alm's task —— View job

Code Review Analysis

Analyzing PR #1235: refactor(scripts): separate config from execution in benchmarking scripts

Todo List

Read changed files to understand the refactoring
Phase 1: Root Cause Analysis
Phase 2: Configuration Change Detection
Phase 3: Test Integrity Check
Phase 4: Code Quality Assessment
Phase 5: Backlog Compliance Evaluation
Provide structured review feedback and rating

Root Cause Analysis

Problem Being Solved: This PR extracts common benchmark configuration logic from token-benchmark.ts into a shared scripts/lib/bench-config.ts module. The refactoring addresses technical debt identified in the Titan Paradigm cleanup (PR #7 of 10).

Why This Approach: The approach is justified - token-benchmark.ts originally contained 583 lines of mixed config/execution logic. Separating concerns into a dedicated config module allows reuse across the 4 benchmark scripts mentioned in comments. This follows good separation of concerns principles.

Risk Assessment: Low risk. This is a pure refactoring with no behavioral changes. The extracted config logic maintains identical functionality while improving code organization and enabling reuse.

Backlog Compliance

Zero-dep: ✓ — No new runtime dependencies added
Foundation-aligned: ✓ — Pure refactoring that improves code organization without violating any principles
Problem-fit: 3 — Useful but doesn't directly address the core "lost AI" problem. This is internal cleanup
Breaking: No — Pure refactoring with no external API changes
Tier: 1

Code Quality Assessment

✅ Strengths

Clean separation of concerns: Configuration logic properly extracted from execution logic
Security-conscious: Strong input validation with PKG_NAME_RE and PKG_VERSION_RE patterns to prevent shell injection
Error handling: Robust error handling with retries, exponential backoff, and proper cleanup
Documentation: Well-documented functions with clear JSDoc comments
Type safety: Full TypeScript with proper type annotations
Consistent patterns: Follows established codebase patterns for CLI argument parsing

🔍 Technical Details

Line reduction: Net reduction of 22 lines (285 deletions vs 263 additions) shows effective consolidation
Reusability: The extracted bench-config.ts can be used by the other 3 benchmark scripts
Shell safety: Proper handling of Windows npm.cmd vs Unix npm with NPM_SHELL constant
Resource management: Proper cleanup functions for temporary directories in npm mode

⚠️ Minor Observations

Pattern validation: The regex patterns PKG_NAME_RE and PKG_VERSION_RE are conservative, which is good for security
Error messages: Clear and actionable error messages throughout
Platform handling: Proper Windows/Unix compatibility for npm commands

Configuration Change Detection

✅ No concerning configuration changes detected. This is pure code refactoring without any relaxed linting rules, disabled checks, or weakened validation.

Test Integrity

✅ No test modifications or integrity concerns. The refactoring maintains existing behavior and doesn't modify any test files or expectations.

Critical Concerns

None identified. This is a well-executed refactoring that improves code organization without introducing risks.

Final Recommendation

Rating: ⭐⭐⭐⭐ (4/5)
Action: APPROVE
Reasoning: This is a solid refactoring that reduces technical debt, follows security best practices, and enables code reuse across benchmark scripts. The separation of concerns is clean, error handling is robust, and there are no behavioral changes. Minor deduction only because this is internal cleanup rather than user-facing functionality.

greptile-apps · 2026-05-27T00:35:48Z

Greptile Summary

This PR refactors scripts/token-benchmark.ts and scripts/lib/bench-config.ts by extracting the large resolveBenchmarkSource function into focused helpers and splitting the monolithic main loop into dedicated composable functions.

bench-config.ts: Decomposed into repoRoot, resolveBenchmarkSourceLocal, npmInstallWithRetries, installCodegraphPackage, detectNativePlatformKey, installNativePackage, and installTransformers; shared retry logic eliminates duplication between codegraph and native package installs.
token-benchmark.ts: The ~180-line main loop is replaced by runSessionsForMode, medianForRuns, computeSavings, runIssueExperiment, and computeAggregate, each with a single clear responsibility.

Confidence Score: 5/5

Safe to merge — this is a pure refactoring that preserves all existing behavior with no logic changes introduced.

The refactoring extracts helper functions from two large scripts without altering any control flow, error handling, or output format. Retry logic, cleanup paths, and temp-dir lifecycle are all preserved one-to-one from the original. The only finding is an unused safeVersion field in the installCodegraphPackage return type, which has no runtime impact.

No files require special attention.

Important Files Changed

Filename	Overview
scripts/lib/bench-config.ts	Decomposed resolveBenchmarkSource into focused helper functions; retry logic unified via npmInstallWithRetries; behavior preserved from original.
scripts/token-benchmark.ts	Extracted runSessionsForMode, medianForRuns, computeSavings, runIssueExperiment, and computeAggregate from main; logic is equivalent to the original inline code.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[resolveBenchmarkSource] --> B{--npm flag?}
    B -- No --> C[resolveBenchmarkSourceLocal]
    C --> C1[repoRoot / read package.json / return srcDir + noop cleanup]
    B -- Yes --> D[installCodegraphPackage]
    D --> D1[assertSafePkgVersion / mkdtemp / write package.json]
    D1 --> D2[npmInstallWithRetries]
    D2 -->|success| D3[read installed package.json / return tmpDir pkgDir installedPkg]
    D2 -->|all retries failed| D4[rmSync tmpDir / throw Error]
    D3 --> E[installNativePackage]
    E --> E1[detectNativePlatformKey]
    E1 --> E2[npmInstallWithRetries native pkg]
    E2 -->|error| E3[log warning / swallow]
    D3 --> F[installTransformers]
    F --> F1[execFileSync hf install]
    F1 -->|error| F2[log warning / swallow]
    E --> G[prefer dist/ over src/ / return version srcDir cleanup]
    F --> G

_{Reviews (6): Last reviewed commit: "Merge branch 'main' into refactor/titan-..." | Re-trigger Greptile}

greptile-apps · 2026-05-27T00:35:53Z

+	fs.writeFileSync(path.join(tmpDir, 'package.json'), JSON.stringify({ private: true }));

+	try {
+		await npmInstallWithRetries(`@optave/codegraph@${safeVersion}`, tmpDir, 5, 'Attempt');


Redundant "Attempt attempt N" log message

npmInstallWithRetries formats its retry log as ${label} attempt ${attempt}. When called with label = 'Attempt' the output becomes " Attempt attempt 1 failed, retrying in 15s..." — doubling the word. The original inline code just printed " Attempt 1 failed...". The native-package call uses 'Native install', which composes cleanly (" Native install attempt 1 failed..."), so using a noun phrase like 'Codegraph install' (or an empty label with a different template) would match the original intent for this call.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fixed — extracted maxRetries = 5 as a named constant and changed the label from 'Attempt' to 'Codegraph install' so the log reads "Codegraph install attempt 1 failed..." instead of "Attempt attempt 1 failed...".

greptile-apps · 2026-05-27T00:35:55Z

+	try {
+		await npmInstallWithRetries(`@optave/codegraph@${safeVersion}`, tmpDir, 5, 'Attempt');
+	} catch (err) {
+		fs.rmSync(tmpDir, { recursive: true, force: true });
+		throw new Error(`Failed to install @optave/codegraph@${safeVersion} after 5 attempts: ${(err as Error).message}`);
+	}


Hardcoded retry count in error message diverges from the call site

The error string says "after 5 attempts" but the actual limit is supplied as a parameter to npmInstallWithRetries. If someone changes the call site (e.g., to 3 retries for faster CI), the error message will silently lie. Capture the count from a named constant so the call and the message stay in sync.

Suggested change

try {

await npmInstallWithRetries(`@optave/codegraph@${safeVersion}`, tmpDir, 5, 'Attempt');

} catch (err) {

fs.rmSync(tmpDir, { recursive: true, force: true });

throw new Error(`Failed to install @optave/codegraph@${safeVersion} after 5 attempts: ${(err as Error).message}`);

}

const maxRetries = 5;

try {

await npmInstallWithRetries(`@optave/codegraph@${safeVersion}`, tmpDir, maxRetries, 'Codegraph install');

} catch (err) {

fs.rmSync(tmpDir, { recursive: true, force: true });

throw new Error(`Failed to install @optave/codegraph@${safeVersion} after ${maxRetries} attempts: ${(err as Error).message}`);

}

Fixed — extracted const maxRetries = 5 at the call site so the error message uses ${maxRetries} rather than a hardcoded literal. Both issues addressed in the same commit.

github-actions · 2026-05-27T00:40:34Z

Codegraph Impact Analysis

18 functions changed → 48 callers affected across 13 files

repoRoot in scripts/lib/bench-config.ts:129 (13 transitive callers)
resolveBenchmarkSourceLocal in scripts/lib/bench-config.ts:138 (25 transitive callers)
cleanup in scripts/lib/bench-config.ts:152 (4 transitive callers)
npmInstallWithRetries in scripts/lib/bench-config.ts:160 (13 transitive callers)
installCodegraphPackage in scripts/lib/bench-config.ts:186 (25 transitive callers)
detectNativePlatformKey in scripts/lib/bench-config.ts:207 (12 transitive callers)
installNativePackage in scripts/lib/bench-config.ts:227 (25 transitive callers)
installTransformers in scripts/lib/bench-config.ts:261 (25 transitive callers)
resolveBenchmarkSource in scripts/lib/bench-config.ts:289 (27 transitive callers)
runSessionsForMode in scripts/token-benchmark.ts:411 (3 transitive callers)
medianForRuns in scripts/token-benchmark.ts:432 (3 transitive callers)
medianOf in scripts/token-benchmark.ts:435 (3 transitive callers)
computeSavings in scripts/token-benchmark.ts:449 (3 transitive callers)
runIssueExperiment in scripts/token-benchmark.ts:468 (2 transitive callers)
computeAggregate in scripts/token-benchmark.ts:501 (2 transitive callers)
sum in scripts/token-benchmark.ts:507 (3 transitive callers)
pct in scripts/token-benchmark.ts:512 (14 transitive callers)
main in scripts/token-benchmark.ts:530 (1 transitive callers)

…osition-scripts

…dback

…ch-config

carlos-alm · 2026-05-27T05:23:41Z

@greptileai

…ipts' into refactor/titan-decomposition-scripts

)

carlos-alm · 2026-05-27T07:30:09Z

Fixed the remaining inconsistency flagged in the summary: extracted const maxRetries = 5 in installNativePackage so it matches the pattern already used in installCodegraphPackage.

@greptileai

refactor(scripts): separate config from execution in benchmarking scr…

08780ba

…ipts

greptile-apps Bot reviewed May 27, 2026

View reviewed changes

carlos-alm added 3 commits May 26, 2026 23:15

Merge remote-tracking branch 'origin/main' into refactor/titan-decomp…

254c566

…osition-scripts

Merge remote-tracking branch 'origin/main' into fix/1235-greptile-fee…

3b44386

…dback

fix(scripts): use named constant for retry count and fix label in ben…

934be71

…ch-config

carlos-alm added 2 commits May 27, 2026 01:28

Merge remote-tracking branch 'origin/refactor/titan-decomposition-scr…

a73fbcc

…ipts' into refactor/titan-decomposition-scripts

fix(scripts): extract maxRetries constant in installNativePackage (#1235

0a3ea36

)

Merge branch 'main' into refactor/titan-decomposition-scripts

5dac193

carlos-alm merged commit 35e1dfb into main May 28, 2026
21 checks passed

carlos-alm deleted the refactor/titan-decomposition-scripts branch May 28, 2026 03:06

github-actions Bot locked and limited conversation to collaborators May 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(scripts): separate config from execution in benchmarking scripts#1235

refactor(scripts): separate config from execution in benchmarking scripts#1235
carlos-alm merged 7 commits into
mainfrom
refactor/titan-decomposition-scripts

carlos-alm commented May 27, 2026

Uh oh!

claude Bot commented May 27, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 27, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot May 27, 2026

Uh oh!

carlos-alm May 27, 2026

Uh oh!

greptile-apps Bot May 27, 2026

Uh oh!

carlos-alm May 27, 2026

Uh oh!

github-actions Bot commented May 27, 2026 •

edited

Loading

Uh oh!

carlos-alm commented May 27, 2026

Uh oh!

carlos-alm commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented May 27, 2026

Summary

Commits

Context

Caveats

Test plan

Uh oh!

claude Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Analysis

Todo List

Root Cause Analysis

Backlog Compliance

Code Quality Assessment

✅ Strengths

🔍 Technical Details

⚠️ Minor Observations

Configuration Change Detection

Test Integrity

Critical Concerns

Final Recommendation

Uh oh!

greptile-apps Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm May 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm May 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codegraph Impact Analysis

Uh oh!

carlos-alm commented May 27, 2026

Uh oh!

carlos-alm commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented May 27, 2026 •

edited

Loading

greptile-apps Bot commented May 27, 2026 •

edited

Loading

github-actions Bot commented May 27, 2026 •

edited

Loading