Skip to content

fix(c-lsp): cap expression type evaluation steps#323

Merged
DeusData merged 1 commit into
DeusData:mainfrom
romanornr:upstream/c-lsp-eval-step-budget
May 30, 2026
Merged

fix(c-lsp): cap expression type evaluation steps#323
DeusData merged 1 commit into
DeusData:mainfrom
romanornr:upstream/c-lsp-eval-step-budget

Conversation

@romanornr
Copy link
Copy Markdown
Contributor

Summary

  • Add a per-file expression-evaluation step budget to the C/C++ LSP resolver
  • Return unknown after the budget is exhausted instead of allowing pathological expression/type lookup loops to hang indexing
  • Keep the existing recursion-depth guard, but also guard the non-stack-overflow case where evaluation repeatedly walks expensive C++ expression/type-resolution paths

Why

While indexing a large Bitcoin-derived C++ codebase, one worker could hang indefinitely in c_eval_expr_type() / c_eval_expr_type_inner() during definition extraction. The minimal repro was a single C++ test file with many versionbits-style checks; GDB showed repeated C++ expression evaluation through member/type lookup rather than a crash.

This guard trades a small amount of precision in pathological files for keeping repository indexing bounded and useful.

Test plan

  • make -f Makefile.cbm test -j$(nproc)
  • Fresh fast index of the large C++ repro repository completed successfully after deleting its cache DB:
    • 2179 files
    • 29,766 nodes
    • ~91k edges
    • completed in ~8.6s

Related

Follow-up to #322, which fixes a separate C++ LSP crash path.

@romanornr romanornr force-pushed the upstream/c-lsp-eval-step-budget branch from 3a5e9bf to 915ccb9 Compare May 8, 2026 00:45
@DeusData DeusData added bug Something isn't working stability/performance Server crashes, OOM, hangs, high CPU/memory parsing/quality Graph extraction bugs, false positives, missing edges labels May 8, 2026
@DeusData DeusData merged commit 926eb7f into DeusData:main May 30, 2026
@DeusData
Copy link
Copy Markdown
Owner

Thank you again, @romanornr! 🙏 Great follow-up to #322. The insight that a recovery-mode C++ AST can drive member/type lookup repeatedly without increasing recursion depth — so the existing eval_depth guard alone can't catch it — is exactly the kind of subtle failure mode that's hard to spot until a real codebase (the Bitcoin-derived one) hangs on it. A generous per-file step budget that degrades to unknown is the right trade-off: best-effort type resolution should never be able to hang indexing. I verified eval_steps resets per file (fresh CLSPContext + c_lsp_init memset at each extract site), so there's no cross-file degradation.

Merged via squash (926eb7f9) — authorship preserved. Build clean, all 3,617 tests pass. Contributes to the stability cluster in #390. Much appreciated!

DeusData added a commit that referenced this pull request May 30, 2026
Issue #355 reported a segfault extracting a macro-heavy xxhash C header
(dhw/xx_hash.h, not available to us). Add a guard that runs the vendored
xxhash.h (~7.5k lines, same macro-dense family) through C extraction under
ASan. It passes on current main — the C-LSP crash hardening (#322/#323/
#360) appears to cover this class. Kept as a regression guard; #355 stays
open pending the reporter's confirmation/file.
zaruous pushed a commit to zaruous/codebase-memory-mcp that referenced this pull request May 31, 2026
Adds a per-file expression-eval step budget (10k) to the C/C++ LSP resolver so pathological recovery-mode ASTs degrade to unknown instead of hanging indexing (the Bitcoin-derived C++ repro). Verified locally: build clean, all 3,617 tests pass including the full clsp_ suite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working parsing/quality Graph extraction bugs, false positives, missing edges stability/performance Server crashes, OOM, hangs, high CPU/memory

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants