falkordblite mcp latest by Naseem77 · Pull Request #706 · FalkorDB/code-graph

Naseem77 · 2026-06-09T05:41:08Z

Summary by CodeRabbit

Release Notes

New Features
- Added optional embedded database backend for lighter deployments without external server dependencies.
- Enhanced code search with free-text hybrid ranking for better relevance.
- Added new symbol discovery and code navigation tools for structural exploration.
- Added context-size benchmark tool for performance evaluation.
Documentation
- Updated setup instructions with embedded database configuration and TCP exposure options.
- Expanded environment variable documentation for new database backend settings.
Refactor
- Centralized database connection logic for improved maintainability.
Tests
- Added comprehensive test coverage for new code navigation tools.

…bors Nav baseline rebased onto the #681 (MCP tools) + #698 (analyzer IMPORTS/OVERRIDES edges) confluence, dropping the duplicate analyzer-edges copy (722c8a3, byte-identical to #698's ce4ecd9) and the T13/T14 onboarding-template edits (deferred). Adds get_neighbors (symbol-id neighbor expansion over CALLS/IMPORTS/EXTENDS/OVERRIDES), get_file_neighbors (file-level structural coupling), get_importers/get_overrides spikes, and a hybrid (BM25 + structural) ranking for search_code. Cherry-pick of 799b218. PR-split note: get_file_neighbors + _resolve_file + FILE_NEIGHBOR_RELS form a self-contained trailing block (PR N2); everything else is PR N1. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

structural.py: - search_code: relevance floor — drop files with zero raw lexical signal (name-exact + path-overlap + raw BM25) so a nonsense query returns [] instead of centrality-ranked noise. - _hybrid_components: deterministic representative-symbol selection via a stable sort key (exact query-id match, then src_start, name, src_end) — no longer depends on FalkorDB row order; fixes nondeterministic snippet/name. - find_path: keep the node label (a path is bare nodes with no per-hop relation, so the label is the only type signal); restores the edge-leak guard. - get_file_neighbors: include file_id in both early-return shapes for a consistent schema. tests: - _find_id resolves a symbol id directly from the graph (search_code is file-oriented and no longer returns per-symbol ids) in test_query_tools.py and test_impact_analysis.py, with a uniqueness assertion. - search_code tests use the file-oriented query= API. 53/53 mcp tests pass against live FalkorDB; ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Bridges human-readable names to the integer symbol_id that get_neighbors, impact_analysis and find_path require (those take an id, not a name; search_code returns files). Accepts a simple name or dotted qualname (last segment used); optional file arg flags/orders in-file matches first so a wrong file filter degrades visibly rather than silently routing to an arbitrary same-named symbol. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Benchmarking showed the MCP ask tool failing on every call (the spawned server env carries only FalkorDB coordinates, no LLM key) and, when keyed, returning File-level fuzzy matches rather than the structural answers the nav workflow needs — an LLM round-trip per call for no signal. Expose only the deterministic structural tools; GraphRAG ask stays on the HTTP /api/chat path. #681's ask implementation is left intact (re-enable by re-adding the import). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Validate/coerce limit: reject non-integers, accept stringified ints, and clamp to 1..FIND_SYMBOL_DB_LIMIT so a negative value no longer flips rows[:limit] into a surprising tail slice (_clamp_find_symbol_limit). - Reject empty symbol names up front (empty, whitespace, or a trailing-dot qualname) instead of silently querying for "". - Always include file_match in the response (False when no file filter is requested) so the documented shape is stable, and sort deterministically (in-file matches first, then file/line/name/id) so ordering no longer depends on FalkorDB row order. - Add tests/mcp/test_find_symbol.py covering simple + dotted-name resolution, file disambiguation + ordering, response-shape stability, deterministic order, and limit/empty-name validation. 70/70 mcp tests pass against live FalkorDB; ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai · 2026-06-09T05:41:21Z

📝 Walkthrough

Walkthrough

This PR introduces FalkorDB Lite embedded backend support and expands MCP structural tools. It centralizes database connection logic via factory functions in a new api/db.py module, then refactors all connection creation across the codebase to use these factories. The MCP tools layer gains hybrid search ranking for code queries, three new tools (symbol resolution, neighbor traversal, file coupling), and comprehensive test coverage.

Changes

FalkorDB Lite Backend & Connection Abstraction

Layer / File(s)	Summary
Backend abstraction module and configuration `api/db.py`, `pyproject.toml`, `.env.template`	New `api/db.py` exports factory functions (`create_falkordb()`, `create_async_falkordb()`, `create_redis_connection()`, `create_async_redis_connection()`, `graphrag_connection_kwargs()`) that select between embedded FalkorDBLite and remote FalkorDB based on `CODE_GRAPH_DB_BACKEND`. Adds optional `light` dependency group for `falkordblite>=0.10.0`. Environment template documents new backend selector and lite-specific configuration.
CLI and setup documentation `api/cli.py`, `.env.template`, `README.md`	`ensure-db` CLI now detects and initializes lite backend, emitting JSON success/error payloads. `.env.template` adds `CODE_GRAPH_DB_BACKEND`, `FALKORDB_LITE_PATH`, and optional `FALKORDB_LITE_HOST`/`FALKORDB_LITE_PORT`. `README.md` documents Option C setup with `uv sync --extra light` and environment variable guidance.
Connection factory adoption `api/git_utils/git_graph.py`, `api/graph.py`, `api/info.py`, `api/llm.py`, `api/migrations/per_branch.py`	Refactored all database connection instantiation to use factory functions instead of inline environment-based construction, removing duplicate connection logic across modules. Sync and async paths now consistently route through centralized factories.
Graph module cleanup `api/graph.py`	Parameter `Graph.get_sub_graph(l)` renamed to `limit` and wired into Cypher query; `set_file_coverage` no longer stores unused `_query` return value; `AsyncGraphQuery.close()` ensured to await async cleanup.

MCP Structural Tools Expansion

Layer / File(s)	Summary
Hybrid search ranking infrastructure `api/mcp/tools/structural.py`	Introduces `_hybrid_rank()` combining exact identifier matches, BM25 scoring over tokenized text, path token overlap, call-graph centrality, and path penalties (tests/vendor/legacy dirs). Adds `_relativize()` for path normalization and `_read_snippet()` for bounded file-line extraction; extends `_node_summary()` to support relativized paths, optional labels, and lazy snippet payloads.
New structural tools `api/mcp/tools/structural.py`	Adds `find_symbol(name, project, file, branch, limit)` for Function/Class name resolution with file-scoped matching; `get_neighbors(symbol_id, project, relation, direction, branch, limit)` for unified bidirectional relation traversal; `get_file_neighbors(file, project, branch, limit)` for discovering structurally coupled files via aggregated 1-hop relations. All return snippets and maintain deterministic ordering.
Refactored code search `api/mcp/tools/structural.py`	Replaces prefix-based `search_code(prefix)` with free-text `search_code(query)` using hybrid ranking; returns files with scores and lazy snippets. Updates `impact_analysis` and `find_path` to use enhanced `_node_summary()` with relativized paths and optional labels. Consolidates per-edge tools into unified `get_neighbors`.
MCP tool surface clarification `api/mcp/tools/__init__.py`	Documents that GraphRAG-backed `ask` tool is excluded from MCP surface (available only on HTTP `/api/chat`); only deterministic structural tools are exposed via MCP protocol.
find_symbol test suite `tests/mcp/test_find_symbol.py`	Comprehensive tests for input validation (empty/whitespace/dotted-name rejection, limit clamping/coercion), symbol resolution, dotted qualname equivalence, `file_match` flag behavior (in-file vs. global candidates), deterministic ordering, and MCP server registration.
Query tools test suite `tests/mcp/test_query_tools.py`	Tests refactored `search_code` as free-text query with file-level hit matching, JSON serializability, and repo-relative path validation. Verifies `get_neighbors` equivalence to legacy per-edge tools. Tests `get_file_neighbors` cross-file dependency discovery, path vs. file-id consistency, and empty results. Updates `_find_id` helper to query FalkorDB directly.
Impact analysis test update `tests/mcp/test_impact_analysis.py`	Refactors `_find_id` test helper to resolve symbol IDs via direct FalkorDB Cypher query instead of `search_code`, ensuring uniqueness and proper resource cleanup.

MCP Context Benchmark

Layer / File(s)	Summary
MCP context benchmark `benchmarks/mcp_context_benchmark.py`	New deterministic benchmark comparing MCP-style CodeGraph context (via `find_symbol`, `get_neighbors`, `get_file_neighbors`) vs. plain source-file context. Includes fixed test targets, token estimation, per-question deltas, and JSON summary reporting. CLI accepts `--project`, `--branch`, `--root`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

FalkorDB/code-graph#677: Adds the shared MCP sample_project fixture and test infrastructure (conftest.py, expected.yaml) that the structural tool tests in this PR depend on.
FalkorDB/code-graph#679: Modifies search_code tool in api/mcp/tools/structural.py with a different prefix-search approach; this PR's hybrid free-text implementation directly overlaps.
FalkorDB/code-graph#675: Updates api/graph.py Graph initialization with per-branch naming; this PR refactors the same initialization to use create_falkordb() factories.

Suggested reviewers

gkorland
galshubeli

🐰 Hybrid ranking hops through the code graph,
Lite backends nestle in local caches,
Symbols resolve with deterministic grace,
While neighbors gather 'round the structured place—
A framework built for swift and clear embrace. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 45.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'falkordblite mcp latest' is vague and lacks specificity about the actual changes made in the pull request.	Provide a more descriptive title that clearly summarizes the main changes, such as 'Add FalkorDBLite embedded database backend support with MCP tools' or 'Integrate FalkorDBLite embedded backend and MCP structural query tools'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch naseem/falkordblite-mcp-latest

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

api/git_utils/git_graph.py (1)
197-198: ⚠️ Potential issue | 🔴 Critical

Check AsyncGitGraph.close()—callers still invoke it.

AsyncGitGraph is instantiated in api/index.py and tests/index.py, and both paths explicitly do finally: await git_graph.close(). Since those direct calls still exist, removing AsyncGitGraph.close() would break callers unless the method wasn’t actually removed (or the call sites were updated accordingly).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/git_utils/git_graph.py` around lines 197 - 198, AsyncGitGraph.close() was
removed/changed but callers in api/index.py and tests/index.py still call
finally: await git_graph.close(); restore compatibility by re-adding an async
close(self) -> None method on class AsyncGitGraph that simply awaits the
underlying resource shutdown (e.g., await self.db.aclose()) or by providing an
equivalent async wrapper method with the same name so existing call sites
continue to work; ensure the method signature remains async def close(self) ->
None and references self.db.aclose() to properly close resources.

🧹 Nitpick comments (5)

api/mcp/tools/structural.py (1)

133-133: 💤 Low value

Consider adding strict=True to zip() for safety.

Ruff flags this zip() call without an explicit strict= parameter. Since docs is built directly from files on line 121, the lengths always match, but making this explicit prevents silent bugs if the construction ever changes.
🔒 Proposed fix
-    for f, d in zip(files, docs):
+    for f, d in zip(files, docs, strict=True):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/mcp/tools/structural.py` at line 133, The zip iteration over files and
docs should be made explicit about length expectations to prevent silent
mismatches: update the loop using zip(files, docs) (the for loop that iterates
variables f and d) to call zip with strict=True so mismatched lengths raise an
error; locate the loop in structural.py where files and docs are paired and add
the strict=True argument to the zip() call.
Source: Linters/SAST tools

benchmarks/mcp_context_benchmark.py (3)

30-32: ⚡ Quick win

Move logging configuration after imports to avoid masking import-time warnings.

Calling logging.disable(logging.CRITICAL) before importing api.mcp.tools.structural will suppress any warnings or errors that might occur during module initialization. This makes debugging harder if the MCP tools fail to import correctly.

♻️ Suggested refactor

 from __future__ import annotations
 
 import argparse
 import asyncio
 import json
 import logging
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any
 
-logging.disable(logging.CRITICAL)
-
-from api.mcp.tools.structural import find_symbol, get_file_neighbors, get_neighbors  # noqa: E402
+from api.mcp.tools.structural import find_symbol, get_file_neighbors, get_neighbors

Then in main(), add logging.disable(logging.CRITICAL) at the beginning:

def main() -> None:
    logging.disable(logging.CRITICAL)
    parser = argparse.ArgumentParser(description=__doc__)
    # ... rest of main

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/mcp_context_benchmark.py` around lines 30 - 32, Move the
logging.disable(logging.CRITICAL) call so it does not run before imports: remove
the top-level logging.disable from the module and instead call
logging.disable(logging.CRITICAL) at the start of main() (the function that
parses args and runs the benchmark). This ensures imports like
api.mcp.tools.structural (and functions find_symbol, get_file_neighbors,
get_neighbors) can emit import-time warnings/errors while still silencing logs
during actual benchmark runtime.

109-109: 💤 Low value

Consider defensive dictionary access for symbol_id.

The code assumes symbols[0] contains a "symbol_id" key. While the internal MCP tool contract likely guarantees this, defensive access with .get() or a try/except block would make the benchmark more robust against API changes.

🛡️ Optional defensive access

-    symbol_id = symbols[0]["symbol_id"]
+    symbol_id = symbols[0].get("symbol_id")
+    if not symbol_id:
+        return payload

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/mcp_context_benchmark.py` at line 109, The line setting symbol_id
from symbols[0]["symbol_id"] should be made defensive: verify symbols is a
non-empty list and use dict.get or a try/except to read "symbol_id" from
symbols[0] (e.g., check if symbols and symbols[0].get("symbol_id") is not None),
and if missing either raise a clear ValueError or handle the case (skip/abort)
so symbol_id is never accessed from a missing key; update the symbol_id
assignment and any downstream uses to rely on this validated value (symbols,
symbol_id).

143-185: ⚖️ Poor tradeoff

Consider adding error handling for individual question failures.

The benchmark will stop and print no output if any graph_context() or read_plain_context() call raises an exception. Wrapping the per-question logic in a try/except block would allow the benchmark to continue and report partial results, improving the user experience when some targets fail.

Optional error handling

     for question in QUESTIONS:
-        if question.target not in graph_contexts:
-            graph_contexts[question.target] = compact_json(
-                await graph_context(question, project, branch)
-            )
-        graph_text = graph_contexts[question.target]
-        plain_text = read_plain_context(root, question.target.file)
-        graph_tokens = estimate_tokens(graph_text)
-        plain_tokens = estimate_tokens(plain_text)
-        graph_total += graph_tokens
-        plain_total += plain_tokens
-        rows.append(
-            {
-                "question": question.text,
-                "symbol": question.target.symbol,
-                "file": question.target.file,
-                "graph_tokens": graph_tokens,
-                "plain_tokens": plain_tokens,
-                "delta_tokens": plain_tokens - graph_tokens,
-                "reduction_pct": round((1 - graph_tokens / plain_tokens) * 100, 1)
-                if plain_tokens
-                else 0,
-            }
-        )
+        try:
+            if question.target not in graph_contexts:
+                graph_contexts[question.target] = compact_json(
+                    await graph_context(question, project, branch)
+                )
+            graph_text = graph_contexts[question.target]
+            plain_text = read_plain_context(root, question.target.file)
+            graph_tokens = estimate_tokens(graph_text)
+            plain_tokens = estimate_tokens(plain_text)
+            graph_total += graph_tokens
+            plain_total += plain_tokens
+            rows.append(
+                {
+                    "question": question.text,
+                    "symbol": question.target.symbol,
+                    "file": question.target.file,
+                    "graph_tokens": graph_tokens,
+                    "plain_tokens": plain_tokens,
+                    "delta_tokens": plain_tokens - graph_tokens,
+                    "reduction_pct": round((1 - graph_tokens / plain_tokens) * 100, 1)
+                    if plain_tokens
+                    else 0,
+                }
+            )
+        except Exception as e:
+            print(f"Warning: Skipped question due to error: {e}", file=sys.stderr)
+            continue

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/mcp_context_benchmark.py` around lines 143 - 185, The per-question
loop in run() can abort the whole benchmark if graph_context() or
read_plain_context() raises; wrap the body that computes graph_text/plain_text,
token estimates and rows.append in a try/except around each question (using
question from QUESTIONS and the graph_contexts cache) and on exception log or
record the error (e.g., include an entry in rows with question/target and an
error field or a skipped flag) then continue; ensure graph_total/plain_total and
question counts used in the summary reflect only successfully processed items or
include skipped counts so the final summary still prints partial results even if
some targets fail.

api/db.py (1)

82-85: Downgrade teardown risk: close() is async in the pinned lite dependency, but keep a safe fallback wrapper anyway.

The current create_async_falkordb() fallback client.aclose = client.close runs only when aclose is missing; for the repo’s light extra (falkordblite>=0.10.0,<1.0.0), upstream AsyncFalkorDB.close()/aclose() are both async coroutines, so await db.aclose() should work as-is. Still, the fallback is brittle across versions—wrap close() to await any returned awaitable (as in the proposed diff) to make lite teardown resilient if a particular build omits aclose or implements close() differently.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/db.py` around lines 82 - 85, In create_async_falkordb(), replace the
brittle direct assignment client.aclose = client.close with a wrapper that makes
client.aclose an async function which calls client.close() and awaits the result
if it's awaitable; this ensures await db.aclose() works whether close() is sync,
returns an awaitable, or is already async—locate the create_async_falkordb
function and update the fallback logic where client.aclose is set.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/cli.py`:
- Line 71: The broad catch `except Exception as e:` in the CLI startup should
either be narrowed to the specific exceptions you expect (e.g., SystemExit,
KeyboardInterrupt, ValueError, or startup-related errors) inside the CLI
entrypoint (where `except Exception as e:` is used) or, if you intentionally
need a catch-all to guarantee JSON output, annotate that line with `# noqa:
BLE001` and add a one-sentence rationale comment explaining why the broad catch
is required (e.g., "Ensure any startup error is serialized to JSON for the CLI
contract"). Update the handler to re-raise critical exceptions when appropriate
or convert them to the JSON error response and ensure the comment references the
decision.

In `@benchmarks/mcp_context_benchmark.py`:
- Around line 90-94: The function read_plain_context currently may call
path.read_text() on a non-existent file and has redundant path logic; update
read_plain_context to (1) resolve path deterministically: choose root.parent /
file_hint if file_hint startswith f"{root.name}/" else root / file_hint, (2) if
that path does not exist try fallback root.parent / file_hint only if it
differs, and (3) if the final path still does not exist handle the error by
raising a clear FileNotFoundError (or return a controlled empty string) that
includes the resolved path, root, and file_hint so callers can handle the
missing file gracefully; refer to read_plain_context, path, root, and file_hint
when making changes.

---

Outside diff comments:
In `@api/git_utils/git_graph.py`:
- Around line 197-198: AsyncGitGraph.close() was removed/changed but callers in
api/index.py and tests/index.py still call finally: await git_graph.close();
restore compatibility by re-adding an async close(self) -> None method on class
AsyncGitGraph that simply awaits the underlying resource shutdown (e.g., await
self.db.aclose()) or by providing an equivalent async wrapper method with the
same name so existing call sites continue to work; ensure the method signature
remains async def close(self) -> None and references self.db.aclose() to
properly close resources.

---

Nitpick comments:
In `@api/db.py`:
- Around line 82-85: In create_async_falkordb(), replace the brittle direct
assignment client.aclose = client.close with a wrapper that makes client.aclose
an async function which calls client.close() and awaits the result if it's
awaitable; this ensures await db.aclose() works whether close() is sync, returns
an awaitable, or is already async—locate the create_async_falkordb function and
update the fallback logic where client.aclose is set.

In `@api/mcp/tools/structural.py`:
- Line 133: The zip iteration over files and docs should be made explicit about
length expectations to prevent silent mismatches: update the loop using
zip(files, docs) (the for loop that iterates variables f and d) to call zip with
strict=True so mismatched lengths raise an error; locate the loop in
structural.py where files and docs are paired and add the strict=True argument
to the zip() call.

In `@benchmarks/mcp_context_benchmark.py`:
- Around line 30-32: Move the logging.disable(logging.CRITICAL) call so it does
not run before imports: remove the top-level logging.disable from the module and
instead call logging.disable(logging.CRITICAL) at the start of main() (the
function that parses args and runs the benchmark). This ensures imports like
api.mcp.tools.structural (and functions find_symbol, get_file_neighbors,
get_neighbors) can emit import-time warnings/errors while still silencing logs
during actual benchmark runtime.
- Line 109: The line setting symbol_id from symbols[0]["symbol_id"] should be
made defensive: verify symbols is a non-empty list and use dict.get or a
try/except to read "symbol_id" from symbols[0] (e.g., check if symbols and
symbols[0].get("symbol_id") is not None), and if missing either raise a clear
ValueError or handle the case (skip/abort) so symbol_id is never accessed from a
missing key; update the symbol_id assignment and any downstream uses to rely on
this validated value (symbols, symbol_id).
- Around line 143-185: The per-question loop in run() can abort the whole
benchmark if graph_context() or read_plain_context() raises; wrap the body that
computes graph_text/plain_text, token estimates and rows.append in a try/except
around each question (using question from QUESTIONS and the graph_contexts
cache) and on exception log or record the error (e.g., include an entry in rows
with question/target and an error field or a skipped flag) then continue; ensure
graph_total/plain_total and question counts used in the summary reflect only
successfully processed items or include skipped counts so the final summary
still prints partial results even if some targets fail.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f236285a-56e6-440c-b729-9439d51eb095

📥 Commits

Reviewing files that changed from the base of the PR and between 126c672 and 8ec9744.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (16)

.env.template
README.md
api/cli.py
api/db.py
api/git_utils/git_graph.py
api/graph.py
api/info.py
api/llm.py
api/mcp/tools/__init__.py
api/mcp/tools/structural.py
api/migrations/per_branch.py
benchmarks/mcp_context_benchmark.py
pyproject.toml
tests/mcp/test_find_symbol.py
tests/mcp/test_impact_analysis.py
tests/mcp/test_query_tools.py

coderabbitai · 2026-06-09T05:55:42Z

+        try:
+            db = create_falkordb()
+            db.connection.ping()
+        except Exception as e:


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle BLE001 explicitly for the intentional catch-all path.

If this broad catch is intentional to guarantee JSON output, annotate it (# noqa: BLE001) with a short rationale; otherwise narrow to expected startup exceptions.

As per coding guidelines, "Run Ruff linter for Python code to check style and quality."

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 71-71: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/cli.py` at line 71, The broad catch `except Exception as e:` in the CLI startup should either be narrowed to the specific exceptions you expect (e.g., SystemExit, KeyboardInterrupt, ValueError, or startup-related errors) inside the CLI entrypoint (where `except Exception as e:` is used) or, if you intentionally need a catch-all to guarantee JSON output, annotate that line with `# noqa: BLE001` and add a one-sentence rationale comment explaining why the broad catch is required (e.g., "Ensure any startup error is serialized to JSON for the CLI contract"). Update the handler to re-raise critical exceptions when appropriate or convert them to the JSON error response and ensure the comment references the decision.

Sources: Coding guidelines, Linters/SAST tools

coderabbitai · 2026-06-09T05:55:42Z

+def read_plain_context(root: Path, file_hint: str) -> str:
+    path = root.parent / file_hint if file_hint.startswith(f"{root.name}/") else root / file_hint
+    if not path.exists():
+        path = root.parent / file_hint
+    return path.read_text(errors="replace")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add error handling for missing files and simplify path resolution logic.

If the final path does not exist, path.read_text() will raise an unhandled FileNotFoundError, causing the benchmark to crash. Additionally, the path resolution logic with the fallback on lines 92-93 is complex and difficult to reason about—the fallback root.parent / file_hint might produce the same path as the first branch of the conditional.

🛡️ Proposed fix with error handling and simplified logic

def read_plain_context(root: Path, file_hint: str) -> str: - path = root.parent / file_hint if file_hint.startswith(f"{root.name}/") else root / file_hint - if not path.exists(): - path = root.parent / file_hint - return path.read_text(errors="replace") + """Read plain file content, trying multiple path resolutions.""" + candidates = [ + root.parent / file_hint, + root / file_hint, + ] + for path in candidates: + if path.exists(): + return path.read_text(errors="replace") + # Return empty string or raise a clearer error + return f"# File not found: {file_hint}\n"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmarks/mcp_context_benchmark.py` around lines 90 - 94, The function read_plain_context currently may call path.read_text() on a non-existent file and has redundant path logic; update read_plain_context to (1) resolve path deterministically: choose root.parent / file_hint if file_hint startswith f"{root.name}/" else root / file_hint, (2) if that path does not exist try fallback root.parent / file_hint only if it differs, and (3) if the final path still does not exist handle the error by raising a clear FileNotFoundError (or return a controlled empty string) that includes the resolved path, root, and file_hint so callers can handle the missing file gracefully; refer to read_plain_context, path, root, and file_hint when making changes.

Copilot

Pull request overview

This PR introduces an optional embedded FalkorDBLite backend for lightweight deployments, and significantly expands/reshapes the MCP “structural” navigation surface (file-oriented search, symbol resolution, unified neighbor traversal, and file-level coupling), with updated docs, tests, and a benchmarking script.

Changes:

Add optional light extra (falkordblite) and centralize DB/Redis connection creation behind a new backend-selection module.
Refactor/extend MCP structural tools: search_code becomes free-text + hybrid-ranked and file-oriented; add find_symbol, get_neighbors, and get_file_neighbors; add path relativization + snippets.
Update README/.env template plus expand MCP test coverage and add a deterministic context-size benchmark.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`uv.lock`	Locks the new optional `falkordblite` dependency and its transitive deps.
`pyproject.toml`	Adds `light` optional extra for FalkorDBLite.
`api/db.py`	New backend-selection + connection factory layer (FalkorDB vs FalkorDBLite, sync/async, Redis compat, GraphRAG host/port).
`api/graph.py`	Switches graph connections to centralized factories; minor cleanup/rename.
`api/info.py`	Routes Redis metadata operations through centralized connection factories.
`api/llm.py`	Uses centralized GraphRAG connection kwargs (supports Lite with TCP exposure requirement).
`api/git_utils/git_graph.py`	Switches git-graph DB connections to centralized factories.
`api/migrations/per_branch.py`	Centralizes migration DB connection creation via `create_falkordb()`.
`api/cli.py`	Adds Lite-aware `ensure-db` behavior (initializes embedded backend).
`api/mcp/tools/structural.py`	Major MCP tool changes: hybrid free-text file ranking, new tools, snippets, relativized paths, unified neighbors, file-level coupling.
`api/mcp/tools/__init__.py`	Clarifies MCP surface intentionally excludes GraphRAG-backed `ask`.
`tests/mcp/test_query_tools.py`	Updates tests for new `search_code` contract; adds coverage for `_relativize`, `get_neighbors`, `get_file_neighbors`.
`tests/mcp/test_impact_analysis.py`	Updates symbol-id resolution to graph query since `search_code` is file-oriented.
`tests/mcp/test_find_symbol.py`	New test suite for `find_symbol` tool behavior/ordering/validation.
`benchmarks/mcp_context_benchmark.py`	New deterministic benchmark comparing “graph context” vs “plain file context”.
`README.md`	Documents Lite backend setup and new env vars.
`.env.template`	Adds Lite backend configuration variables and comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    rels = [relation] if isinstance(relation, str) else list(relation)
+    direction = (direction or "OUT").upper()
+    if direction == "BOTH":
+        dirs = ["OUT", "IN"]
+    elif direction in ("IN", "OUT"):
+        dirs = [direction]
+    else:
+        raise ValueError(
+            f"direction must be 'IN', 'OUT', or 'BOTH', got: {direction!r}"
+        )
+
+    seen: set[Any] = set()
+    out: list[dict[str, Any]] = []
+    for d in dirs:
+        for rel in rels:
+            rows = await _neighbors_payload(project, branch, symbol_id, rel, d, limit)
+            for row in rows:
+                key = (row.get("id"), row.get("relation"), row.get("direction"))
+                if key in seen:
+                    continue
+                seen.add(key)
+                out.append(row)
+                if len(out) >= limit:
+                    return out
+    return out


+    files_res = await g._query("MATCH (f:File) RETURN f.path, ID(f)")
+    for row in files_res.result_set:
+        ap = row[0]
+        if not ap:
+            continue
+        rp = rel(ap)
+        abs_of[rp] = ap
+        file_id_of[rp] = row[1]
+        pt = _tokenize(rp.replace("/", " "))
+        pathtok[rp] = pt
+        bodytok[rp].extend(pt)
+
+    sym_res = await g._query(
+        "MATCH (n) WHERE n:Function OR n:Class "
+        "RETURN n.name, n.path, n.doc, n.src_start, n.src_end"


+def create_async_redis_connection() -> Any:
+    """Create an async Redis-compatible connection for metadata operations."""
+    if is_lite_backend():
+        return create_async_falkordb().connection
+
+    import redis.asyncio as aioredis


DvirDukhan and others added 8 commits June 8, 2026 17:50

Add FalkorDBLite backend support

eefaa40

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Document FalkorDBLite configuration

84ee21a

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add MCP context benchmark

8ec9744

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

DvirDukhan merged commit c9d6948 into staging Jun 9, 2026
14 of 16 checks passed

DvirDukhan deleted the naseem/falkordblite-mcp-latest branch June 9, 2026 09:50

DvirDukhan restored the naseem/falkordblite-mcp-latest branch June 9, 2026 12:41

DvirDukhan requested a review from Copilot June 9, 2026 12:42

Copilot AI reviewed Jun 9, 2026

View reviewed changes

DvirDukhan mentioned this pull request Jun 9, 2026

Revert FalkorDBLite backend + MCP context benchmark (accidental #706 merge) #707

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

falkordblite mcp latest#706

falkordblite mcp latest#706
DvirDukhan merged 8 commits into
stagingfrom
naseem/falkordblite-mcp-latest

Naseem77 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 9, 2026

Uh oh!

coderabbitai Bot Jun 9, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Naseem77 commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Naseem77 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading