feat(mcp): index_repo tool (T4 #652) #678
First real MCP tool. Wraps the existing Project / SourceAnalyzer
pipeline so AI agents can call `index_repo(path_or_url, branch)` over
stdio to populate code-graph for a repo.
- `api/mcp/tools/structural.py` (NEW) — registers `index_repo` on
the shared FastMCP app. Accepts local paths or git URLs;
auto-detects branch from local git checkouts via T17's
`detect_branch`; honors `ALLOWED_ANALYSIS_DIR` for sandboxing.
Non-git folders are handled by driving SourceAnalyzer directly
(Project requires a git repo).
- `api/mcp/tools/__init__.py` (NEW) — package marker; importing it
registers every tool module's `@app.tool()` decorators.
- `api/mcp/server.py` — imports tools at module load so both direct
`from api.mcp.server import app` and `cgraph-mcp` stdio entry
point see the same tool list.
- `tests/mcp/test_index_repo.py` (NEW) — 5 tests: local-path happy
path, missing-path error, ALLOWED_ANALYSIS_DIR sandboxing,
in-process app registration, JSON serialisability.
- `tests/mcp/test_scaffold.py` — replaced the "zero tools"
assertion with a presence check for `index_repo` so it stays
stable as T5-T8 / T11 add more tools.
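The `ALLOWED_ANALYSIS_DIR` sandboxing mentioned above could look roughly like the sketch below. It is not the code from `structural.py`: the helper name `resolve_sandboxed` and its `allowed_root` parameter are hypothetical, and the only detail taken from this PR is that the environment variable bounds which local paths may be indexed.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_sandboxed(path: str, allowed_root: Optional[str] = None) -> Path:
    """Resolve `path` and refuse it if it escapes the sandbox root.

    Hypothetical helper: the name and signature are illustrative only.
    """
    target = Path(path).resolve()
    # Assumption: the sandbox root is read from ALLOWED_ANALYSIS_DIR when
    # no explicit root is passed in.
    root_value = allowed_root or os.environ.get("ALLOWED_ANALYSIS_DIR")
    if not root_value:
        return target  # no sandbox configured, accept any path
    root = Path(root_value).resolve()
    # Comparing resolved paths stops "../" segments and symlinks from
    # escaping the root.
    if target != root and root not in target.parents:
        raise ValueError(f"{target} is outside ALLOWED_ANALYSIS_DIR ({root})")
    return target
```

Resolving both sides before comparing is the important part; a plain string-prefix check would accept `/srv/repos-evil` for a root of `/srv/repos`.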
Return shape:
{project_name, branch, graph_name, num_nodes, num_edges,
languages_detected, mode}
`incremental` parameter is accepted now and forwarded once T18
lands; the current full-reindex path ignores it and always returns
`mode="full"`.
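For readers consuming the tool, the return shape above can be written down as a typed payload. This is only a sketch: the key names come from the PR description, while the field types, the `IndexRepoResult` class, and the `build_result` helper are assumptions made for illustration.

```python
from typing import List, TypedDict


class IndexRepoResult(TypedDict):
    # Key names are from the PR description; the types are assumed.
    project_name: str
    branch: str
    graph_name: str
    num_nodes: int
    num_edges: int
    languages_detected: List[str]
    mode: str


def build_result(
    project_name: str,
    branch: str,
    graph_name: str,
    num_nodes: int,
    num_edges: int,
    languages: List[str],
    incremental: bool = False,
) -> IndexRepoResult:
    """Hypothetical builder mirroring the documented return shape."""
    # `incremental` is accepted but ignored: per the PR, the full-reindex
    # path always reports mode="full" until T18 lands.
    del incremental
    return {
        "project_name": project_name,
        "branch": branch,
        "graph_name": graph_name,
        "num_nodes": num_nodes,
        "num_edges": num_edges,
        "languages_detected": list(languages),
        "mode": "full",
    }
```

Because every value is a plain string, integer, or list, the payload passes through `json.dumps` unchanged, which is the property the JSON-serialisability test checks.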
All 8 tests pass against FalkorDB on 6390.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
📝 Walkthrough
Changes: Index Repo MCP Tool Implementation and Testing
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pull request overview
Adds the first real MCP tool (`index_repo`), which lets MCP clients index a repository or directory into FalkorDB so that subsequent tools can query a branch-scoped graph, and wires tool registration into MCP server startup.
Changes:
- Introduces `api/mcp/tools/structural.py` with an `index_repo(path_or_url, branch, incremental, ignore)` FastMCP tool and supporting helpers.
- Registers MCP tools on server import (`api/mcp/server.py`) via the new `api/mcp/tools` package.
- Updates MCP scaffold smoke test expectations and adds initial `index_repo` tests.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `api/mcp/tools/structural.py` | Implements and registers the `index_repo` MCP tool and response payload helpers. |
| `api/mcp/tools/__init__.py` | Imports tool modules to trigger registration on import. |
| `api/mcp/server.py` | Imports the tools package after app creation so stdio startup exposes registered tools. |
| `tests/mcp/test_scaffold.py` | Updates scaffold handshake test to expect at least `index_repo` in `list_tools()`. |
| `tests/mcp/test_index_repo.py` | Adds tests for `index_repo` shape/errors/registration/JSON-serializability. |
- num_nodes: count nodes generically (MATCH (n)) instead of summing only File+Class+Function, so non-Python repos (Method/Interface/Enum/Constructor nodes) aren't under-reported; symmetric with _count_edges (galshubeli, Copilot).
- Reject non-http(s) git URLs (git@.../ssh://...) with an actionable error instead of letting Project.from_git_repository raise a confusing "invalid url" from validators.url(); align the docstring/description to advertise http(s) URLs only. Silently rewriting to https would change auth semantics for private repos (Copilot).
- Local git checkout with no configured remote: fall back to Project(url=None) instead of crashing on remotes[0] IndexError, so remote-less repos can still be indexed (galshubeli, Copilot).
- Document that `incremental` is accepted but not yet honored (full reindex always); mode is always "full" until T18 (Copilot).
- Skip DB-backed index_repo tests when FalkorDB is unreachable via a new require_falkordb fixture, keeping the unit subset runnable in dev/CI without the service (Copilot).
- Add tests for the SSH-URL rejection and the no-remote git path (the .git branch previously had no coverage).
- Fix a pre-existing assertion referencing expected_contract["counts_min"] (the fixture key is "counts") so the local-path test passes when FalkorDB is reachable.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
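The non-http(s) URL rejection described in this commit can be illustrated with a small scheme check. This is a sketch under assumptions, not the merged code: `require_http_git_url` is a hypothetical name, and it presumes the caller has already decided the input is a remote URL rather than a local path.

```python
from urllib.parse import urlparse


def require_http_git_url(url: str) -> str:
    """Return `url` unchanged if it is http(s); otherwise raise.

    Hypothetical helper; the name and error text are illustrative.
    """
    # scp-style remotes such as "git@host:org/repo.git" parse with an
    # empty scheme, and "ssh://..." parses as "ssh"; both are rejected.
    scheme = urlparse(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(
            f"Unsupported git URL {url!r}: only http(s) URLs are accepted. "
            "Clone the repository locally and pass its path instead."
        )
    return url
```

Rejecting instead of rewriting to `https://` follows the reasoning in the commit message: a silent rewrite would change how authentication works for private repositories.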
🧹 Nitpick comments (1)
api/mcp/tools/structural.py (1)
50-54: 💤 Low value. Broad exception handling is acceptable for metadata helpers.
Ruff flags the generic `Exception` catches (BLE001) at lines 53, 67, and 82. These are in best-effort metadata collection helpers that return safe defaults (`0` or `[]`) on failure, ensuring a query error doesn't crash the tool. More specific exception types (e.g., FalkorDB-specific errors) would be marginally better but require dependency on the graph client's exception hierarchy. Given the defensive intent and low risk, the current broad catches are acceptable.
Also applies to: 63-69, 78-83
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@api/mcp/tools/structural.py` around lines 50 - 54: The generic Exception catches around the metadata helper queries (the try/except blocks that call graph.g.query and return safe defaults like int(rows[0][0]) or []) are intentionally broad; update those except clauses by adding a short comment explaining the defensive intent and suppressing the Ruff BLE001 warning for these specific handlers (e.g., add an inline noqa or pragma referencing BLE001) so tooling stops flagging them while keeping the existing safe-default behavior for functions that query the graph (the blocks using graph.g.query and returning 0 or []).
Source: Linters/SAST tools
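The suggested suppression might look like the sketch below. The helper name `count_nodes` is hypothetical, and reading rows from a `result_set` attribute on the query result is an assumption about the graph client; only `graph.g.query`, `int(rows[0][0])`, and the safe default of `0` come from the review comment.

```python
def count_nodes(graph) -> int:
    """Best-effort node count; returns 0 if the query fails."""
    try:
        # Assumption: the query result exposes its rows as `result_set`.
        rows = graph.g.query("MATCH (n) RETURN count(n)").result_set
        return int(rows[0][0])
    except Exception:  # noqa: BLE001 - metadata is best-effort, never crash the tool
        return 0
```

The inline `noqa` keeps the broad catch local to this handler, so Ruff still reports BLE001 anywhere else in the module.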
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dbafd9e0-6d40-429a-bb27-064808c2b0fd
📒 Files selected for processing (6)
- `api/mcp/server.py`
- `api/mcp/tools/__init__.py`
- `api/mcp/tools/structural.py`
- `tests/mcp/conftest.py`
- `tests/mcp/test_index_repo.py`
- `tests/mcp/test_scaffold.py`
Prerequisites (merge order)
None. Branches directly off `staging`; can merge independently.
Closes #652.
Stacked on #677 (T3). First real MCP tool: agents can call `index_repo(path_or_url, branch)` over stdio.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary by CodeRabbit
New Features
`index_repo` tool for indexing code repositories from local paths or HTTP(S) git URLs
Tests