Skip to content

feat(AGX1-263): agent_api_keys route migration to FGAC (404 collapse + two-factor)#252

Draft
dm36 wants to merge 1 commit into
dhruv/agx1-272-agent-api-keys-dual-writefrom
dhruv/agx1-263-agent-api-keys-route-migration
Draft

feat(AGX1-263): agent_api_keys route migration to FGAC (404 collapse + two-factor)#252
dm36 wants to merge 1 commit into
dhruv/agx1-272-agent-api-keys-dual-writefrom
dhruv/agx1-263-agent-api-keys-route-migration

Conversation

@dm36
Copy link
Copy Markdown

@dm36 dm36 commented May 26, 2026

Related work

PR B of the agent_api_keys FGAC stack. Stacks on #248 (dual-write PR A). Mirrors the AGX1-275 task pattern set by Asher in #249 (route-layer rewire + 404 collapse) — same helper shape, same dep-factory hook, same test layout.

Cross-repo stack

# Repo PR Purpose State
1 scaleapi/scaleapi #144657 sgp-authz 0.6.0 → 0.7.0; parent_resource kwarg on register_resource ✅ Merged
2 scaleapi/agentex #354 agentex-auth Port + Spark adapter + HTTP routes + api_key mapping ✅ Merged
3 scaleapi/scale-agentex #248 dual-write api_keys to spark-authz (merge target of this PR) Ready
4 (this PR) scaleapi/scale-agentex this route-layer FGAC + 404 collapse + two-factor mutations Ready

Parent ticket: AGX1-263.

Summary

  • Wires Spark AuthZ checks into all six agent_api_keys routes.
  • Collapses every api_key-resource denial into 404 (ItemDoesNotExist) on the id and name surfaces so callers can't distinguish "present in another tenant" from "absent".
  • Filters GET /agent_api_keys to the set of api_keys the caller can read.
  • Enforces parent-agent.update explicitly on POST /agent_api_keys (no api_key resource exists yet — only enforcement surface at create time).
  • Two-factor mutations on delete: relies on SpiceDB's transitive expansion (api_key.delete = editor & parent_agent->update & tenant_gate), not a second route-layer check — same approach as feat(AGX1-275): per-RPC task permission rewire and 404/403 wrap #249. The parent_agent relation is now populated by feat(AGX1-272): dual-write agent_api_keys to spark-authz behind FGAC flag #248's register_resource(parent=agent) so the cascade works end-to-end.

What changed

  • src/utils/agent_api_key_authorization.py (new): _check_api_key_or_collapse_to_404(authorization, api_key_id, operation) — catches AuthorizationError, raises ItemDoesNotExist. Direct parallel of _check_task_or_collapse_to_404.
  • src/utils/authorization_shortcuts.py: DAuthorizedId now routes AgentexResourceType.api_key through the collapse wrap (gated branch alongside the existing task-child and direct-resource branches).
  • src/api/routes/agent_api_keys.py: per-route wiring. The name routes call the collapse helper inline because the lookup key is (agent_id, name, api_key_type) — not a single globally-unique name path param — so DAuthorizedName doesn't fit.

Why structural choices

Why no DAuthorizedName extension. Asher extended DAuthorizedName for tasks because tasks.name is a single globally-unique path component. agent_api_keys names are scoped — the unique key is (agent_id, name, api_key_type) and the path is /agent_api_keys/name/{name}?agent_id=.... Stuffing this into DAuthorizedName's name -> repository.get(name=...) shape would have required changing its signature in a way that doesn't generalize. Inline collapse helper at the route handler is cleaner and matches the cross-resource composite lookup.

Why no explicit two-factor at delete. Per the SpiceDB schema, api_key.delete = internal_effective_editor & parent_agent->update & internal_tenant_gate. Issuing a second authorization_service.check(agent.update, agent_id) at the route layer would only duplicate what spark-authz already enforces atomically. Mirrors #249's approach. The parent_agent relation is populated by #248's use case calling register_resource(parent=AgentexResource.agent(agent_id)) — that's what closes the cascade end-to-end.

Why explicit parent check at create. Create is the one place where no api_key resource exists yet, so SpiceDB cannot gate transitively. The route MUST issue check(agent.update, parent_agent_id) directly. This is the only explicit two-factor check in the file.

Tests

  • tests/unit/api/test_agent_api_keys_authz.py — 12/12 pass.
    • 3 TestCheckApiKeyOrCollapseTo404 (allow + denied-collapse + delete-op forwarding).
    • 3 TestDAuthorizedIdApiKeyWrap (denial → 404, allow returns id, delete op propagation).
    • 2 TestNameRouteCollapse (both name handlers; verifies delete is NOT invoked when check fails).
    • 2 TestListFiltering (authorized-subset filter + None-passthrough).
    • 2 TestCreateParentAgentCheck (explicit parent-agent.update + create-not-invoked on denial).

Test plan

  • uv run pytest agentex/tests/unit/api/test_agent_api_keys_authz.py — 12 passed.
  • Combined with feat(AGX1-272): dual-write agent_api_keys to spark-authz behind FGAC flag #248's dual-write tests: 20/20 pass on this branch.
  • Ruff + ruff-format clean (pre-commit hook).
  • CI for broader unit + integration suite.
  • Manual: deny an existing api_key by id/name across each surface in a dev cluster; confirm 404 on every surface.

Out of scope

  • agentex-auth spark_mapping.py updates — handled in #354 (merged).
  • Restoring the 403/404 split for same-tenant calls once api_keys carry tenant scope (AGX1-290).

Greptile Summary

This PR wires Spark AuthZ (SpiceDB) checks into all six agent_api_keys routes, mirroring the task FGAC pattern from PR #249. Denials on id- and name-based lookups are collapsed to 404 to prevent cross-tenant existence probing, and list results are filtered to the set the caller can read.

  • agent_api_key_authorization.py (new): _check_api_key_or_collapse_to_404 wraps SpiceDB checks and raises ItemDoesNotExist on any denial, used by both the inline name-route handlers and the DAuthorizedId branch in authorization_shortcuts.py.
  • agent_api_keys.py: Parent agent.update check at create time (no resource row yet); DAuthorizedId(api_key, delete) on id-based delete; inline collapse helper on both name routes; DAuthorizedResourceIds enumeration with None-passthrough on list.
  • test_agent_api_keys_authz.py (new): 12 unit tests covering the helper, DAuthorizedId wrap, name-route collapse, list filtering, and create-time parent check.

Confidence Score: 3/5

The route-layer wiring is structurally sound, but the name routes introduce a new distinguishable 404 response body that reveals api_key existence to any caller who can resolve an agent_id — the opposite of the privacy invariant the PR is trying to establish.

The id-route collapse and list filtering work correctly. The create-time parent check is the right approach. However, both name-route handlers now expose two different 404 body formats depending on whether the key exists in the database, meaning cross-tenant existence probing via the name surface is still possible after this PR. This is a new gap introduced here, not pre-existing, and it contradicts the core security property the PR is designed to enforce.

agentex/src/api/routes/agent_api_keys.py — specifically the pre-auth 404 raises in get_agent_api_key_by_name and delete_agent_api_key_by_name

Security Review

  • Information disclosure via 404 body mismatch — agent_api_keys.py name routes: The "key absent in DB" path raises HTTPException with \"Agent api_key '{name}' not found for agent ID {agent.id}\", while the "key present but denied" path raises ItemDoesNotExist with \"Item with id '{id}' does not exist.\". Both return HTTP 404, but the differing response bodies allow a caller who can resolve an agent_id to probe for api_key name existence, defeating the stated existence-leak collapse on get_agent_api_key_by_name and delete_agent_api_key_by_name.

Important Files Changed

Filename Overview
agentex/src/api/routes/agent_api_keys.py Added FGAC enforcement across all six routes; name-route collapse is logically incomplete — different 404 detail strings still reveal key existence to cross-tenant callers; also contains a minor TOCTOU in delete-by-name.
agentex/src/utils/agent_api_key_authorization.py New helper that wraps SpiceDB checks and collapses AuthorizationError to ItemDoesNotExist (404); logic is correct, but the authorization parameter lacks a type annotation.
agentex/src/utils/authorization_shortcuts.py Added api_key branch to DAuthorizedId that routes through the collapse helper; change is narrow, well-placed, and consistent with the existing task/direct-resource branching pattern.
agentex/tests/unit/api/test_agent_api_keys_authz.py 12 new unit tests covering the helper, DAuthorizedId wrap, name-route collapse, list filtering, and create-time parent check; structure mirrors the task authz test suite well.

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant R as Route Handler
    participant AU as agent_use_case
    participant AK as agent_api_key_use_case
    participant SDB as SpiceDB (authorization_service)

    Note over R: POST /agent_api_keys (create)
    C->>R: POST with agent_id/agent_name
    R->>AU: get(id/name)
    AU-->>R: agent entity
    R->>SDB: check(agent.update, agent_id)
    alt denied
        SDB-->>R: AuthorizationError
        R-->>C: 403
    else allowed
        SDB-->>R: OK
        R->>AK: create(...)
        AK-->>R: api_key entity
        R-->>C: 200 CreateAPIKeyResponse
    end

    Note over R: GET /agent_api_keys/name/{name}
    C->>R: GET with name + agent_id
    R->>AU: get(id/name)
    AU-->>R: agent entity
    R->>AK: get_by_agent_id_and_name(...)
    alt not found in DB
        AK-->>R: None
        R-->>C: 404 message-A
    else found
        AK-->>R: entity
        R->>SDB: check(api_key.read, entity.id)
        alt denied
            SDB-->>R: AuthorizationError to ItemDoesNotExist
            R-->>C: 404 message-B (different body)
        else allowed
            R-->>C: 200 AgentAPIKey
        end
    end

    Note over R: DELETE /agent_api_keys/{id}
    C->>R: DELETE with id
    R->>SDB: check(api_key.delete, id) via DAuthorizedId
    Note over SDB: SpiceDB transitively checks parent_agent.update
    alt denied
        SDB-->>R: AuthorizationError to ItemDoesNotExist
        R-->>C: 404
    else allowed
        R->>AK: delete(id)
        R-->>C: 200 deleted
    end
Loading

Comments Outside Diff (2)

  1. agentex/src/api/routes/agent_api_keys.py, line 163-179 (link)

    P1 security Distinguishable 404 bodies defeat the existence-leak collapse

    The name routes produce two structurally different 404 responses depending on whether the key exists in the database. If the key is absent, HTTPException is raised with "Agent api_key '{name}' not found for agent ID {agent.id}". If the key exists but the caller is denied, ItemDoesNotExist is raised with "Item with id '{api_key_id}' does not exist.". Both return HTTP 404, but a cross-tenant caller who can resolve an agent_id can probe for key-name existence by comparing the response body — exactly the information leak the collapse is meant to prevent. The same issue exists in delete_agent_api_key_by_name (line ~264). To close it, both "not found" paths should produce an identical detail string (e.g. both raise ItemDoesNotExist with the same generic message).

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: agentex/src/api/routes/agent_api_keys.py
    Line: 163-179
    
    Comment:
    **Distinguishable 404 bodies defeat the existence-leak collapse**
    
    The name routes produce two structurally different 404 responses depending on whether the key exists in the database. If the key is absent, `HTTPException` is raised with `"Agent api_key '{name}' not found for agent ID {agent.id}"`. If the key exists but the caller is denied, `ItemDoesNotExist` is raised with `"Item with id '{api_key_id}' does not exist."`. Both return HTTP 404, but a cross-tenant caller who can resolve an `agent_id` can probe for key-name existence by comparing the response body — exactly the information leak the collapse is meant to prevent. The same issue exists in `delete_agent_api_key_by_name` (line ~264). To close it, both "not found" paths should produce an identical detail string (e.g. both raise `ItemDoesNotExist` with the same generic message).
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Cursor Fix in Claude Code Fix in Codex

  2. agentex/src/api/routes/agent_api_keys.py, line 260-278 (link)

    P2 TOCTOU window between auth check and delete-by-name

    The auth check resolves the key to an id and passes that ID to SpiceDB, but the subsequent delete is issued by name (delete_by_agent_id_and_key_name). If the key is deleted and a new key with the same name is created between the lookup and the delete, the auth check will have evaluated the old key's ID (which the caller may have legitimately had delete rights on) but the mutation will affect the new key. The practical risk is low, but it is a real race between a non-atomic read-auth-mutate sequence. Deleting by the resolved id instead of by name would close it.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: agentex/src/api/routes/agent_api_keys.py
    Line: 260-278
    
    Comment:
    **TOCTOU window between auth check and delete-by-name**
    
    The auth check resolves the key to an `id` and passes that ID to SpiceDB, but the subsequent delete is issued by name (`delete_by_agent_id_and_key_name`). If the key is deleted and a new key with the same name is created between the lookup and the delete, the auth check will have evaluated the old key's ID (which the caller may have legitimately had delete rights on) but the mutation will affect the new key. The practical risk is low, but it is a real race between a non-atomic read-auth-mutate sequence. Deleting by the resolved `id` instead of by name would close it.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Cursor Fix in Claude Code Fix in Codex

Fix All in Cursor Fix All in Claude Code Fix All in Codex

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
agentex/src/api/routes/agent_api_keys.py:163-179
**Distinguishable 404 bodies defeat the existence-leak collapse**

The name routes produce two structurally different 404 responses depending on whether the key exists in the database. If the key is absent, `HTTPException` is raised with `"Agent api_key '{name}' not found for agent ID {agent.id}"`. If the key exists but the caller is denied, `ItemDoesNotExist` is raised with `"Item with id '{api_key_id}' does not exist."`. Both return HTTP 404, but a cross-tenant caller who can resolve an `agent_id` can probe for key-name existence by comparing the response body — exactly the information leak the collapse is meant to prevent. The same issue exists in `delete_agent_api_key_by_name` (line ~264). To close it, both "not found" paths should produce an identical detail string (e.g. both raise `ItemDoesNotExist` with the same generic message).

### Issue 2 of 3
agentex/src/utils/agent_api_key_authorization.py:9-12
**Missing type annotation on `authorization` parameter**

The `authorization` parameter has no type annotation, while every other parameter in this module is typed. The corresponding helper in `task_authorization` presumably takes `DAuthorizationService`. Leaving it untyped here breaks IDE inference and prevents mypy from catching call-site mismatches.

### Issue 3 of 3
agentex/src/api/routes/agent_api_keys.py:260-278
**TOCTOU window between auth check and delete-by-name**

The auth check resolves the key to an `id` and passes that ID to SpiceDB, but the subsequent delete is issued by name (`delete_by_agent_id_and_key_name`). If the key is deleted and a new key with the same name is created between the lookup and the delete, the auth check will have evaluated the old key's ID (which the caller may have legitimately had delete rights on) but the mutation will affect the new key. The practical risk is low, but it is a real race between a non-atomic read-auth-mutate sequence. Deleting by the resolved `id` instead of by name would close it.

Reviews (1): Last reviewed commit: "feat(AGX1-263): migrate agent_api_keys r..." | Re-trigger Greptile

@dm36 dm36 force-pushed the dhruv/agx1-263-agent-api-keys-route-migration branch from b94f0cb to d15bc88 Compare May 27, 2026 00:46
@dm36 dm36 force-pushed the dhruv/agx1-272-agent-api-keys-dual-write branch from 9668f1a to e72df68 Compare May 28, 2026 17:29
@dm36 dm36 marked this pull request as ready for review May 28, 2026 17:31
@dm36 dm36 force-pushed the dhruv/agx1-263-agent-api-keys-route-migration branch from d15bc88 to 0cbba3d Compare May 28, 2026 17:31
@dm36 dm36 requested a review from a team as a code owner May 28, 2026 17:31
@dm36 dm36 marked this pull request as draft May 28, 2026 17:38
@dm36 dm36 marked this pull request as ready for review May 28, 2026 17:52
…se and two-factor mutations

Mirrors AGX1-275 (PR #249) for agent_api_keys. Wires Spark AuthZ checks
into every api_key route, collapses denials to 404 (so name/id probes
can't distinguish "present in another tenant" from "absent"), and relies
on SpiceDB's transitive expansion of api_key.{update,delete} (= editor &
parent_agent->update & tenant_gate) for two-factor mutations rather than
issuing two explicit checks at the route layer.

- src/utils/agent_api_key_authorization.py (new):
  _check_api_key_or_collapse_to_404 — catches AuthorizationError, raises
  ItemDoesNotExist. Same shape as Asher's task helper.
- src/utils/authorization_shortcuts.py: DAuthorizedId routes
  AgentexResourceType.api_key through the wrap. (DAuthorizedName isn't
  used for api_keys; the name lookup is (agent_id, name, api_key_type),
  not a single globally-unique path param — the route handlers call the
  collapse helper inline instead.)
- src/api/routes/agent_api_keys.py:
  * POST: explicit agent.update on parent (no api_key resource yet).
  * GET list: DAuthorizedResourceIds + filter; None passes through.
  * GET /name/{name}: inline collapse helper.
  * GET /{id}: DAuthorizedId(api_key, read).
  * DELETE /{id}: DAuthorizedId(api_key, delete). Two-factor via SpiceDB
    schema (api_key.delete expands to parent_agent.update); no second
    route-layer check.
  * DELETE /name/{api_key_name}: inline collapse helper.
- tests/unit/api/test_agent_api_keys_authz.py (new): 12 tests, all pass.

Stacked on dhruv/agx1-272-agent-api-keys-dual-write (PR A). Does NOT
touch dual-write logic. Does NOT modify agentex-auth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@dm36 dm36 force-pushed the dhruv/agx1-263-agent-api-keys-route-migration branch from 0cbba3d to 4335ca7 Compare May 28, 2026 17:54
@dm36 dm36 marked this pull request as draft May 28, 2026 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant