fix(agents): persist pending_action for gated tool calls#337
Conversation
There was a problem hiding this comment.
Sorry @w7-mgfcode, you have reached your weekly rate limit of 500000 diff characters.
Please try again later or upgrade to continue using Sourcery
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Root cause
The experiment agent can call an approval-gated mutation tool (
save_scenario,create_alias,archive_run). Those tools returned{"status":"approval_required", ...}to the model, but never wrote a machine-readablepending_actionto session/deps state.service.chat()/stream_chat()decide approval by inspectingfinal_result.pending_action/final_result.approval_required— fields that the agent's structured output (ExperimentReport) does not define. So a gated call left the sessionactive, persisted nopending_action, and emitted noapproval_requiredevent — the Chat UI received only the assistant's prose and showed no Approve/Reject card. (Investigation: sessione4fe2f76…loggedtool_save_scenario requires_approval=true, yetagent_session.pending_actionstayed null and status stayedactive.)Fix
Deterministic, deps-based propagation (does not depend on the model echoing the request into its output):
AgentDepsgains apending_action: dict | Noneslot +set_pending_action(action_type, arguments, description).ctx.deps.set_pending_action(...)when approval is required, before returning the approval dict. Thesave_scenarioarguments are exactly what_execute_pending_actionreplays on approval.service.chat()andservice.stream_chat()checkdeps.pending_actionfirst (new shared_record_pending_actionhelper), persistingsession.pending_action, flipping the session toawaiting_approval, and emitting theapproval_requiredevent. The legacyfinal_result.pending_action/approval_requiredchecks remain as fallbacks.ExperimentReportis intentionally left unchanged (the model-dependent alternative was the fragile path).Tests
Three new regression tests in
app/features/agents/tests/test_service.py:test_set_pending_action_records_request—AgentDeps.set_pending_actionrecords the request.test_chat_persists_pending_action_from_deps—chat()persistspending_action+ setsawaiting_approvalwhen the output has no approval field.test_stream_chat_emits_approval_required_from_deps—stream_chat()emits theapproval_requiredevent fromdeps.pending_action.Covers the full chain: gated tool →
deps.pending_action→awaiting_approval→approval_required.Validation
ruff check✅ruff format --check✅mypy app/✅ — only pre-existingxgboostoptional-depimport-not-founderrors in untouched files (forecasting/,registry/)pyright app/features/agents/✅ 0 errorspytest -m "not integration"✅ 1650 passed, 12 skipped (agents slice 139, scenarios 53)Notes
Closes #336