Category: correctness Severity: blocker
Location: src/Arcp.Runtime/SessionState.Jobs.cs:28-45, src/Arcp.Runtime/JobManager.cs:104-118
Spec: ARCP v1.1 §7.2
What
On a duplicate job.submit with the same idempotency_key and identical parameters, JobManager.SubmitAsync returns the existing Job (correct per §7.2). But HandleJobSubmitAsync cannot distinguish a replay from a fresh submission — the return tuple has no "is replay" flag — so it unconditionally resolves the agent and launches RunAsync again on that already-running/terminal job. The agent body executes a second time, re-emitting all events plus a second terminal job.result/job.error. For a terminal job it even resets status back to Running (job.MarkRunning()), re-revokes credentials, and schedules a second terminal cleanup. Spec §7.2 requires the runtime to return the same job.accepted for a replay — not to run the job twice.
Evidence
// JobManager.SubmitAsync — returns the existing job on idempotent hit:
if (_jobs.TryGetValue(existingRecord.JobId, out var existing))
{
    return (existing, BuildAccepted(existing));
}
// HandleJobSubmitAsync — always runs, even for a replayed existing job:
var submission = await _server.JobManager.SubmitAsync(...).ConfigureAwait(false);
var job = submission.Job;
...
await SendAsync(new Envelope { Type = MessageTypeNames.JobAccepted, ... }).ConfigureAwait(false);
var resolved = _server.AgentRegistry.Resolve(job.Agent).Agent;
_ = Task.Run(() => _server.JobManager.RunAsync(job, resolved, emit, _cts.Token), _cts.Token);
RunAsync then calls job.MarkRunning() on a job that may already be terminal (JobManager.cs:189). IdempotencyTests.Identical_retry_returns_existing_job_id only asserts the returned JobId matches — it never asserts the agent ran once — so the regression is untested.
Proposed fix
- Have SubmitAsync signal a replay, e.g. return (Job Job, JobAcceptedPayload Accepted, bool IsReplay) (set IsReplay = true on the idempotent-hit early return at JobManager.cs:116).
- In HandleJobSubmitAsync, send job.accepted for a replay but skip Resolve/RunAsync when IsReplay is true.
- Add an integration test: submit the same key twice to an agent that increments a shared counter; assert the counter is 1 and exactly one terminal job.result is observed.
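The shape of the proposed fix could look roughly like the sketch below. It is a minimal, self-contained model and not the actual Arcp.Runtime code: SketchJobManager, SketchHandler, the simplified Job and JobAcceptedPayload records, and the AgentRuns counter are all hypothetical stand-ins, and the check that replayed parameters are identical is omitted. It only illustrates returning an IsReplay flag from the submit path and gating the agent launch on it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical, simplified stand-ins for the real runtime types.
public sealed record Job(string JobId, string Agent);
public sealed record JobAcceptedPayload(string JobId);

public sealed class SketchJobManager
{
    // Maps idempotency key to the job created by the first submission.
    private readonly ConcurrentDictionary<string, Job> _byKey = new();

    // IsReplay is true when the idempotency key was already registered.
    public (Job Job, JobAcceptedPayload Accepted, bool IsReplay) Submit(
        string idempotencyKey, string agent)
    {
        var candidate = new Job(Guid.NewGuid().ToString("N"), agent);
        // GetOrAdd returns the stored job if the key exists, else stores candidate.
        var stored = _byKey.GetOrAdd(idempotencyKey, candidate);
        var isReplay = !ReferenceEquals(stored, candidate);
        return (stored, new JobAcceptedPayload(stored.JobId), isReplay);
    }
}

public static class SketchHandler
{
    // Shared counter, mirroring the counter in the proposed integration test.
    public static int AgentRuns;

    // Always acknowledges; only a fresh submission starts the agent.
    public static JobAcceptedPayload HandleJobSubmit(
        SketchJobManager manager, string idempotencyKey, string agent)
    {
        var submission = manager.Submit(idempotencyKey, agent);
        if (!submission.IsReplay)
        {
            // Stands in for AgentRegistry.Resolve + JobManager.RunAsync.
            Interlocked.Increment(ref AgentRuns);
        }
        // Stands in for sending the job.accepted envelope.
        return submission.Accepted;
    }
}
```

Calling HandleJobSubmit twice with the same key on one SketchJobManager should yield two payloads with the same JobId while AgentRuns stays at 1, which is the property the new integration test is meant to pin down.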
Acceptance criteria
- A replayed job.submit returns the same job.accepted without invoking the agent a second time.
- A terminal job is never reset to Running by a replay.