event_seq assignment and enqueue are not atomic (out-of-order delivery) #39

@nficano

Description

Category: bug Severity: major
Location: src/Arcp.Runtime/SessionState.Outbound.cs:71-80

What

EventLog.Append assigns event_seq under a lock and returns, so the lock is already released when SendAsync enqueues the envelope onto the multi-writer outbound channel. With concurrent emitters in one session (e.g. the agent, the lease watchdog at JobManager.cs:414, and the back-pressure status at SessionState.Dispatch.cs:132), the Append and enqueue steps can interleave: emitter A is assigned seq N, emitter B is assigned N+1 and enqueues first, then A enqueues N. The single-reader channel delivers in enqueue order, so the wire carries N+1 before N, which a receiver sees as a non-monotonic or gapped event_seq (§8.3 requires strictly monotonic, gap-free ordering).

Evidence

private async ValueTask EmitJobEnvelopeAsync(Envelope env, CancellationToken cancellationToken)
{
    var stamped = env.Type is MessageTypeNames.JobEvent or MessageTypeNames.JobResult or MessageTypeNames.JobError
        ? EventLog.Append(env)
        : env;

    await SendAsync(stamped, cancellationToken).ConfigureAwait(false);
    FanOutToSubscribers(env, stamped, cancellationToken);
}

Proposed fix

Make seq assignment and enqueue atomic: hold the EventLog lock across both append and channel write, or funnel all emissions through a single serialized writer so wire order always matches assigned event_seq.
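The first option (one critical section covering both steps) could look roughly like the sketch below. It is not the Arcp.Runtime code: SequencedOutbound, EmitAsync, the string payload, and numbering from 1 are all hypothetical stand-ins for illustration. It uses a SemaphoreSlim instead of a lock statement because C# does not allow await inside lock.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-in for the session's outbound path; not the Arcp.Runtime types.
public sealed class SequencedOutbound
{
    // Async-compatible mutex: serializes "assign seq + enqueue" as one step.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    private readonly Channel<(long Seq, string Payload)> _channel =
        Channel.CreateUnbounded<(long Seq, string Payload)>(
            new UnboundedChannelOptions { SingleReader = true, SingleWriter = false });

    private long _lastSeq; // guarded by _gate; first assigned seq is 1 (assumption)

    public ChannelReader<(long Seq, string Payload)> Reader => _channel.Reader;

    public async ValueTask<long> EmitAsync(string payload, CancellationToken cancellationToken)
    {
        await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // Sequence assignment and enqueue happen inside the same critical
            // section, so no other emitter can slip in between the two steps.
            long seq = _lastSeq + 1;
            await _channel.Writer.WriteAsync((seq, payload), cancellationToken)
                .ConfigureAwait(false);

            // Commit the counter only after the write succeeded, so a failed
            // or cancelled write does not leave a gap in the sequence.
            _lastSeq = seq;
            return seq;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

One trade-off to weigh: if the real outbound channel is bounded, holding the gate across WriteAsync means back-pressure on one emitter stalls every other emitter in the session. That may be acceptable, since ordering has to be preserved anyway, but it is worth confirming against the watchdog and status paths.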

Acceptance criteria

  • Under concurrent emitters in a single session, delivered event_seq values are strictly increasing with no reordering.
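A regression test for this criterion might drain the delivered event_seq values after a concurrent run and check them with a small helper like the one below. EventSeqCheck and FirstOutOfOrderIndex are hypothetical names, not existing test utilities in the repository.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical test helper; not part of the existing test suite.
public static class EventSeqCheck
{
    // Returns the index of the first value that is not strictly greater than
    // its predecessor, or -1 if the whole list is strictly increasing.
    public static int FirstOutOfOrderIndex(IReadOnlyList<long> delivered)
    {
        for (int i = 1; i < delivered.Count; i++)
        {
            if (delivered[i] <= delivered[i - 1])
            {
                return i;
            }
        }
        return -1;
    }
}
```

For example, FirstOutOfOrderIndex(new long[] { 1, 2, 4, 3 }) returns 3, the position where a lower seq arrives after a higher one. This checks only ordering, as the criterion states; a gap-free check per §8.3 would additionally require each value to equal its predecessor plus one.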

Metadata

    Labels

    audit/bug (Audit: bug / inefficiency), sev/major (Severity: major)
