[BUG] IndexError: pop from an empty deque during async engine dispose on Postgres / asyncpg (CLI commands, lifespan init, watcher) #831

@SW4T400

While self-hosting basic-memory, I (or, let's be honest, Claude Code) came across this:

Bug Description

On the Postgres backend (postgresql+asyncpg://...), several code paths that follow the shape "open an async engine -> do work -> await engine.dispose()" can crash with IndexError: pop from an empty deque from asyncio/base_events.py:_run_once. The traceback originates inside SQLAlchemy's async dispose machinery (sqlalchemy/pool/base.py:_close_connection -> connectors/asyncio.py:close -> _concurrency_py3k.py:greenlet_spawn).

This is not a duplicate of #462 / PR #471 ("prevent CLI commands from hanging on exit", merged 2025-12-24). That fix addresses a different symptom — the process hangs forever waiting on outstanding async connections. The bug reported here is a hard exception thrown from asyncio._run_once; the process exits with a non-zero code rather than hanging. The fix from #471 is present in the version I am running (0.20.3) but does not cover this failure mode.

Steps To Reproduce

The crash is timing-dependent, not load-dependent — it hits even single-engine, single-query CLI commands. Easiest path to reproduce:

  1. Install Basic Memory 0.20.3 against a PostgreSQL 17 instance with pgvector enabled (BASIC_MEMORY_DATABASE_BACKEND=postgres, BASIC_MEMORY_DATABASE_URL=postgresql+asyncpg://...).
  2. Run a one-shot CLI command, e.g.:
    basic-memory project list
    
  3. Observe the traceback (does not fire on every invocation — repeat a handful of times).
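Because the crash is intermittent, a small loop makes step 3 less tedious. The following is only a sketch of such a harness: count_crashes and SIGNATURE are names made up for this example, and it assumes the traceback is written to stderr.

```python
import subprocess

# Text that identifies the crash described in this report.
SIGNATURE = "pop from an empty deque"


def count_crashes(cmd: list[str], runs: int = 20) -> int:
    """Run `cmd` repeatedly and count runs that exit non-zero with the signature on stderr."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0 and SIGNATURE in result.stderr:
            crashes += 1
    return crashes


# Example call (needs basic-memory on PATH and the Postgres env vars set):
#   count_crashes(["basic-memory", "project", "list"], runs=20)
```

A result above zero confirms the failure mode; a result of zero over a small number of runs does not rule it out, since the crash is timing-dependent.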

The same failure also fires in two other paths during normal operation:

  • Lifespan initialization sync at MCP server startup (when BASIC_MEMORY_SKIP_INITIALIZATION_SYNC is not true).
  • In-process file watcher (BASIC_MEMORY_SYNC_CHANGES=true), when watchfiles.awatch runs SyncService against the shared async pool inside the FastMCP lifespan loop.

I do not have a minimal standalone reproducer to attach. The closest upstream reproducer with the same traceback signature is pola-rs/polars#25209, which demonstrates the same crash on multiple async drivers (asyncpg, aiosqlite, oracle+oracledb) — i.e. the root cause is in the CPython / SQLAlchemy async-dispose orchestration, not in any one driver.

Expected Behavior

CLI commands, lifespan startup, and the watcher loop should each open an async engine, do their work, dispose the engine, and return a zero exit status (or, for the watcher, continue running) without raising from asyncio._run_once.

Actual Behavior

The process raises:

File "asyncio/base_events.py", line 2035, in _run_once
    handle = self._ready.popleft()
IndexError: pop from an empty deque

with the originating frames inside SQLAlchemy's async dispose path:

sqlalchemy/pool/base.py:_close_connection
  -> connectors/asyncio.py:close
    -> _concurrency_py3k.py:greenlet_spawn
      -> asyncio/base_events.py:_run_once
        -> IndexError: pop from an empty deque

For one-shot CLI commands the process exits non-zero. For the in-process watcher and lifespan-init path, the surrounding server process dies as well.
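For readers unfamiliar with where this message comes from: asyncio's _run_once snapshots len(self._ready) and then calls popleft() that many times, so the exception appears if something else empties the queue in between. The toy below reproduces only that shape with a plain deque. It is an illustration of one way the error can arise, not a confirmed description of what SQLAlchemy's dispose path does here; run_once and reentrant_drain are invented for the example.

```python
from collections import deque


def run_once(ready: deque) -> None:
    """Mimic the snapshot-then-pop shape of the loop in asyncio's _run_once."""
    ntodo = len(ready)  # length is captured once, up front
    for _ in range(ntodo):
        callback = ready.popleft()  # raises IndexError if the queue was drained meanwhile
        callback()


ready: deque = deque()


def reentrant_drain() -> None:
    # Hypothetical misbehaving callback: it consumes the same queue
    # while run_once is still iterating over its snapshot.
    while ready:
        ready.popleft()()


ready.append(reentrant_drain)
ready.append(lambda: None)

message = None
try:
    run_once(ready)
except IndexError as exc:
    message = str(exc)
```

Here the second popleft() in run_once fails because reentrant_drain already consumed the remaining callback.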

Environment

  • OS: Linux (Docker container, Ubuntu-based image)
  • Python version: 3.12.13
  • Basic Memory version: 0.20.3
  • Installation method: pip (inside Docker image)
  • Database backend: PostgreSQL 17 with pgvector
  • Driver: postgresql+asyncpg://...
  • Claude Desktop version: n/a (MCP server, not Desktop client)

Additional Context

Workarounds currently in place to keep the server stable:

  • BASIC_MEMORY_SKIP_INITIALIZATION_SYNC=true
  • BASIC_MEMORY_SYNC_CHANGES=false
  • Manual basic-memory sync invoked on demand instead of relying on the watcher.

These turn off real features, so they are not satisfactory long-term.

Adjacent upstream tracking that helped narrow this down:

  • pola-rs/polars#25209 — canonical reproducer of the same traceback signature across multiple async drivers (asyncpg, aiosqlite, oracle+oracledb). Closed-as-completed on the polars side but the underlying CPython/SQLAlchemy interaction is unchanged for other consumers.
  • jlowin/fastmcp#1311 — adjacent FastMCP + asyncpg cleanup hang. This is referenced verbatim by basic-memory's own justfile:
    # Note: Uses timeout due to FastMCP Client + asyncpg cleanup hang
    # (tests pass, process hangs on exit)
    # See: https://github.com/jlowin/fastmcp/issues/1311
    
  • sqlalchemy/sqlalchemy#8145 — long-standing async pool / cancel-during-close discussion in SQLAlchemy.

Possible Solution

Three knobs worth trying on the basic-memory side, ranked by how likely they are to help vs. effort. None of these are patches I have tested against a live reproducer yet — I am happy to test if a maintainer prefers a specific approach.

  1. poolclass=NullPool + connect_args={"statement_cache_size": 0} on the async engine. NullPool is SQLAlchemy's documented mitigation for "async engine reused across event loops"; it removes the pool-dispose scheduling entirely, which is the surface that races with loop teardown.
  2. Optionally support uvloop as the event loop policy (e.g. via an opt-in env var). uvloop replaces asyncio's pure-Python scheduler, so the asyncio/base_events.py:_run_once frame from the traceback above is not on its code path; whether the underlying race still surfaces there in another form is untested.
  3. Shield the engine-dispose call from cancellation on shutdown, along the lines discussed in sqlalchemy#8145.
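As a rough sketch of knob 1, the engine could be built along these lines. make_engine is a hypothetical helper, not basic-memory's actual engine factory, and this has not been run against a reproducer. It needs SQLAlchemy and asyncpg installed when called.

```python
def make_engine(url: str):
    """Build an async engine that opens and closes a connection per use (no pooling)."""
    # Imports are local so the sketch only needs SQLAlchemy and asyncpg
    # at the moment the function is actually called.
    from sqlalchemy.ext.asyncio import create_async_engine
    from sqlalchemy.pool import NullPool

    return create_async_engine(
        url,
        poolclass=NullPool,  # no pooled connections left to close at dispose time
        connect_args={"statement_cache_size": 0},  # passed through to asyncpg's connect()
    )
```

The trade-off is a fresh connection for every checkout, which is likely acceptable for one-shot CLI commands but may cost more in the long-running watcher.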

A driver swap from asyncpg to psycopg3 is unlikely to help — the polars reproducer fires on aiosqlite and oracle+oracledb as well, so the root cause is upstream of the driver.
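Knob 3 could look roughly like the following. This is a minimal sketch using asyncio.shield, untested against the real crash; dispose_shielded and FakeEngine are invented for the example, with FakeEngine standing in for an AsyncEngine so the snippet runs without a database.

```python
import asyncio


async def dispose_shielded(engine) -> None:
    """Run engine.dispose() so that cancelling the caller does not cut it short."""
    # Wrap dispose() in its own task; shield() keeps that task alive if we get cancelled.
    dispose_task = asyncio.ensure_future(engine.dispose())
    try:
        await asyncio.shield(dispose_task)
    except asyncio.CancelledError:
        # The caller was cancelled (e.g. during shutdown): let the close finish
        # first, then propagate the cancellation as usual.
        await dispose_task
        raise


class FakeEngine:
    """Stand-in for illustration only; it just exposes an awaitable dispose()."""

    def __init__(self) -> None:
        self.disposed = False

    async def dispose(self) -> None:
        await asyncio.sleep(0.05)  # pretend closing connections takes a moment
        self.disposed = True


async def demo() -> bool:
    engine = FakeEngine()
    shutdown = asyncio.ensure_future(dispose_shielded(engine))
    await asyncio.sleep(0.01)
    shutdown.cancel()  # simulate cancellation arriving mid-dispose
    try:
        await shutdown
    except asyncio.CancelledError:
        pass
    return engine.disposed
```

In demo(), the cancellation still reaches the caller, but only after dispose() has run to completion.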
