perf(copilot): read chat transcripts from copilot_messages (R+1 cutover) by waleedlatif1 · Pull Request #4808 · simstudioai/sim

waleedlatif1 · 2026-05-30T02:53:20Z

What

Cuts over user-facing copilot chat reads from the legacy copilot_chats.messages JSONB array to the normalized copilot_messages table. This is the R+1 read cutover — the payoff for the table + seq ordinal work that already shipped.

Why

copilot_chats is 5.7 GB, 99% of it the messages JSONB in TOAST. Every chat load detoasted + decompressed the whole array. Reading from copilot_messages via the (chat_id, seq) index avoids that entirely — biggest win on large/tail chats and on keeping the base table lean.

How

New helper loadCopilotChatMessages(chatId) in lifecycle.ts reads content from copilot_messages ordered by seq ASC NULLS LAST, created_at ASC, id ASC (the verified canonical order; raw sql fragment because Drizzle's asc() omits NULLS LAST).
Both detail getters (getAccessibleCopilotChat, getAccessibleCopilotChatWithMessages) drop messages from the metadata select (no more detoast) and assemble the transcript from the table after authorization (no wasted query on denied access).
This cascades to the copilot GET (/api/copilot/chat), mothership GET (/api/mothership/chats/[chatId]), and resolveOrCreateChat's conversationHistory (the LLM payload) — all via the two getters.
New-chat insert uses a dedicated returning column set so a freshly-created chat returns messages: [] without a second query.
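
The canonical order that loadCopilotChatMessages applies in SQL can be illustrated in memory. The sketch below is not the code from lifecycle.ts: it assumes a hypothetical MessageRow shape and reproduces ORDER BY seq ASC NULLS LAST, created_at ASC, id ASC as a comparator, which may be handy for computing the expected order of a test fixture. Comparing id as a plain string is an assumption; the database compares according to the column's type and collation.

```typescript
// Hypothetical row shape (an assumption): only the three ordering keys and content matter.
interface MessageRow {
  id: string
  seq: number | null
  createdAt: Date
  content: Record<string, unknown>
}

// In-memory equivalent of: ORDER BY seq ASC NULLS LAST, created_at ASC, id ASC
function compareCanonical(a: MessageRow, b: MessageRow): number {
  if (a.seq !== b.seq) {
    if (a.seq === null) return 1 // a NULL seq sorts after any non-NULL seq
    if (b.seq === null) return -1
    return a.seq - b.seq
  }
  // Same seq (or both NULL): fall back to creation time, then id.
  const byTime = a.createdAt.getTime() - b.createdAt.getTime()
  if (byTime !== 0) return byTime
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0
}

// Returns the transcript (content only) in canonical order without mutating the input.
function toTranscript(rows: MessageRow[]): Record<string, unknown>[] {
  return [...rows].sort(compareCanonical).map((row) => row.content)
}
```

With this comparator, a row whose seq is NULL always lands at the end of the transcript, and two rows sharing a seq are ordered by createdAt and then by id.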

The normalize → effective-transcript pipeline is unchanged and source-agnostic (copilot_messages.content is the same shape as a JSONB array element), so transcripts are byte-identical.

Scope / safety

Dual-write stays on; the JSONB column stays written — it remains the source for internal-logic reads (terminal-state, fork, cleanup, workspace-vfs) and a fallback. Removing JSONB writes is a later step.
No feature flag (per direction). Reverting this PR sends reads straight back to JSONB with no data implications, because dual-write keeps the column current.

Integrity verified on prod before cutover

0 messages missing from the table · 0 NULL-seq · 0 duplicate keys · 0 duplicate seq within a chat · 0 orphans · order-parity vs JSONB = 0 mismatches.
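
The PR does not include the queries behind these counts. As a rough sketch of what such a parity check computes, the function below works on in-memory data; the TableRow and IntegrityReport shapes, the per-chat list of message ids, and the definition of an orphan as a table row whose chat is unknown are all assumptions made for illustration.

```typescript
// Hypothetical shapes (assumptions): one entry per copilot_messages row.
interface TableRow {
  chatId: string
  messageId: string
  seq: number | null
}

interface IntegrityReport {
  missingFromTable: number
  nullSeq: number
  duplicateKeys: number
  duplicateSeq: number
  orphans: number
  orderMismatches: number
}

// jsonbByChat maps chatId -> message ids in legacy JSONB array order.
function checkIntegrity(
  jsonbByChat: Map<string, string[]>,
  tableRows: TableRow[]
): IntegrityReport {
  const report: IntegrityReport = {
    missingFromTable: 0,
    nullSeq: 0,
    duplicateKeys: 0,
    duplicateSeq: 0,
    orphans: 0,
    orderMismatches: 0,
  }
  const byChat = new Map<string, TableRow[]>()
  const seenKeys = new Set<string>()

  for (const row of tableRows) {
    if (row.seq === null) report.nullSeq++
    if (!jsonbByChat.has(row.chatId)) report.orphans++ // row for an unknown chat
    const key = `${row.chatId}:${row.messageId}`
    if (seenKeys.has(key)) report.duplicateKeys++
    seenKeys.add(key)
    const rows = byChat.get(row.chatId) ?? []
    rows.push(row)
    byChat.set(row.chatId, rows)
  }

  // Count repeated non-NULL seq values within each chat.
  byChat.forEach((rows) => {
    const seenSeq = new Set<number>()
    for (const row of rows) {
      if (row.seq === null) continue
      if (seenSeq.has(row.seq)) report.duplicateSeq++
      seenSeq.add(row.seq)
    }
  })

  // Compare the table's seq order (NULLs last) with the JSONB array order.
  jsonbByChat.forEach((jsonbIds, chatId) => {
    const rows = byChat.get(chatId) ?? []
    const last = Number.MAX_SAFE_INTEGER
    const tableIds = [...rows]
      .sort((a, b) => (a.seq ?? last) - (b.seq ?? last))
      .map((row) => row.messageId)
    const present = new Set(tableIds)
    report.missingFromTable += jsonbIds.filter((id) => !present.has(id)).length
    if (tableIds.join('\u0000') !== jsonbIds.join('\u0000')) report.orderMismatches++
  })

  return report
}
```

A clean cutover corresponds to every field of the report being 0, which matches the figures listed above.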

Tests

New lifecycle.test.ts: getters source messages from the table in order; empty chat → []; auth-deny → null with no messages query; legacy getter; resolveOrCreateChat existing (table-sourced history) vs new (empty, no read).
Full suite: 472 files / 7,285 tests pass. Type-check clean, biome clean, check:api-validation passes.

Post-deploy verification

Staging smoke: load a large chat via both GETs and confirm the transcripts are identical; EXPLAIN should show copilot_messages_chat_seq_idx and no detoast of copilot_chats.messages. Re-run the low-load TABLESAMPLE parity spot-check (currently 0 mismatches).

Flip user-facing chat reads from the legacy copilot_chats.messages JSONB array (5.7GB, 99% TOAST) to the normalized copilot_messages table via a new loadCopilotChatMessages helper ordered by seq NULLS LAST, created_at, id (the verified canonical order).

Both chat-detail getters (getAccessibleCopilotChat, getAccessibleCopilotChatWithMessages) now drop the messages column from their metadata select (no more whole-array detoast on every load) and assemble the transcript from the table after authorization. This cascades to the copilot + mothership GET endpoints and to resolveOrCreateChat's conversationHistory (the LLM payload).

The normalize/effective-transcript pipeline is source-agnostic (copilot_messages.content == a JSONB array element), so transcripts are byte-identical. Dual-write and the JSONB column stay in place as the internal-logic source and fallback; removing JSONB writes is a later step.

Prod integrity verified before cutover: 0 messages missing, 0 NULL-seq, 0 dup keys/seq, 0 orphans, order-parity vs JSONB = 0 mismatches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel · 2026-05-30T02:53:25Z

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
docs	Skipped		May 30, 2026 3:00am

cursor · 2026-05-30T02:53:29Z

PR Summary

Medium Risk
Switches the primary read path for all user-facing chat transcripts; mis-ordering or drift from dual-write JSONB would change LLM/history payloads, though shape is unchanged and JSONB remains written.

Overview
User-facing copilot chat loads now build transcripts from copilot_messages instead of detoasting copilot_chats.messages. In lifecycle.ts, loadCopilotChatMessages loads non-deleted rows ordered by seq (NULLS LAST), created_at, and id; getAccessibleCopilotChat and getAccessibleCopilotChatWithMessages stop selecting the JSONB blob, authorize first, then attach the table-backed message list. That path feeds copilot/mothership GETs and resolveOrCreateChat’s conversationHistory; new chats return messages: [] from insert without a messages query.

Adds lifecycle.test.ts for ordering, empty transcripts, auth failures skipping the messages query, legacy getter behavior, and create vs load paths. Dual-write to JSONB is unchanged for other readers.

Reviewed by Cursor Bugbot for commit 2e7f4ec.

greptile-apps · 2026-05-30T02:57:50Z

Greptile Summary

This PR cuts over copilot chat transcript reads from the legacy copilot_chats.messages JSONB column to the normalized copilot_messages table, avoiding the costly TOAST decompression on every chat load while keeping the dual-write intact for fallback and internal-logic reads.

New loadCopilotChatMessages(chatId) reads from copilot_messages in (seq ASC NULLS LAST, created_at ASC, id ASC) order; both detail getters (getAccessibleCopilotChat, getAccessibleCopilotChatWithMessages) drop messages from the metadata select and call it only after authorization succeeds.
A separate copilotChatDetailReturningColumns set is introduced for new-chat inserts so a fresh row returns messages: [] directly from the RETURNING clause without issuing a second query.
New lifecycle.test.ts verifies table-sourced messages, empty transcripts, chat-not-found and auth-denied no-query guards, the legacy getter, and both existing/new paths in resolveOrCreateChat.

Confidence Score: 5/5

Safe to merge — reads from the normalized table only after authorization succeeds, dual-write and JSONB column are untouched, and pre-cutover integrity was verified on prod.

The change is a well-scoped read-path swap: the JSONB column remains written and available as a fallback, authorization always gates the new messages query, and the normalized table was fully validated against the JSONB source before cutover. The implementation is clean, the tests exercise the critical invariants (including the auth-denied no-query contract added in the head SHA), and the type-check and full suite pass.

No files require special attention.

Important Files Changed

Filename	Overview
apps/sim/lib/copilot/chat/lifecycle.ts	Introduces `loadCopilotChatMessages` helper reading from `copilot_messages`; both detail getters and `resolveOrCreateChat` correctly sequence authorization before the messages query; the `NULLS LAST` raw SQL fragment is the correct Drizzle pattern for this ordering requirement.
apps/sim/lib/copilot/chat/lifecycle.test.ts	New test file covering all key invariants: messages sourced from the table in order, empty transcript, chat-not-found no-query guard, auth-denied no-query guard (added in the head SHA per previous thread), legacy getter, and both branches of `resolveOrCreateChat`.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant getAccessibleCopilotChatWithMessages
    participant copilot_chats DB
    participant authorizeCopilotChatRow
    participant loadCopilotChatMessages
    participant copilot_messages DB

    Caller->>getAccessibleCopilotChatWithMessages: (chatId, userId)
    getAccessibleCopilotChatWithMessages->>copilot_chats DB: SELECT metadata columns WHERE id=chatId AND userId=userId LIMIT 1
    copilot_chats DB-->>getAccessibleCopilotChatWithMessages: chat row (no messages JSONB)
    getAccessibleCopilotChatWithMessages->>authorizeCopilotChatRow: (chat, chatId, userId)
    alt not found or auth denied
        authorizeCopilotChatRow-->>getAccessibleCopilotChatWithMessages: null
        getAccessibleCopilotChatWithMessages-->>Caller: null (no messages query)
    else authorized
        authorizeCopilotChatRow-->>getAccessibleCopilotChatWithMessages: authorized row
        getAccessibleCopilotChatWithMessages->>loadCopilotChatMessages: (chatId)
        loadCopilotChatMessages->>copilot_messages DB: SELECT content WHERE chat_id=chatId AND deleted_at IS NULL ORDER BY seq ASC NULLS LAST, created_at ASC, id ASC
        copilot_messages DB-->>loadCopilotChatMessages: [{content}, ...]
        loadCopilotChatMessages-->>getAccessibleCopilotChatWithMessages: "Record<string,unknown>[]"
        getAccessibleCopilotChatWithMessages-->>Caller: "{...authorizedRow, messages}"
    end

Reviews (2): Last reviewed commit: "test(copilot): cover auth-deny on a foun..."

Address PR review: exercise the `if (!authorized) return null` contract — when the chat row exists but authorization fails, the getter returns null and never issues the copilot_messages read. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
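
The contract this commit tests (no copilot_messages read unless authorization succeeds) can be sketched with injected dependencies. This is an illustration only, not the getter from lifecycle.ts: ChatDeps, ChatMeta, and getChatWithMessages are hypothetical names, and the dependencies are synchronous stand-ins for what would be async database calls in the real code.

```typescript
type Message = Record<string, unknown>

// Hypothetical metadata row: the chat without its messages.
interface ChatMeta {
  id: string
  userId: string
}

// Injected stand-ins (assumptions) for the metadata select, the auth check,
// and the copilot_messages read.
interface ChatDeps {
  loadMeta: (chatId: string) => ChatMeta | null
  authorize: (chat: ChatMeta, userId: string) => boolean
  loadMessages: (chatId: string) => Message[]
}

function getChatWithMessages(
  deps: ChatDeps,
  chatId: string,
  userId: string
): (ChatMeta & { messages: Message[] }) | null {
  const chat = deps.loadMeta(chatId)
  if (!chat) return null // chat not found: no messages query

  const authorized = deps.authorize(chat, userId)
  if (!authorized) return null // access denied: no messages query

  // Only an authorized caller pays for the transcript read.
  const messages = deps.loadMessages(chatId)
  return { ...chat, messages }
}
```

Passing the dependencies in makes the "no messages query" guard easy to assert: a test can count calls to loadMessages and expect zero on both the not-found and the denied path.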

waleedlatif1 · 2026-05-30T03:05:18Z

@greptile

waleedlatif1 · 2026-05-30T03:05:22Z

@cursor review

cursor

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 2e7f4ec.

greptile-apps Bot reviewed May 30, 2026

Comment thread apps/sim/lib/copilot/chat/lifecycle.test.ts

vercel Bot temporarily deployed to Preview May 30, 2026 03:00 Inactive

cursor Bot reviewed May 30, 2026

waleedlatif1 merged commit 640b7e1 into staging May 30, 2026
14 checks passed

waleedlatif1 deleted the waleedlatif1/copilot-messages-cutover-prep branch May 30, 2026 17:10

waleedlatif1 merged 2 commits into staging from waleedlatif1/copilot-messages-cutover-prep
