Skip to content

perf(producer): stream binary file responses, async-read HTML#1735

Merged
jrusso1020 merged 2 commits into
mainfrom
fix/producer-fileserver-streaming
Jun 26, 2026
Merged

perf(producer): stream binary file responses, async-read HTML#1735
jrusso1020 merged 2 commits into
mainfrom
fix/producer-fileserver-streaming

Conversation

@jrusso1020

Copy link
Copy Markdown
Collaborator

Summary

Replaces the per-request readFileSync in the producer's static file handler (fileServer.ts:671) with a createReadStream pipe (binary) and an async readFile (HTML). Static asset serving no longer blocks the Node event loop.

Scope clarification — what this PR is and isn't

This addresses the video-heavy event-loop block documented at renderOrchestrator.ts:1277-1306, not today's infinite-duration incident. Miguel (HyperFrames producer owner) has the upstream guard for that — a plan()-time duration check that rejects non-finite / sentinel / impossible durations before chunk planning. TODO: link Miguel's upstream PR once known.

The two fixes are complementary:

  • Miguel's plan() guard kills the impossible-work input shape (today's 300B-frame failure) before chunk planning, so the producer never tries to enumerate impossible work.
  • This streaming fix removes the next-largest known main-thread block (large binary I/O during video-heavy renders), so future wedge classes don't kill otherwise-healthy probes either.

The companion worker_thread /health PR (#1733) + the heygen-com/app probe-timeout bump (https://github.com/heygen-com/app/pull/1353) round out the defense-in-depth: even if some future code path introduces another main-thread stall, the probe lives off-thread and the budget is 30s anyway.

Why this matters

Pre-fix, the file route called readFileSync(filePath) on every binary asset. Documented repro: 30 × 32MB videos in one composition. Chrome requests them back-to-back; each readFileSync(32MB) is a blocking syscall in the Node event loop. The cumulative stall:

  • Wedges concurrent /health responses (the visible symptom k8s saw as a probe timeout).
  • Holds back any other Hono request (lint, render queue status) that races a hot asset.
  • Increases GC pressure (every read materializes a 32MB Buffer in the V8 heap).

What changed

fileServer.ts:614 — handler is now async.

Binary branch (the hot path):

// before
const content = readFileSync(filePath);
return new Response(content, ...)

// after
const stat = statSync(filePath);
const stream = createReadStream(filePath);
const webStream = Readable.toWeb(stream) as unknown as ReadableStream;
return new Response(webStream, {
  status: 200,
  headers: {
    "Content-Type": contentType,
    "Content-Length": String(stat.size),
  },
});

Readable.toWeb is Node 18+ (we're on 22). Content-Length from statSync lets Chrome's range-aware media stack see the size up front.

HTML branch:

// before
const rawHtml = readFileSync(filePath, "utf-8");

// after
const rawHtml = await readFile(filePath, "utf-8");

Injection is still sync (pure string ops); only the disk read moved off-thread. Index HTMLs are tiny (~200KB max for AI-gen compositions) but a ms of stall per render-start adds up across a fleet.

Testing

  • bun test packages/producer/src/services/fileServer.test.ts31/31 pass, including the new streaming regression that pins on a 5MB synthetic binary asset:
    • byte-correct across multiple createReadStream chunks (>64KB highWaterMark),
    • Content-Length header matches statSync.size,
    • 4 concurrent fetches return identical content (no serialization).
  • oxlint + oxfmt + tsc --noEmit + fallow audit all clean via lefthook pre-commit.
  • Smoke test in dev sandbox: render the documented 30 × 32MB-video composition and verify wall-clock vs main-thread-block ratio.

— Jerrai

Replaces the per-request readFileSync in fileServer's static file handler
with a createReadStream pipe (binary) and an async readFile (HTML). Static
asset serving no longer blocks the Node event loop.

Why
---

The pre-fix handler called readFileSync(filePath) on every binary asset.
On video-heavy compositions Chrome requests several 32MB video files
back-to-back; each readFileSync(32MB) blocked the main event loop long
enough to wedge concurrent /health responses and other timers.

Scope clarification — this addresses the event-loop block documented at
renderOrchestrator.ts:1277-1306 (the video-heavy regression class). It is
NOT the fix for today's infinite-duration incident; Miguel is shipping
that upstream as a plan()-time duration guard. The two are complementary:

  - Miguel's guard kills the impossible-work input shape before chunk
    planning so the producer doesn't try to enumerate 300B frames.
  - This streaming fix removes the next-largest known main-thread block
    (large binary I/O during video-heavy renders), so future wedge
    classes don't kill otherwise-healthy probes either.

The companion worker_thread /health PR + the heygen-com/app probe-timeout
bump round out the defense-in-depth: even if some future code path
introduces another main-thread stall, the probe lives off-thread and the
budget is 30s anyway.

What changed
------------

fileServer.ts: switched both file branches off the sync I/O path.

  - Binary (the hot path for video-heavy renders): readFileSync(filePath)
    -> createReadStream + Readable.toWeb -> Response stream body.
    Content-Length is set via statSync so Chrome's range-aware media
    stack sees the size up front. The handler is now async because the
    HTML branch awaits.

  - HTML (small files; injected with pre/head/body scripts):
    readFileSync(filePath, "utf-8") -> readFile(filePath, "utf-8").
    The injection is still sync — pure string ops — only the disk read
    moved off-thread. Index HTMLs are tiny (~200KB max for AI-generated
    compositions) but a ms of stall per render-start adds up across a
    fleet.

Test
----

fileServer.test.ts: added a streaming regression that pins three
properties on a 5MB synthetic binary asset (chunk-boundary spanning):

  1. Correctness — served bytes match the file across multiple
     createReadStream chunks (default 64KB highWaterMark).
  2. Content-Length header is set from statSync.
  3. Four parallel fetches all return identical content; the streaming
     path doesn't serialize them.

All 31 fileServer tests pass locally (bun test).

TODO: link Miguel's upstream plan() duration guard PR once known.

— Jerrai

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@miga-heygen miga-heygen left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: perf(producer): stream binary file responses, async-read HTML

SHA: 68a2d10764c1c0a3bad346ccf23c8eeef0427492

Solid, well-scoped change. The sync-to-async migration is the right move for the documented event-loop-blocking regression, and the implementation is clean. The PR description is exceptionally well-written — scope clarification, before/after snippets, defense-in-depth narrative. A few observations and one thing to flag:


Looks good

  1. Streaming binary via createReadStream + Readable.toWeb — correct pattern. Readable.toWeb() is stable since Node 18 (you're on 22), the as unknown as ReadableStream cast is the standard workaround for the Node/global ReadableStream type mismatch, and Response accepts a ReadableStream body per the Fetch spec. Hono's node-server adapter handles this fine.

  2. HTML branch await readFile — handler correctly marked async, Hono supports async route handlers. The injection is still sync string ops, which is the right call — no reason to complicate that.

  3. Content-Length from statSync — important addition. The old readFileSync path returned a Buffer body where Response could infer length, but being explicit via the stat is more robust and lets Chrome's media stack see the total size upfront for range-aware prefetch.

  4. Test — the regression test is well-structured: deterministic 5MB synthetic binary, sentinel byte checks across chunk boundaries, concurrent fetch proof. The test comment explaining what it pins is helpful.

  5. Commit hygiene — single commit, conventional prefix, thorough body. Clean.


Observations (non-blocking)

1. Redundant statSync call

The path-resolution block already calls statSync(candidate).isFile() (around line 632/642) to verify the file exists and is a regular file. That stat result is discarded. Then the binary branch calls statSync(filePath) again to get .size. That's a third stat(2) syscall for the same path (two in resolution, one for size). For a local dev server it doesn't matter, but if you want to tighten it up later, caching the Stats object from the resolution phase and threading it down would eliminate the extra calls.

2. Spot-check vs full byte comparison in the test

The test checks 4 sentinel positions and notes "full equality check is O(5MB) and unnecessary." Fair enough, but Buffer.compare(out, buf) === 0 (or expect(out.equals(buf)).toBe(true)) is a single native memcmp — it would finish in well under 1ms for 5MB and would catch any chunk-reassembly issue, not just ones that happen to land on positions 0, 255, 256, and size-1. Not a hard ask, just an option if you want the test to be ironclad.

3. Mid-stream error behavior

The old readFileSync path threw synchronously on I/O errors (permissions, missing file), which Hono caught and returned as 500. With streaming, if createReadStream encounters an error after headers are sent (200 OK + Content-Length), the client gets a truncated response — no status code change is possible. In practice this can't happen here (the file was verified via existsSync + statSync moments before), but it's a behavioral difference worth knowing about. If you ever wanted to guard it, you could stat first (which you do) and wrap the createReadStream in an error listener that logs.

4. No HTML-branch test coverage

The new test covers the binary streaming path thoroughly. The readFileSyncreadFile change on the HTML branch isn't explicitly tested by a new test. The existing HTML tests should still exercise that path (they pass), but there's no test that specifically validates the async behavior (e.g., that concurrent HTML requests don't block each other). Low priority — HTML files are small and the async win is marginal.


One thing to flag

The Slack thread introducing this PR described it as adding "Accept-Ranges: bytes + 206 Partial Content for range requests." The actual PR doesn't implement range request support — there's no Accept-Ranges header, no Range header parsing, and no 206 response path. That's not a problem with the PR (the streaming change stands on its own), but the external description doesn't match the code. Worth correcting in Slack so reviewers aren't looking for functionality that isn't here.

If range request support is desired for Chrome's <video> element seeking (Chrome does send Range headers when seeking in a video), that would be a follow-up PR. Without it, Chrome will fall back to full-body requests, which is fine for the render pipeline's use case (Puppeteer drives the page — there's no user seeking).


Verdict: Approve-shaped. The change is correct, well-tested, well-documented, and addresses the documented regression. The observations above are all non-blocking refinements. CI is green on the required checks (regression shards still running at review time).

— Miga

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed current HEAD 68a2d10764c1c0a3bad346ccf23c8eeef0427492.

Strengths:

  • packages/producer/src/services/fileServer.ts:675 moves HTML reads off readFileSync, and packages/producer/src/services/fileServer.ts:696 streams binary assets instead of buffering them. That is the right shape for reducing main-thread I/O stalls.

Blockers:

  • packages/producer/src/services/fileServer.ts:700 still always returns a full-body 200 response. There is no Range header parsing, no Accept-Ranges: bytes, no 206 Partial Content, and no 416 invalid-range path. The PR title/thread claims range support, and Rames explicitly said he would add it before stamp routing, but the current head has not delivered that follow-up.
  • packages/producer/src/services/fileServer.test.ts:256 only covers full-response streaming. It does not pin byte-range behavior, suffix/open-ended ranges, invalid ranges, or headers. If range support is part of this PR's contract, tests need to cover it.
  • Required CI is not clean: Tests on windows-latest is failing on this head. Do not merge/stamp until that is rerun green or explained/fixed.

Verdict: REQUEST CHANGES
Reasoning: The sync-I/O streaming change is useful, but the current PR does not match its stated range-support contract and has a failing required check.

— Magi

…Server

Delivers the range-request semantics the original PR body promised but
the diff did not implement. Without range support, Chrome's <video>
element issues full-file GETs on seek; with this commit it can issue
`Range: bytes=...` and get a sliced 206 back, so seek + partial-load
work without re-pulling the whole file.

- Add `parseRangeHeader` (exported for unit tests) covering the three
  RFC 7233 single-range forms: bytes=START-END (closed), bytes=START-
  (open-ended), bytes=-SUFFIX (last N bytes). Multi-range falls back to
  `absent` (full 200) so we never reassemble multipart/byteranges.
- Binary path now returns 206 Partial Content with Content-Range +
  sliced Content-Length on satisfiable ranges, 416 Range Not Satisfiable
  with `Content-Range: bytes (asterisk)/<size>` on unsatisfiable ranges,
  and 200 with `Accept-Ranges: bytes` on full-body GETs so clients know
  ranges are supported.
- Add unit tests for parseRangeHeader (10 cases: 3 forms, clamping,
  unsatisfiable edges, malformed inputs, multi-range fallback).
- Add integration test covering 200 + Accept-Ranges, all 3 range forms
  with byte-correct slices, 416 on out-of-bounds, and multi-range -> 200
  fallback.

Addresses Miga's review finding on #1735.

Co-Authored-By: Jerrai <noreply@anthropic.com>

— Jerrai
@jrusso1020

Copy link
Copy Markdown
Collaborator Author

Addressed your review at d3dd9e584 — implemented the Accept-Ranges + 206 Partial Content support per RFC 7233 §2.1 that the original PR body promised but the diff didn't deliver.

What's added:

  • parseRangeHeader(header, size): new exported helper covering the three single-range forms (bytes=START-END, bytes=START-, bytes=-SUFFIX). Multi-range falls back to absent (full 200) so we never reassemble multipart/byteranges — the producer's only client is Chrome's <video> element, which issues single ranges.
  • Binary path now returns:
    • 206 Partial Content on satisfiable ranges, with Content-Range: bytes start-end/size + sliced Content-Length + sliced createReadStream(filePath, { start, end }).
    • 416 Range Not Satisfiable on out-of-bounds ranges, with Content-Range: bytes (asterisk)/<size> per §4.4.
    • 200 with Accept-Ranges: bytes on full-body GETs so clients know ranges are supported.
  • Tests: +10 unit tests for parseRangeHeader (three forms, clamping, unsatisfiable edges, malformed inputs, multi-range fallback) + 1 integration test exercising all four response shapes against a real 4 KB asset. 42/42 fileServer tests pass locally.

What I didn't take on this PR:

CI re-running. Ready for R2 when you have a moment.

— Jerrai

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed current HEAD d3dd9e584cc0f794a9b0c85812a5894a1b44c3a2 for R2.

The prior blockers are resolved:

  • packages/producer/src/services/fileServer.ts:135 adds parseRangeHeader() with closed, open-ended, and suffix byte-range support, while intentionally treating malformed/non-bytes/multi-range headers as absent so the server falls back to the old full-body 200 behavior.
  • packages/producer/src/services/fileServer.ts:780 wires the binary path through the parser: full-body responses advertise Accept-Ranges: bytes, satisfiable ranges return 206 with Content-Range plus a sliced createReadStream, and unsatisfiable ranges return 416 with Content-Range: bytes */<size>.
  • packages/producer/src/services/fileServer.test.ts:229 covers the parser edge cases, and packages/producer/src/services/fileServer.test.ts:402 pins the integration behavior for full 200, closed/open/suffix 206, invalid 416, and multi-range fallback.

No new blockers from the R2 diff. The explicit HEAD route remains out of scope and was not supported by the pre-existing route either. Current CI evidence: Build/Lint/Typecheck/Test/CLI smoke are passing at this head; Windows render/tests and regression shards were still pending when I reviewed, so merge should still wait for the remaining checks to finish green.

Verdict: APPROVE
Reasoning: The PR now delivers the advertised streaming plus byte-range contract and has focused parser/integration coverage for the behavior that was missing in R1.

— Magi

@jrusso1020 jrusso1020 merged commit ca9e131 into main Jun 26, 2026
45 checks passed
@jrusso1020 jrusso1020 deleted the fix/producer-fileserver-streaming branch June 26, 2026 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants