buffer: increase Buffer.poolSize default to 64 KiB #63597
The 8 KiB default has been unchanged since 2015. With the threshold check `size < (Buffer.poolSize >>> 1)`, this means allocations of 4 KiB or larger bypass the pool entirely, including 4 KiB itself, a common page and HTTP-frame size. Raising the default to 64 KiB extends pool coverage to ~32 KiB allocations, capturing common sizes used by HTTP parsers, stream chunks, and small file reads.

Throughput improvements on workers-k=8 fs.readFileSync benchmarks (Linux/glibc) at the affected sizes, with no regressions elsewhere:

file size  | 8 KiB pool   | 64 KiB pool   | delta
-----------+--------------+---------------+-------
4 KiB      | 326k ops/s   | 360k ops/s    | +10%
8 KiB      | 202k ops/s   | 254k ops/s    | +26%
16 KiB     | 148k ops/s   | 181k ops/s    | +23%
64 KiB     | 86k ops/s    | 87k ops/s     | ~
1 MiB      | 12k ops/s    | 13k ops/s     | ~

Cost: +56 KiB RSS per realm at startup.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
567a29e to 0dd37aa
Yes!
SGTM. I'm wondering how close the performance at 32 KiB is to 64 KiB, and whether larger allocations are actually harmful here.
There is little to no drawback to increasing this up to 256 KiB in my tests, but going from 8 KiB to 256 KiB seemed to jump over a few too many steps.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #63597 +/- ##
=======================================
Coverage 90.32% 90.33%
=======================================
Files 730 730
Lines 234669 234669
Branches 43948 43953 +5
=======================================
+ Hits 211967 211987 +20
+ Misses 14430 14417 -13
+ Partials 8272 8265 -7
LiviaMedeiros left a comment
64 KiB sounds reasonable.
The value is adjustable, so users who know they have enough RAM and benefit from a bigger pool can raise it on their end. It may be worth adding an example to the docs as a follow-up; I think something like `--import 'data:text/javascript,Buffer.poolSize=0x100000;'` should do the trick for them.
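As a rough illustration of the override suggested above, the sketch below sets `Buffer.poolSize` the way such a preload would and then inspects a mid-sized allocation. The 1 MiB value mirrors the `0x100000` in the comment; the 256 KiB request size, the `app.js` name, and the variable names are arbitrary choices for illustration. Reading `buf.buffer.byteLength` to infer pooling relies on observed Node.js behavior, not on a documented guarantee.

```javascript
// Sketch: the effect of a preload such as
//   node --import 'data:text/javascript,Buffer.poolSize=0x100000;' app.js
// (app.js is a placeholder name for the application entry point.)
const ONE_MIB = 0x100000;
Buffer.poolSize = ONE_MIB;

// Requests below half the pool size are eligible for pooling.
const requestSize = 256 * 1024; // arbitrary example size, below ONE_MIB / 2
const buf = Buffer.allocUnsafe(requestSize);

// A pooled Buffer is a view onto a larger shared ArrayBuffer, so its
// backing store is expected to be at least as large as the Buffer itself.
const backingBytes = buf.buffer.byteLength;
console.log({ poolSize: Buffer.poolSize, length: buf.length, backingBytes });
```

If `backingBytes` comes out larger than `length`, the allocation was most likely sliced from the pool instead of getting its own `ArrayBuffer`.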
The `Buffer.poolSize` default of 8 KiB has been unchanged since May 2015. Two issues with it today:

1. The 4 KiB cliff. The pool check is `size < (Buffer.poolSize >>> 1)`, so with the 8 KiB default the threshold is 4 KiB and the strict inequality means a 4 KiB allocation itself bypasses the pool. The current default helps allocations from 1 B to 4095 B and abruptly stops at the page-aligned and HTTP-frame-sized boundary where many real allocations land.
2. Stale in 2025. Predates HTTP/2 (16 KiB-1 MiB frame sizes), modern stream chunk sizes, and ~10× growth in typical RAM. Many `Buffer.allocUnsafe` calls in core (`fs.readFileSync` for non-utf8 reads, HTTP parser, stream chunkers) sit in the 4-64 KiB range and miss the pool.

This PR raises the default to 64 KiB, extending pool coverage to allocations up to ~32 KiB.
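The cliff described above can be reproduced in a few lines. This is a minimal sketch: `usesPool` is a hypothetical helper that restates the quoted check, plus the assumption that zero-length requests never touch the pool. It is not the function in `lib/buffer.js`.

```javascript
// Hypothetical helper restating the pool check quoted above.
// Assumption: zero-length requests never use the pool.
function usesPool(size, poolSize) {
  return size > 0 && size < (poolSize >>> 1);
}

const KiB = 1024;
const sizes = [4095, 4 * KiB, 16 * KiB, 32 * KiB - 1, 32 * KiB];

// Compare the old and new defaults for each request size.
for (const size of sizes) {
  console.log(
    `${size} B: 8 KiB pool -> ${usesPool(size, 8 * KiB)}, ` +
    `64 KiB pool -> ${usesPool(size, 64 * KiB)}`
  );
}
```

Under the 8 KiB default only the 4095 B request qualifies. Under 64 KiB every listed size except 32 KiB itself does, because the strict inequality excludes exactly half the pool size.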
Evidence
Benchmark setup: 8 Worker threads in one process, each looping `fs.readFileSync(file)` on files of various sizes, measuring throughput. Linux 6.8, glibc 2.39, i7-7700, Node main (built locally).

Wins where it matters (4-32 KiB), no regressions at small or large sizes.
Why workers benefit more than single-threaded
`Buffer.allocUnsafe(size)` for `size ≥ Buffer.poolSize/2` falls through to fresh V8 ArrayBuffer allocations, which land on glibc `malloc`. When multiple Worker threads do this concurrently they contend on the per-`mm_struct` `mmap_lock` write lock (every arena growth takes it). Confirmed by `bpftrace`: workers wait ~17× longer per `mmap_lock` acquisition than equivalent child processes on the same workload, with cumulative wait dropping ~2.5× when the pool covers the allocation size. Full investigation: https://github.com/platformatic/node-worker-mmap-lock-contention.

Single-threaded apps benefit too, just less dramatically: fewer allocator round-trips for medium-sized buffers.
Cost
+56 KiB RSS per realm at startup (one 64 KiB pool per realm; main thread + each Worker thread). On a typical app with 1 main + 4 Workers that's +280 KiB, trivial on modern hardware. The pool occupies RSS for the lifetime of any sliced Buffer that's still referenced, so peak RSS may grow modestly in apps that hold many small Buffers; same shape as today, just larger granularity.
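The figures above follow from simple arithmetic, assuming the startup delta per realm is just the new pool size minus the old one. `extraRssKiB` is a hypothetical helper written for this note, not part of Node.js.

```javascript
// Assumption: each realm (main thread or Worker) holds exactly one pool,
// so the startup delta per realm is newPoolKiB - oldPoolKiB.
function extraRssKiB(realms, oldPoolKiB = 8, newPoolKiB = 64) {
  return realms * (newPoolKiB - oldPoolKiB);
}

console.log(extraRssKiB(1)); // one realm
console.log(extraRssKiB(5)); // 1 main thread + 4 Workers
```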
Why 64 KiB and not larger
Tested 32 / 64 / 128 / 256 KiB; 64 KiB is the smallest pool that captures the 4-16 KiB hot zone where the current default fails. 128/256 KiB give additional wins on 16-64 KiB allocations (+64% at 16 KiB with a 256 KiB pool) but the marginal RSS cost is harder to justify as a default. Easy follow-up if there's appetite.
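One way to see why a 32 KiB pool falls short of the 4-16 KiB range: because the check is a strict inequality, a pool only covers requests strictly smaller than half its size. The sketch below uses a hypothetical `smallestPoolFor` helper, and assumes the candidates are the four sizes listed as tested, to find the first candidate that covers a given request.

```javascript
const KiB = 1024;

// Candidate defaults, taken from the sizes listed as tested in this PR.
const candidates = [32 * KiB, 64 * KiB, 128 * KiB, 256 * KiB];

// Hypothetical helper: first candidate whose threshold (poolSize >>> 1)
// is strictly greater than the request, or undefined if none qualifies.
function smallestPoolFor(requestSize, pools = candidates) {
  return pools.find((poolSize) => requestSize < (poolSize >>> 1));
}

console.log(smallestPoolFor(16 * KiB) / KiB); // pool size in KiB
console.log(smallestPoolFor(64 * KiB) / KiB);
```

A 16 KiB request is not covered by the 32 KiB candidate, whose threshold is exactly 16 KiB, so the first match is 64 KiB.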
Pool refills at 64 KiB stay under glibc's `M_MMAP_THRESHOLD` (128 KiB default), so each refill uses the heap rather than triggering a fresh `mmap`; that boundary is part of why 64 KiB is the right ceiling for a default.
Notes
- `doc/api/buffer.md` (default value + the "4 KiB" example reference).
- `Buffer.poolSize` setter machinery is unchanged.

cc @nodejs/buffer @nodejs/performance