Move to nanobind #522

Merged: evertlammerts merged 50 commits into duckdb:main from evertlammerts:prototype/nanobind-cutover on Jul 1, 2026
@evertlammerts (Member)

No description provided.

…ilding)

Build-system integration WORKS (CMake configure passes): find_package(Python)+nanobind,
nanobind_build_library(nanobind-static) feeding the object libs, nanobind_add_module NB_STATIC;
pyproject build dep pybind11->nanobind. Umbrella (pybind_wrapper.hpp) + enum caster macro +
identifier caster ported to nanobind from_python/from_cpp API; mechanical renames applied
(NB_MODULE, python_error, borrow/steal, def_prop_ro, namespace py = nanobind).

First build surfaced 254 errors; keystone fixes bring it to 224, cascade cleared. Remaining work
concentrated: numpy nb::ndarray port (~122), arrow_array_stream (59), py:: API diffs in
python_objects/relation/result/connection headers (~60), object wrappers in dataframe.hpp (12),
optional/pyconnection_default casters, register_exception, py::options, init_implicit, 81 .none().

Cleared categorically: identifier+enum casters, object wrappers (borrow_t ctors,
handle_type_name blocks removed), module_::import_, py::module_, namespace py = nanobind.
Build system still green (configure passes). Remaining concentrated in: numpy nb::ndarray
port (py::dtype has no nanobind equiv -> reroute via numpy.empty + nb::ndarray; touches
callers, not just the facade), ~150 scattered py:: API diffs (py::str->string, handle/object
nuances) across connection/relation/result/expression, optional/pyconnection_default casters,
register_exception->nb::exception, init_implicit, py::options.

numpy DONE: NumpyArray facade ported off py::array/py::dtype (cold-path ctypes.data buffer
access, dtype-as-string Allocate via numpy.empty, in-place resize) -- move-faithful, no copies.
Converted 15 .cast<>() method calls -> py::cast<>(), py::ssize_t->Py_ssize_t,
py::function->py::callable, dropped py::options. numpy_array.hpp + arrow_array_stream.hpp now
compile. Remaining: per-site py:: tail (~25 functional-cast string(obj)->py::cast, ~36
missing-member, move/ref bindings) across 12 files + pybind_wrapper.cpp impl + pyconnection_default
caster.
…ault caster retirement, bulk str/int/type-of/cast conversions
…, type_object, capsule.data, len, more conversions
…ssion), dict/list builds, bytes; numpy buffer-pointer caching (perf)
…t type-punning) in dataframe/scan/bind/map/udf
…or implicit conversions); guard numpy ctypes eager-compute
…o PyObject_Str runs) across numpy/pandas/udf/replacement paths
…ls crash cascade)

Add a custom type_caster<shared_ptr<DuckDBPyExpression>> (mirrors the DuckDBPyType one): keep
cast_flags::convert so the registered implicit conversions (str->column, scalar->constant) fire
for shared_ptr args, and when the inner caster yields no instance, construct through the registered
Python ctor (None->NULL constant) -- a real owned object, no dangling -- with PyErr_Clear() on
failure. Allow None on the Expression object-ctor (py::arg.none()). The PyErr_Clear is what
eliminates the stale-PyErr segfault CASCADE: the full fast suite now runs clean in parallel
(0 crashes, was unmeasurable). Failures 86 -> 66; expression/spark Expression cluster resolved
(spark 6->3). Belt-and-suspenders None guard in CreateCompareExpression/Coalesce.
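
The conversion rules named in this commit message (str->column, scalar->constant, None->NULL constant) can be modelled roughly in plain Python. This is only an illustrative sketch: `Expr` and `to_expr` are hypothetical stand-ins, not DuckDB classes, the set of accepted scalar types is an assumption, and the real logic lives in the C++ type_caster, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for an expression object (not DuckDB's class)."""
    kind: str   # "column" or "constant"
    value: Any


def to_expr(arg: Any) -> Expr:
    """Model of the implicit conversions described in the commit message."""
    if isinstance(arg, Expr):
        return arg                      # already an expression: pass through
    if arg is None:
        return Expr("constant", None)   # None -> NULL constant
    if isinstance(arg, str):
        return Expr("column", arg)      # str -> column reference
    if isinstance(arg, (bool, int, float)):
        return Expr("constant", arg)    # scalar -> constant (assumed scalar types)
    # Fail loudly instead of leaving a half-converted value behind.
    raise TypeError(f"cannot convert {type(arg).__name__} to an expression")


print(to_expr("a"))
print(to_expr(None))
print(to_expr(42))
```

In this model a failed conversion raises immediately. The C++ caster instead has to clear the Python error indicator itself when a fallback attempt fails, which is the role the commit assigns to PyErr_Clear().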
@evertlammerts force-pushed the prototype/nanobind-cutover branch from b9929c1 to 2983c92 on June 30, 2026 18:36
The NumpyArray facade read the buffer pointer via numpy's `ctypes.data`
attribute chain and allocated via `numpy.empty(count, dtype_string)`. For a
top-level column that runs once per 2048-row chunk (amortized), but the
LIST/ARRAY per-element converter allocates a fresh array per row, so at 200k
rows it became ~600k ctypes-object allocations: df()/fetchnumpy() of a LIST
column ran ~6x slower than the pybind11 baseline (829ms vs 136ms).
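
The figures in the paragraph above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted there; reading "~600k allocations at 200k rows" as a per-row count is an interpretation, not something the commit states.

```python
import math

ROWS = 200_000            # rows in the LIST-column benchmark
CHUNK_ROWS = 2048         # rows per chunk on the top-level column path
CTYPES_ALLOCS = 600_000   # approximate ctypes-object allocations reported
SLOW_MS = 829             # LIST df()/fetchnumpy() before the fix
BASELINE_MS = 136         # pybind11 baseline

# A top-level column pays the cost once per chunk.
chunk_calls = math.ceil(ROWS / CHUNK_ROWS)

# The per-element LIST converter pays it once per row.
allocs_per_row = CTYPES_ALLOCS / ROWS

# Observed slowdown relative to the baseline.
slowdown = SLOW_MS / BASELINE_MS

print(chunk_calls)         # 98
print(allocs_per_row)      # 3.0
print(round(slowdown, 1))  # 6.1
```

So the amortized path touches the slow attribute chain about 98 times, while the per-row path touches it 200,000 times at roughly three temporary objects each.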

Read the buffer pointer directly from numpy's PyArrayObject C struct (a plain
field read, as pybind11's array.data() did), gated by a PyObject_TypeCheck
against numpy.ndarray so non-ndarray wrappers are never reinterpreted. Cache the
numpy.empty callable and per-dtype np.dtype objects, and skip the no-op
resize-to-current-length on the per-element path.
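
The caching part of this fix follows a common pattern: build each per-dtype object once and reuse it on every later call. Below is a minimal Python sketch of that pattern only. `PerKeyCache` and `fake_dtype` are hypothetical names, `fake_dtype` stands in for constructing an np.dtype, and the actual cache is implemented in C++ and may be structured differently.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class PerKeyCache(Generic[T]):
    """Builds one object per key on first use, then returns the same object."""

    def __init__(self, factory: Callable[[str], T]) -> None:
        self._factory = factory
        self._items: Dict[str, T] = {}
        self.builds = 0  # how many times the factory actually ran

    def get(self, key: str) -> T:
        if key not in self._items:
            self._items[key] = self._factory(key)
            self.builds += 1
        return self._items[key]


def fake_dtype(name: str) -> tuple:
    """Stand-in for an expensive dtype construction (assumption for the demo)."""
    return ("dtype", name)


cache = PerKeyCache(fake_dtype)
for _ in range(200_000):       # one lookup per row, as on the per-element path
    cache.get("int64")
cache.get("float64")

print(cache.builds)  # 2
```

With the cache in place, the number of constructions depends on how many distinct dtypes appear, not on the row count.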

Output is byte-identical (lists, nested, nulls, empty, masked, large-N); the row
and arrow paths and the int/double/struct columnar paths are unaffected. LIST
df()/fetchnumpy() now match-or-beat the pybind11 baseline (69ms).
@evertlammerts marked this pull request as draft on July 1, 2026 08:57
@evertlammerts marked this pull request as ready for review on July 1, 2026 08:57
@evertlammerts merged commit d7e138f into duckdb:main on Jul 1, 2026
16 checks passed