Merged
78 commits
342e4ba
Bump submodule
evertlammerts Feb 27, 2026
d33b693
enable caching
evertlammerts Feb 27, 2026
f291d17
Merge branch 'v1.5-variegata'
evertlammerts Mar 9, 2026
716ce8a
Fix main
evertlammerts Mar 11, 2026
1f54f8c
Catch exceptions in OPTIONAL filters in Arrow scans
evertlammerts Mar 18, 2026
2326ce2
bump submodule
evertlammerts Mar 18, 2026
5a5c16d
fix FS tests
evertlammerts Mar 18, 2026
74f4617
Re-enable nightlies
evertlammerts Mar 18, 2026
3ecd608
Merge with v1.5-andium (#351)
evertlammerts Mar 18, 2026
05292bf
Fix main
evertlammerts Mar 20, 2026
9abfe76
Bump submodule
duckdblabs-bot Mar 21, 2026
fb10ef5
[duckdb-labs bot] Bump DuckDB submodule (#396)
evertlammerts Mar 21, 2026
25fb0d9
Fix physical operator and bump submodule
evertlammerts Mar 24, 2026
f68a338
bump submodule
evertlammerts Mar 30, 2026
5c2a7f7
Add CLAUDE.md
evertlammerts Apr 9, 2026
299e61f
pin duckdb at may 10 nightly
evertlammerts May 11, 2026
291f62c
fix compilation
evertlammerts May 11, 2026
da44e40
Merge remote-tracking branch 'upstream/v1.5-variegata' into merge_var…
evertlammerts May 11, 2026
4ab671e
arrow pushdown fixes
evertlammerts May 12, 2026
276028b
rework arrow pushdown tests
evertlammerts May 12, 2026
1852dfd
Unify filter pushdown across arrow and polars
evertlammerts May 12, 2026
724671f
Merge variegata into main pt 1 (#453)
evertlammerts May 12, 2026
abb9332
Fix validity masks in Arrow UDF
evertlammerts May 12, 2026
f1ee06f
Restore arrow-UDF null filtering for rows where some columns are null…
evertlammerts May 13, 2026
2abc717
fix .clangd
evertlammerts May 19, 2026
ab63b5f
Only disable unity builds for editable installs on OSX
evertlammerts May 19, 2026
b365cb9
Merge remote-tracking branch 'upstream/v1.5-variegata' into main_prep
evertlammerts Jun 11, 2026
620d84c
main fixes
evertlammerts Jun 11, 2026
6b23153
finish expression filter integration
evertlammerts Jun 12, 2026
aa511e3
add timestamp_tz_ns support
evertlammerts Jun 12, 2026
ecbebcb
add timestamp_tz_ns support to the pyspark module
evertlammerts Jun 12, 2026
55155d7
fix whitespace-only expression bug
evertlammerts Jun 12, 2026
5b9dbbf
Integrated with Identifier and MaxLogicalType, and using SetChildCard…
evertlammerts Jun 16, 2026
849119d
more fixes
evertlammerts Jun 16, 2026
4d7e6ff
added sqlnull support and fixed explain
evertlammerts Jun 16, 2026
0911fa4
fixed some more test failures
evertlammerts Jun 16, 2026
289a9e3
fix pybind type casters for enums
evertlammerts Jun 17, 2026
0ff9762
fix identifier <-> name conversion
evertlammerts Jun 17, 2026
76175ee
fix profiler test
evertlammerts Jun 17, 2026
249f44f
xfail query graph rendering test
evertlammerts Jun 17, 2026
e2aad13
adapt adbc tests to stricter default TransactionInvalidationPolicy
evertlammerts Jun 17, 2026
6d1c749
fix spark tests
evertlammerts Jun 17, 2026
c7f7f9a
fix test errors
evertlammerts Jun 17, 2026
45e1670
Merge v1.5-variegata and integrate the latest DuckDB main (#494)
evertlammerts Jun 17, 2026
42b6596
empty structs are supported
evertlammerts Jun 18, 2026
ef20531
pin torch
evertlammerts Jun 18, 2026
bceb3a4
bump submodule
evertlammerts Jun 19, 2026
2f3a245
fix failing ci
evertlammerts Jun 19, 2026
8a0903e
Bump submodule
duckdblabs-bot Jun 21, 2026
a1eeaa6
[duckdb-labs bot] Bump DuckDB submodule (#500)
evertlammerts Jun 21, 2026
b8d15ab
fix numpy deprecation errors
evertlammerts Jun 22, 2026
768ddf2
bump python version for mypy checks
evertlammerts Jun 22, 2026
91c7c49
one more
evertlammerts Jun 22, 2026
9033322
Fix numpy deprecation errors (#502)
evertlammerts Jun 22, 2026
82466f2
Bump submodule
duckdblabs-bot Jun 22, 2026
2f0561a
[duckdb-labs bot] Bump DuckDB submodule (#504)
evertlammerts Jun 22, 2026
5351efe
bump submodule to June 25 nightly
evertlammerts Jun 25, 2026
f4e3786
fix variant import
evertlammerts Jun 25, 2026
01cf1f1
Fix import and bump submodule (#506)
evertlammerts Jun 25, 2026
9e2e5a2
Merge branch 'v1.5-variegata' into main
evertlammerts Jun 25, 2026
4c63e39
bump submodule to June 26 nightly
evertlammerts Jun 26, 2026
3d73752
QualifiedName and ProfilerPrintFormat
evertlammerts Jun 26, 2026
56c26cc
Merge/v1.5 variegata into main (#507)
evertlammerts Jun 26, 2026
bb06fdd
Remove internal duckdb::*_ptr usage
evertlammerts Jun 26, 2026
501a5fc
Fix dev distance versioning on main
evertlammerts Jun 26, 2026
467b48e
Drop pybind11 module_local() from bound-type registrations
evertlammerts Jun 26, 2026
636d500
Remove internal duckdb::*_ptr usage (#509)
evertlammerts Jun 26, 2026
3f89994
Introduce NumpyArray façade over py::array
evertlammerts Jun 26, 2026
7c26386
Consolidate enum casters into reusable macro
evertlammerts Jun 26, 2026
680cb27
Fix dev distance versioning on main (#510)
evertlammerts Jun 26, 2026
b27d05b
Extract DuckDBPyConnection module state into DuckDBPyModuleState
evertlammerts Jun 26, 2026
676a023
Drop pybind11 module_local() from bound-type registrations (#512)
evertlammerts Jun 26, 2026
5a7cb18
Introduce NumpyArray façade over py::array (#513)
evertlammerts Jun 26, 2026
4542c30
Extract DuckDBPyConnection module state into DuckDBPyModuleState (#515)
evertlammerts Jun 26, 2026
664d7e4
Consolidate enum casters into reusable macro (#514)
evertlammerts Jun 26, 2026
1b70332
Close #468
leostimpfle Jun 30, 2026
5761376
Add test for query chaining with alias
leostimpfle Jun 30, 2026
44c45e2
pin submodule on v1.5
evertlammerts Jul 1, 2026
23 changes: 23 additions & 0 deletions .github/actions/ccache-action/action.yml
@@ -0,0 +1,23 @@
name: 'Setup Ccache'
inputs:
key:
description: 'Cache key (defaults to github.job)'
required: false
default: ''
runs:
using: "composite"
steps:
- name: Setup Ccache
uses: hendrikmuhs/ccache-action@main
with:
key: ${{ inputs.key || github.job }}
save: ${{ github.repository != 'duckdb/duckdb-python' || contains('["refs/heads/main", "refs/heads/v1.4-andium", "refs/heads/v1.5-variegata"]', github.ref) }}
# Dump verbose ccache statistics report at end of CI job.
verbose: 1
# Per-directory limit: 1500 MB / 16 ≈ 94 MB.
# Note: `layout=subdirs` computes the size limit divided by 16 dirs.
# See also: https://ccache.dev/manual/4.9.html#_cache_size_management
max-size: 1500MB
# Evicts all cache files that were not touched during the job run.
# Removing cache files from previous runs avoids creating huge caches.
evict-old-files: 'job'
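The `save` expression in this action decides when a job may write the cache back. A minimal Python sketch of that predicate follows, assuming GitHub's `contains()` on a string performs a case-insensitive substring search. `should_save_cache` is a hypothetical helper name used only for illustration; it is not part of the action.

```python
# Hypothetical mirror of the `save:` expression in action.yml.
PROTECTED_REFS = '["refs/heads/main", "refs/heads/v1.4-andium", "refs/heads/v1.5-variegata"]'


def should_save_cache(repository: str, ref: str) -> bool:
    """Forks always save; the upstream repo saves only from the listed branches."""
    if repository != "duckdb/duckdb-python":
        return True
    # Substring search on the literal string, lower-cased on both sides
    # (assumption: this matches how contains() treats a string argument).
    return ref.lower() in PROTECTED_REFS.lower()


print(should_save_cache("duckdb/duckdb-python", "refs/heads/main"))        # True
print(should_save_cache("duckdb/duckdb-python", "refs/heads/feature-x"))   # False
print(should_save_cache("someone/duckdb-python", "refs/heads/feature-x"))  # True
```

Because the check is a substring search, a partial ref such as `refs/heads/mai` would also pass; that is a property of the string form of the check, not something this sketch adds.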
127 changes: 119 additions & 8 deletions .github/workflows/packaging_wheels.yml
@@ -25,15 +25,112 @@ on:
type: string

jobs:
seed_wheels:
name: 'Seed: cp314-${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}'
strategy:
fail-fast: false
matrix:
python: [ cp314 ]
platform:
- { os: windows-2022, arch: amd64, cibw_system: win }
- { os: windows-11-arm, arch: ARM64, cibw_system: win }
- { os: ubuntu-24.04, arch: x86_64, cibw_system: manylinux }
- { os: ubuntu-24.04-arm, arch: aarch64, cibw_system: manylinux }
- { os: macos-15, arch: arm64, cibw_system: macosx }
- { os: macos-15, arch: universal2, cibw_system: macosx }
- { os: macos-15-intel, arch: x86_64, cibw_system: macosx }
minimal:
- ${{ inputs.minimal }}
exclude:
- { minimal: true, platform: { arch: universal2 } }
runs-on: ${{ matrix.platform.os }}
env:
CCACHE_DIR: ${{ github.workspace }}/.ccache
### cibuildwheel configuration
#
# This is somewhat brittle, so be careful with changes. Some notes for our future selves (and others):
# - cibw will change its cwd to a temp dir and create a separate venv for testing. It then installs the wheel it
# built into that venv, and runs the CIBW_TEST_COMMAND. We have to install all dependencies ourselves, and make
# sure that the pytest config in pyproject.toml is available.
# - CIBW_BEFORE_TEST installs the test dependencies by exporting them into a pylock.toml. At the time of writing,
# `uv sync --no-install-project` had problems correctly resolving dependencies using resolution environments
# across all platforms we build for. This might be solved in newer uv versions.
# - CIBW_TEST_COMMAND specifies pytest conf from pyproject.toml. --confcutdir is needed to prevent pytest from
# traversing the full filesystem, which produces an error on Windows.
# - CIBW_TEST_SKIP: we always skip tests for *-macosx_universal2 builds, because we run tests for arm64 and x86_64.
CIBW_TEST_SKIP: ${{ inputs.testsuite == 'none' && '*' || '*-macosx_universal2' }}
CIBW_TEST_SOURCES: tests
CIBW_BEFORE_TEST: >
uv export --only-group test --no-emit-project --quiet --output-file pylock.toml --directory {project} &&
uv pip install -r pylock.toml
CIBW_TEST_COMMAND: >
uv run -v pytest --confcutdir=. --rootdir . -c {project}/pyproject.toml ${{ inputs.testsuite == 'fast' && './tests/fast' || './tests' }}

steps:
- name: Checkout DuckDB Python
uses: actions/checkout@v4
with:
ref: ${{ inputs.duckdb-python-sha }}
fetch-depth: 0
submodules: true

- name: Checkout DuckDB
shell: bash
if: ${{ inputs.duckdb-sha }}
run: |
cd external/duckdb
git fetch origin
git checkout ${{ inputs.duckdb-sha }}

- name: Set CIBW_ENVIRONMENT
shell: bash
run: |
cibw_env=""
if [[ "${{ matrix.platform.cibw_system }}" == "manylinux" ]]; then
cibw_env="CCACHE_DIR=/host${{ github.workspace }}/.ccache"
fi
if [[ -n "${{ inputs.set-version }}" ]]; then
cibw_env="${cibw_env:+$cibw_env }OVERRIDE_GIT_DESCRIBE=${{ inputs.set-version }}"
fi
if [[ -n "$cibw_env" ]]; then
echo "CIBW_ENVIRONMENT=${cibw_env}" >> $GITHUB_ENV
fi

- name: Setup Ccache
uses: ./.github/actions/ccache-action
with:
key: ${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}

# Install Astral UV, which will be used as build-frontend for cibuildwheel
- uses: astral-sh/setup-uv@v7
with:
version: "0.9.0"
enable-cache: false
cache-suffix: -${{ matrix.python }}-${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}

- name: Build${{ inputs.testsuite != 'none' && ' and test ' || ' ' }}wheels
uses: pypa/cibuildwheel@v3.2
env:
CIBW_ARCHS: ${{ matrix.platform.arch == 'amd64' && 'AMD64' || matrix.platform.arch }}
CIBW_BUILD: ${{ matrix.python }}-${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}

- name: Upload wheel
uses: actions/upload-artifact@v4
with:
name: wheel-${{ matrix.python }}-${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}
path: wheelhouse/*.whl
compression-level: 0
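The `seed_wheels` matrix above expands to one job per platform, minus any combination hit by `exclude`. A rough Python sketch of that expansion follows. It assumes an `exclude` entry matches a job when every key it names equals the job's value, with nested maps compared key by key. `expand_seed_matrix` and `matches` are hypothetical names.

```python
from itertools import product

PLATFORMS = [
    {"os": "windows-2022", "arch": "amd64", "cibw_system": "win"},
    {"os": "windows-11-arm", "arch": "ARM64", "cibw_system": "win"},
    {"os": "ubuntu-24.04", "arch": "x86_64", "cibw_system": "manylinux"},
    {"os": "ubuntu-24.04-arm", "arch": "aarch64", "cibw_system": "manylinux"},
    {"os": "macos-15", "arch": "arm64", "cibw_system": "macosx"},
    {"os": "macos-15", "arch": "universal2", "cibw_system": "macosx"},
    {"os": "macos-15-intel", "arch": "x86_64", "cibw_system": "macosx"},
]
EXCLUDE = [{"minimal": True, "platform": {"arch": "universal2"}}]


def matches(job: dict, pattern: dict) -> bool:
    """True if every key in pattern is present in job with an equal (or nested-matching) value."""
    for key, want in pattern.items():
        have = job.get(key)
        if isinstance(want, dict):
            if not isinstance(have, dict) or not matches(have, want):
                return False
        elif have != want:
            return False
    return True


def expand_seed_matrix(minimal: bool) -> list:
    """Build the python x platform product, then drop excluded combinations."""
    jobs = [
        {"python": py, "platform": plat, "minimal": minimal}
        for py, plat in product(["cp314"], PLATFORMS)
    ]
    return [job for job in jobs if not any(matches(job, ex) for ex in EXCLUDE)]


print(len(expand_seed_matrix(minimal=False)))  # 7
print(len(expand_seed_matrix(minimal=True)))   # 6
```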

build_wheels:
name: 'Wheel: ${{ matrix.python }}-${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}'
needs: seed_wheels
strategy:
fail-fast: false
matrix:
- python: [ cp310, cp311, cp312, cp313, cp314 ]
+ python: [ cp310, cp311, cp312, cp313 ]
platform:
- - { os: windows-2022, arch: amd64, cibw_system: win }
- - { os: windows-11-arm, arch: ARM64, cibw_system: win } # cibw requires ARM64 to be uppercase
+ - { os: windows-2025, arch: amd64, cibw_system: win }
+ - { os: windows-11-arm, arch: ARM64, cibw_system: win }
- { os: ubuntu-24.04, arch: x86_64, cibw_system: manylinux }
- { os: ubuntu-24.04-arm, arch: aarch64, cibw_system: manylinux }
- { os: macos-15, arch: arm64, cibw_system: macosx }
@@ -46,9 +143,10 @@ jobs:
- { minimal: true, python: cp312 }
- { minimal: true, python: cp313 }
- { minimal: true, platform: { arch: universal2 } }
- - { python: cp310, platform: { os: windows-11-arm, arch: ARM64 } } # too many dependency problems for win arm64
+ - { python: cp310, platform: { os: windows-11-arm, arch: ARM64 } }
runs-on: ${{ matrix.platform.os }}
env:
CCACHE_DIR: ${{ github.workspace }}/.ccache
### cibuildwheel configuration
#
# This is somewhat brittle, so be careful with changes. Some notes for our future selves (and others):
@@ -85,11 +183,24 @@ jobs:
git fetch origin
git checkout ${{ inputs.duckdb-sha }}

- # Make sure that OVERRIDE_GIT_DESCRIBE is propagated to cibuildwhel's env, also when it's running linux builds
- - name: Set OVERRIDE_GIT_DESCRIBE
+ - name: Set CIBW_ENVIRONMENT
shell: bash
- if: ${{ inputs.set-version != '' }}
- run: echo "CIBW_ENVIRONMENT=OVERRIDE_GIT_DESCRIBE=${{ inputs.set-version }}" >> $GITHUB_ENV
+ run: |
+ cibw_env=""
+ if [[ "${{ matrix.platform.cibw_system }}" == "manylinux" ]]; then
+ cibw_env="CCACHE_DIR=/host${{ github.workspace }}/.ccache"
+ fi
+ if [[ -n "${{ inputs.set-version }}" ]]; then
+ cibw_env="${cibw_env:+$cibw_env }OVERRIDE_GIT_DESCRIBE=${{ inputs.set-version }}"
+ fi
+ if [[ -n "$cibw_env" ]]; then
+ echo "CIBW_ENVIRONMENT=${cibw_env}" >> $GITHUB_ENV
+ fi

+ - name: Setup Ccache
+ uses: ./.github/actions/ccache-action
+ with:
+ key: ${{ matrix.platform.cibw_system }}_${{ matrix.platform.arch }}

# Install Astral UV, which will be used as build-frontend for cibuildwheel
- uses: astral-sh/setup-uv@v7
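The `Set CIBW_ENVIRONMENT` step in this workflow builds a space-separated list of `NAME=value` pairs and only exports it when it is non-empty. The same logic can be sketched in Python as below. `build_cibw_environment` is a hypothetical helper, and the `/host` prefix follows the workflow's assumption that the runner's filesystem is visible under `/host` inside the manylinux build container.

```python
def build_cibw_environment(cibw_system: str, workspace: str, set_version: str = "") -> str:
    """Compose the value written to CIBW_ENVIRONMENT (empty string means: do not set it)."""
    parts = []
    if cibw_system == "manylinux":
        # Linux builds run in a container, so the cache path is remapped under /host.
        parts.append(f"CCACHE_DIR=/host{workspace}/.ccache")
    if set_version:
        parts.append(f"OVERRIDE_GIT_DESCRIBE={set_version}")
    # Joining with a space plays the role of the shell's ${cibw_env:+$cibw_env } expansion.
    return " ".join(parts)


print(build_cibw_environment("manylinux", "/home/runner/work/repo", "v1.5.0"))
# CCACHE_DIR=/host/home/runner/work/repo/.ccache OVERRIDE_GIT_DESCRIBE=v1.5.0
print(repr(build_cibw_environment("win", "D:/a/repo")))
# ''
```

Collecting the pairs in a list and joining them avoids a leading or trailing space regardless of which conditions fire, which is what the `${var:+...}` expansion achieves in the shell version.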
1 change: 0 additions & 1 deletion .github/workflows/release.yml
@@ -39,7 +39,6 @@ defaults:
jobs:
build_sdist:
name: Build an sdist and determine versions
- if: ${{ github.ref != 'refs/heads/main' }}
uses: ./.github/workflows/packaging_sdist.yml
with:
testsuite: all
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -2,8 +2,8 @@ cmake_minimum_required(VERSION 3.29)

project(duckdb_py LANGUAGES CXX)

- # Always use C++11
- set(CMAKE_CXX_STANDARD 11)
+ # Always use C++17
+ set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>")
4 changes: 3 additions & 1 deletion _duckdb-stubs/__init__.pyi
@@ -537,7 +537,9 @@ class DuckDBPyRelation:
def distinct(self) -> DuckDBPyRelation: ...
def except_(self, other_rel: Self) -> DuckDBPyRelation: ...
def execute(self) -> DuckDBPyRelation: ...
- def explain(self, type: ExplainType | ExplainTypeLiteral = ExplainType.STANDARD) -> str: ...
+ def explain(
+ self, type: ExplainType | ExplainTypeLiteral = ExplainType.STANDARD, format: str | None = None
+ ) -> str: ...
def favg(
self, expression: str, groups: str = "", window_spec: str = "", projected_columns: str = ""
) -> DuckDBPyRelation: ...
8 changes: 6 additions & 2 deletions duckdb/experimental/spark/sql/type_utils.py
@@ -19,6 +19,7 @@
IntegerType,
LongType,
MapType,
NullType,
ShortType,
StringType,
StructField,
@@ -27,6 +28,7 @@
TimeNTZType,
TimestampMillisecondNTZType,
TimestampNanosecondNTZType,
TimestampNanosecondType,
TimestampNTZType,
TimestampSecondNTZType,
TimestampType,
@@ -41,6 +43,7 @@
)

_sqltype_to_spark_class = {
"null": NullType,
"boolean": BooleanType,
"utinyint": UnsignedByteType,
"tinyint": ByteType,
@@ -62,9 +65,10 @@
"time with time zone": TimeType,
"timestamp": TimestampNTZType,
"timestamp with time zone": TimestampType,
- "timestamp_ms": TimestampNanosecondNTZType,
- "timestamp_ns": TimestampMillisecondNTZType,
+ "timestamp_ms": TimestampMillisecondNTZType,
+ "timestamp_ns": TimestampNanosecondNTZType,
"timestamp_s": TimestampSecondNTZType,
"timestamptz_ns": TimestampNanosecondType,
"interval": DayTimeIntervalType,
"list": ArrayType,
"struct": StructType,
Expand Down
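The `type_utils.py` change swaps the `timestamp_ms` and `timestamp_ns` entries so each maps to the class of the matching precision, and adds `timestamptz_ns`. A small self-contained check for this kind of mix-up might look as follows. The classes here are empty stand-ins for the real ones in `duckdb/experimental/spark/sql/types.py`, and `mismatched_precisions` is a hypothetical helper.

```python
# Stand-in classes; only their names matter for this check.
class TimestampSecondNTZType: ...
class TimestampMillisecondNTZType: ...
class TimestampNanosecondNTZType: ...
class TimestampNanosecondType: ...


SQLTYPE_TO_SPARK_CLASS = {
    "timestamp_s": TimestampSecondNTZType,
    "timestamp_ms": TimestampMillisecondNTZType,
    "timestamp_ns": TimestampNanosecondNTZType,
    "timestamptz_ns": TimestampNanosecondType,
}

# Precision suffix of the SQL type name -> word expected in the class name.
PRECISION_WORDS = {"s": "Second", "ms": "Millisecond", "ns": "Nanosecond"}


def mismatched_precisions(mapping: dict) -> list:
    """Return SQL type names whose precision suffix is not reflected in the class name."""
    bad = []
    for sql_name, cls in mapping.items():
        suffix = sql_name.rsplit("_", 1)[-1]
        if PRECISION_WORDS[suffix] not in cls.__name__:
            bad.append(sql_name)
    return bad


print(mismatched_precisions(SQLTYPE_TO_SPARK_CLASS))  # []
```

Run against the pre-change mapping (with the two entries swapped), the same function would report both `timestamp_ms` and `timestamp_ns`.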
21 changes: 21 additions & 0 deletions duckdb/experimental/spark/sql/types.py
@@ -49,6 +49,7 @@
"TimestampMillisecondNTZType",
"TimestampNTZType",
"TimestampNanosecondNTZType",
"TimestampNanosecondType",
"TimestampSecondNTZType",
"TimestampType",
"UUIDType",
@@ -239,6 +240,26 @@ def fromInternal(self, ts: int) -> datetime.datetime: # noqa: D102
return datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)


class TimestampNanosecondType(AtomicType, metaclass=DataTypeSingleton):
"""Timestamp (datetime.datetime) data type with timezone information with nanosecond precision."""

def __init__(self) -> None: # noqa: D107
super().__init__(DuckDBPyType("TIMESTAMPTZ_NS"))

def needConversion(self) -> bool: # noqa: D102
return True

@classmethod
def typeName(cls) -> str: # noqa: D102
return "timestamptz_ns"

def toInternal(self, dt: datetime.datetime) -> int: # noqa: D102
raise ContributionsAcceptedError

def fromInternal(self, ts: int) -> datetime.datetime: # noqa: D102
raise ContributionsAcceptedError
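`TimestampNanosecondType` leaves `toInternal` and `fromInternal` unimplemented. Purely as an illustration of what such a pair could look like, and not the project's implementation, the sketch below assumes the internal value is an integer count of nanoseconds since the Unix epoch in UTC; the diff does not state the representation. The function names are hypothetical. Python's `datetime` only carries microseconds, so sub-microsecond digits are dropped on the way back.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_internal_ns(dt: datetime.datetime) -> int:
    """Timezone-aware datetime -> nanoseconds since the epoch (assumed representation)."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def from_internal_ns(ts: int) -> datetime.datetime:
    """Nanoseconds since the epoch -> UTC datetime, truncated to microseconds."""
    seconds, nanos = divmod(ts, 1_000_000_000)
    return EPOCH + datetime.timedelta(seconds=seconds, microseconds=nanos // 1_000)


stamp = datetime.datetime(2026, 6, 12, 10, 30, 0, 123456, tzinfo=datetime.timezone.utc)
print(from_internal_ns(to_internal_ns(stamp)) == stamp)  # True
```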


class TimestampNTZType(AtomicType, metaclass=DataTypeSingleton):
"""Timestamp (datetime.datetime) data type without timezone information with microsecond precision."""

2 changes: 1 addition & 1 deletion duckdb_packaging/setuptools_scm_version.py
@@ -12,7 +12,7 @@
from ._versioning import format_version, parse_version

# MAIN_BRANCH_VERSIONING should be 'True' on main branch only
- MAIN_BRANCH_VERSIONING = False
+ MAIN_BRANCH_VERSIONING = True

SCM_PRETEND_ENV_VAR = "SETUPTOOLS_SCM_PRETEND_VERSION_FOR_DUCKDB"
SCM_GLOBAL_PRETEND_ENV_VAR = "SETUPTOOLS_SCM_PRETEND_VERSION"
10 changes: 8 additions & 2 deletions pyproject.toml
@@ -88,6 +88,12 @@ version_scheme = "duckdb_packaging.setuptools_scm_version:version_scheme"
local_scheme = "no-local-version"
fallback_version = "0.0.1.dev1"

# main only: count dev distance from the last *minor* tag (v*.*.0), so a patch
# tag (e.g. v1.5.4) merged in from a release branch can't reset .devN.
# Release branches must NOT have this, they correctly count from v*.*.* (the default).
[tool.setuptools_scm.scm.git]
describe_command = "git describe --dirty --tags --long --abbrev=40 --match v*.*.0"
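The `--match v*.*.0` argument restricts `git describe` to tags that look like minor releases. The effect of the pattern can be approximated in Python with `fnmatch`; this is an approximation of git's glob matching rather than its exact implementation, and the tag names below are made up for illustration.

```python
from fnmatch import fnmatchcase

MINOR_TAG_PATTERN = "v*.*.0"


def tags_considered(tags: list) -> list:
    """Keep only the tags the describe pattern would consider."""
    return [tag for tag in tags if fnmatchcase(tag, MINOR_TAG_PATTERN)]


print(tags_considered(["v1.4.0", "v1.5.0", "v1.5.4", "v1.5.10"]))
# ['v1.4.0', 'v1.5.0']
```

A patch tag such as `v1.5.4` is skipped, so the `.devN` distance keeps counting from the last `v*.*.0` tag.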

# Override: if COVERAGE is set then:
# - we create a RelWithDebInfo build
# - we make sure we use a persistent build dir so we get access to the .gcda files
@@ -122,7 +128,7 @@ cmake.build-type = "Debug"
[[tool.scikit-build.overrides]]
if.state = "editable"
if.env.COVERAGE = false
- if.platform-system = "Darwin"
+ if.platform-system = "(?i)darwin"
inherit.cmake.define = "append"
cmake.define.DISABLE_UNITY = "1"
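The switch from `"Darwin"` to `"(?i)darwin"` only makes sense if the condition is evaluated as a regular expression against a platform string, which is the assumption behind the sketch below. `override_applies` is a hypothetical stand-in for that check, not a scikit-build-core function.

```python
import re


def override_applies(pattern: str, platform_value: str) -> bool:
    """Regex search of the configured pattern against the reported platform string."""
    return re.search(pattern, platform_value) is not None


print(override_applies("Darwin", "darwin"))      # False
print(override_applies("(?i)darwin", "darwin"))  # True
print(override_applies("(?i)darwin", "Darwin"))  # True
print(override_applies("(?i)darwin", "linux"))   # False
```

The inline `(?i)` flag makes the match case-insensitive, so the override fires whether the platform is reported as `Darwin` or `darwin`.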

@@ -333,7 +339,7 @@ packages = ["duckdb", "_duckdb"]
strict = true
warn_unreachable = true
pretty = true
- python_version = "3.10"
+ python_version = "3.12"
exclude = [
"duckdb/experimental/", # not checking the pyspark API
"duckdb/query_graph/", # old and unmaintained (should probably remove)
16 changes: 15 additions & 1 deletion scripts/cache_data.json
@@ -532,7 +532,9 @@
"polars.DataFrame",
"polars.LazyFrame",
"polars.col",
- "polars.lit"
+ "polars.lit",
+ "polars.Series",
+ "polars.Decimal"
],
"required": false
},
@@ -822,5 +824,17 @@
"full_path": "polars.lit",
"name": "lit",
"children": []
},
"polars.Series": {
"type": "attribute",
"full_path": "polars.Series",
"name": "Series",
"children": []
},
"polars.Decimal": {
"type": "attribute",
"full_path": "polars.Decimal",
"name": "Decimal",
"children": []
}
}
2 changes: 2 additions & 0 deletions scripts/imports.py
@@ -111,6 +111,8 @@
polars.LazyFrame
polars.col
polars.lit
polars.Series
polars.Decimal

import duckdb
import duckdb.filesystem
5 changes: 3 additions & 2 deletions src/duckdb_py/arrow/CMakeLists.txt
@@ -1,6 +1,7 @@
# this is used for clang-tidy checks
add_library(
- python_arrow OBJECT arrow_array_stream.cpp arrow_export_utils.cpp
- polars_filter_pushdown.cpp pyarrow_filter_pushdown.cpp)
+ python_arrow OBJECT
+ arrow_array_stream.cpp arrow_export_utils.cpp filter_pushdown_visitor.cpp
+ polars_filter_pushdown.cpp pyarrow_filter_pushdown.cpp)

target_link_libraries(python_arrow PRIVATE _duckdb_dependencies)