Inconsistency in from_parquet and read_parquet wrt. file/glob list #26

Description

@dsevilla

What happens?

As can be seen in https://github.com/duckdb/duckdb-python/blob/main/src/duckdb_py/duckdb_python.cpp#L707, both from_parquet and read_parquet accept as their first parameter either a string specifying a file or glob, or a list of files/globs.

In the corresponding Python type stubs, https://github.com/duckdb/duckdb-python/blob/main/duckdb/__init__.pyi#L348, both are declared with only a 'str' argument.

I know these files are auto-generated, but they also include instructions on how to mark manual tweaks made after generation. I'll add the changes as a patch so that maintainers can see them.

To Reproduce

duckdb.read_parquet([parquet1, parquet2], ...)
duckdb.from_parquet([parquet1, parquet2], ...)

Both calls work at runtime, but type checkers and IDEs do not recognize them as valid.
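
To illustrate the kind of annotation that would cover both call styles, here is a minimal sketch. It does not import duckdb, and the alias `PathOrGlobs` and the helper `normalize_parquet_sources` are hypothetical names made up for this example. The actual stub signature, and the exact set of types the C++ binding accepts, may differ.

```python
from typing import List, Union

# Hypothetical alias: a single file/glob string, or a list of them.
# The real stub may use a different name or a wider type.
PathOrGlobs = Union[str, List[str]]


def normalize_parquet_sources(file_glob: PathOrGlobs) -> List[str]:
    """Stand-in for a function typed to accept either form.

    It only normalizes the argument to a list so that the
    annotation can be exercised without a DuckDB installation.
    """
    if isinstance(file_glob, str):
        return [file_glob]
    return list(file_glob)


# Both call styles type-check against the Union annotation.
single = normalize_parquet_sources("data.parquet")
several = normalize_parquet_sources(["a.parquet", "dir/*.parquet"])
```

With an annotation of this shape on the first parameter, a type checker would accept both the string form and the list form, whereas a plain 'str' annotation accepts only the former.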

OS:

OSX, Linux

DuckDB Package Version:

latest

Python Version:

all

Full Name:

Diego Sevilla Ruiz

Affiliation:

University of Murcia

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
