read_parquet(path TEXT or TEXT[], optional_parameters) returns SETOF duckdb.row
#Reads a Parquet file from a storage.
Returns DuckDB rows. You can expand
these rows using * or select specific columns
using the r['column_name']
syntax. To select specific columns, give the function call an
easy alias, such as r:
SELECT * FROM read_parquet('file.parquet');
SELECT r['id'], r['name'] FROM read_parquet('file.parquet') r WHERE r['age'] > 21;
SELECT COUNT(*) FROM read_parquet('file.parquet');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with Parquet files to read. It can be a glob or array of files. |
Optional parameters mirror the
DuckDB
read_parquet function. To specify
optional parameters, use parameter := 'value'.
read_csv(path TEXT or TEXT[], optional_parameters) returns SETOF duckdb.row
#Reads a CSV file from a storage.
Returns DuckDB rows. You can expand these
rows using * or select specific columns
using the r['column_name'] syntax.
To select specific columns, give the function call an
easy alias, such as r:
SELECT * FROM read_csv('file.csv');
SELECT r['id'], r['name'] FROM read_csv('file.csv') r WHERE r['age'] > 21;
SELECT COUNT(*) FROM read_csv('file.csv');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with CSV files to read. It can be a glob or array of files. |
Optional parameters mirror the
DuckDB
read_csv function. To specify optional
parameters, use parameter := 'value'.
columns is not currently supported
nullstr must be an array
(TEXT[])
read_json(path TEXT or TEXT[], optional_parameters) returns SETOF duckdb.row
#Reads a JSON file from a storage.
Returns DuckDB rows. You can expand
these rows using * or select specific columns
using the r['column_name']
syntax. To select specific columns, give the function call an
easy alias, such as r:
Example 21.1.
SELECT * FROM read_parquet('file.parquet');
SELECT r['id'], r['name'] FROM read_parquet('file.parquet') r WHERE r['age'] > 21;
SELECT COUNT(*) FROM read_parquet('file.parquet');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with JSON files to read. It can be a glob or array of files. |
Optional parameters mirror the
DuckDB
read_json function. To specify optional
parameters, use parameter := 'value'.
columns is not currently supported.
iceberg_scan(path TEXT, optional_parameters) returns SETOF duckdb.row
#Reads an iceberg table from a storage.
Returns DuckDB rows. You can expand these
rows using * or select specific columns
using the r['column_name'] syntax.
To select specific columns, give the function call an
easy alias, such as r:
Example 21.2.
SELECT * FROM iceberg_scan('data/iceberg/table');
SELECT r['id'], r['name'] FROM iceberg_scan('data/iceberg/table') r WHERE r['age'] > 21;
SELECT COUNT(*) FROM iceberg_scan('data/iceberg/table');
To use iceberg_scan, enable
the iceberg extension:
SELECT duckdb.install_extension('iceberg');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with the iceberg table to read. |
Optional parameters mirror the DuckDB
iceberg_scan function. To specify optional
parameters, use parameter := 'value'.
Optional parameters:
|
Name |
Type |
Default value |
Description |
|---|---|---|---|
|
|
|
false |
Ensures that path resolution is performed, which allows scanning iceberg tables that are moved. |
|
|
|
'' |
— |
|
|
|
'none'
|
— |
|
|
|
false |
— |
|
|
|
|
— |
|
|
|
|
— |
iceberg_metadata(path TEXT, optional_parameters) returns SETOF iceberg_metadata_record
#
Returns metadata of an iceberg table.
Metadata is returned as a set of icerberg_metadata_record,
which is defined as follows:
CREATE TYPE duckdb.iceberg_metadata_record AS (
manifest_path TEXT,
manifest_sequence_number NUMERIC,
manifest_content TEXT,
status TEXT,
content TEXT,
file_path TEXT
);
To use iceberg_metadata, enable the
iceberg extension:
SELECT duckdb.install_extension('iceberg');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with the iceberg table to read. |
Optional parameters mirror the DuckDB
iceberg_metadata function. To specify optional
parameters, use parameter := 'value'.
Optional parameters:
|
Name |
Type |
Default value |
Description |
|---|---|---|---|
|
|
|
false |
Ensures that path resolution is performed, which allows scanning iceberg tables that are moved. |
|
|
|
'none'
|
— |
|
|
|
false |
— |
|
|
|
'version-hint.text'
|
— |
|
|
|
|
iceberg_snapshots (path TEXT, optional_parameters) returns SETOF iceberg_snapshot_record
#Reads snapshot information from an iceberg table.
Returns snapshot metadata for an iceberg table, which can be useful for "time travel" queries and understanding table history.
Example 21.3.
SELECT * FROM iceberg_snapshots('data/iceberg/table');
To use iceberg_snapshots, enable the
iceberg extension:
SELECT duckdb.install_extension('iceberg');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with the iceberg table to read. |
Optional parameters mirror the DuckDB iceberg_snapshots
function. To specify optional parameters, use parameter := 'value'.
Optional parameters:
|
Name |
Type |
Default value |
|---|---|---|
|
|
|
|
|
|
|
false |
|
|
|
'version-hint.text'
|
|
|
|
|
delta_scan(path TEXT) returns SETOF duckdb.row
#Reads a Delta dataset from a storage.
Returns DuckDB rows. You can expand these
rows using * or select specific columns
using the r['column_name'] syntax. To
select specific columns, give the function call an
easy alias, such as r:
Example 21.4.
SELECT * FROM delta_scan('/path/to/delta/dataset');
SELECT r['id'], r['name'] FROM delta_scan('/path/to/delta/dataset') r WHERE r['age'] > 21;
SELECT COUNT(*) FROM delta_scan('/path/to/delta/dataset');
To use delta_scan, enable the
delta extension:
SELECT duckdb.install_extension('delta');
Required parameters:
|
Name |
Type |
Description |
|---|---|---|
|
|
|
The path to the storage with the Delta dataset to read. |