pyarrow.parquet.write_to_dataset

pyarrow.parquet.write_to_dataset(table, root_path, partition_cols=None, filesystem=None, **kwargs)

Wrapper around parquet.write_table for writing a Table to Parquet format by partitions. For each combination of partition columns and values, a subdirectory is created in the following manner:

root_dir/
  group1=value1
    group2=value1
      <uuid>.parquet
    group2=value2
      <uuid>.parquet
  group1=valueN
    group2=value1
      <uuid>.parquet
    group2=valueN
      <uuid>.parquet
Parameters:
  • table (pyarrow.Table) – The table to write to the dataset
  • root_path (string) – The root directory of the dataset
  • filesystem (FileSystem, default None) – If nothing is passed, paths are assumed to be on the local on-disk filesystem
  • partition_cols (list) – Column names by which to partition the dataset. Columns are partitioned in the order they are given
  • **kwargs (dict) – Additional keyword arguments passed to the write_table function