pyarrow.parquet.ParquetWriter

class pyarrow.parquet.ParquetWriter(where, schema, filesystem=None, flavor=None, version='1.0', use_dictionary=True, compression='snappy', use_deprecated_int96_timestamps=None, **options)

Bases: object

Class for incrementally building a Parquet file from Arrow tables.

Parameters:

where (path or file-like object) – Destination to which the Parquet file is written.
schema (arrow Schema) – Schema of the tables that will be written.
version ({"1.0", "2.0"}, default "1.0") – The Parquet format version to write.
use_dictionary (bool or list, default True) – Whether to use dictionary encoding. Pass a bool to set it for all columns, or a list of column names to enable it only for those columns.
use_deprecated_int96_timestamps (boolean, default None) – Write timestamps to INT96 Parquet format. Defaults to False unless enabled by the flavor argument. This takes priority over the coerce_timestamps option.
coerce_timestamps (string, default None) – Cast timestamps to a particular resolution. Valid values: {None, 'ms', 'us'}
allow_truncated_timestamps (boolean, default False) – Allow loss of data when coercing timestamps to a particular resolution. For example, if microsecond or nanosecond data is lost when coercing to 'ms', do not raise an exception.
compression (str or dict, default 'snappy') – The compression codec, given either as a single name for all columns or as a dict for per-column settings. Valid values: {'NONE', 'SNAPPY', 'GZIP', 'LZO', 'BROTLI', 'LZ4', 'ZSTD'}
flavor ({'spark'}, default None) – Sanitize the schema or set other options for compatibility with the named target system.
filesystem (FileSystem, default None) – If not passed, the filesystem is inferred from where when it is a path. If where is a file-like object, no filesystem is needed.

__init__(where, schema, filesystem=None, flavor=None, version='1.0', use_dictionary=True, compression='snappy', use_deprecated_int96_timestamps=None, **options): Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`(where, schema[, filesystem, …])	Initialize self.
`close`()
`write_table`(table[, row_group_size])

close()

write_table(table, row_group_size=None)