pyarrow.RecordBatchFileReader

class pyarrow.RecordBatchFileReader(source, footer_offset=None)

Bases: pyarrow.lib._RecordBatchFileReader, pyarrow.ipc._ReadPandasOption

Class for reading Arrow record batch data from the Arrow binary file format.

Parameters:	source (bytes/buffer-like, pyarrow.NativeFile, or file-like Python object) – Either an in-memory buffer or a readable file object.
	footer_offset (int, default None) – If the file is embedded in a larger file, the byte offset of the very end of the file data.

__init__(source, footer_offset=None): Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`(source[, footer_offset])	Initialize self.
`get_batch`(self, int i)	Read the record batch at index i.
`get_record_batch`(self, int i)	Alias of get_batch.
`read_all`(self)	Read all record batches as a pyarrow.Table
`read_pandas`(**options)	Read the contents of the file and convert them to a pandas.DataFrame using Table.to_pandas

Attributes

`num_record_batches`
`schema`

read_pandas(**options)

Read the contents of the file and convert them to a pandas.DataFrame using Table.to_pandas.

Parameters:	**options – Arguments to forward to Table.to_pandas.
Returns:	df (pandas.DataFrame)