pyarrow.RecordBatchStreamReader

class pyarrow.RecordBatchStreamReader(source)

Bases: pyarrow.lib._RecordBatchStreamReader, pyarrow.ipc._ReadPandasOption

Reader for the Arrow streaming binary format.

Parameters: source (bytes/buffer-like, pyarrow.NativeFile, or file-like Python object) – Either an in-memory buffer or a readable file object.

__init__(source)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(source)        Initialize self.
get_next_batch(self)
read_all(self)          Read all record batches as a pyarrow.Table.
read_next_batch(self)   Read next RecordBatch from the stream.
read_pandas(**options)  Read contents of stream and convert to pandas.DataFrame using Table.to_pandas.

Attributes

schema

get_next_batch(self)

read_all(self)

Read all record batches in the stream and return them as a pyarrow.Table.

read_next_batch(self)

Read the next RecordBatch from the stream. Raises StopIteration at the end of the stream.

read_pandas(**options)

Read the contents of the stream and convert them to a pandas.DataFrame using Table.to_pandas.

Parameters: **options – Arguments to forward to Table.to_pandas.
Returns: df (pandas.DataFrame)
schema