pyarrow.Schema¶

class pyarrow.Schema¶

Bases: object

__init__()¶: Initialize self. See help(type(self)) for accurate signature.

Methods

`add_metadata`(self, metadata)	Add metadata as dict of string keys and values to Schema
`append`(self, Field field)	Append a field at the end of the schema.
`empty_table`(self)	Provide an empty table according to the schema.
`equals`(self, other, bool check_metadata=True)	Test if this schema is equal to the other
`field_by_name`(self, name)	Access a field by its name rather than the column index.
`from_pandas`(type cls, df, …)	Returns implied schema from dataframe
`get_field_index`(self, name)
`insert`(self, int i, Field field)	Add a field at position i to the schema.
`remove`(self, int i)	Remove the field at index i from the schema.
`remove_metadata`(self)	Create new schema without metadata, if any
`serialize`(self[, memory_pool])	Write Schema to Buffer as encapsulated IPC message
`set`(self, int i, Field field)	Replace a field at position i in the schema.

Attributes

`metadata`
`names`	The schema’s field names.
`pandas_metadata`	Return deserialized-from-JSON pandas metadata field (if it exists)
`types`	The schema’s field types.

add_metadata(self, metadata)¶

Add metadata as dict of string keys and values to Schema

Parameters:	metadata (dict) – Keys and values must be string-like / coercible to bytes
Returns:	schema (pyarrow.Schema)

append(self, Field field)¶

Append a field at the end of the schema.

Parameters:	field (Field) – Field to append.
Returns:	schema (Schema)

empty_table(self)¶

Provide an empty table according to the schema.

Returns:	table (pyarrow.Table)

equals(self, other, bool check_metadata=True)¶

Test if this schema is equal to the other

Parameters:	other (pyarrow.Schema) – Schema to compare against.
	check_metadata (bool, default True) – Whether key/value metadata must be equal too
Returns:	is_equal (boolean)

field_by_name(self, name)¶

Access a field by its name rather than the column index.

Parameters:	name (str) – Name of the field to look up.
Returns:	field (pyarrow.Field)

from_pandas(type cls, df, bool preserve_index=True)¶

Returns implied schema from dataframe

Parameters:	df (pandas.DataFrame) – DataFrame to infer the schema from.
	preserve_index (bool, default True) – Whether to store the index as an additional field (or fields, for MultiIndex) in the resulting schema.
Returns:	pyarrow.Schema

Examples

>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({
...     'int': [1, 2],
...     'str': ['a', 'b']
... })
>>> pa.Schema.from_pandas(df)
int: int64
str: string
__index_level_0__: int64

get_field_index(self, name)¶

insert(self, int i, Field field)¶

Add a field at position i to the schema.

Parameters:	i (int) – Position at which to insert the field.
	field (Field) – Field to insert.
Returns:	schema (Schema)

metadata¶

names¶

The schema’s field names.

Returns:	list of str

pandas_metadata¶: Return deserialized-from-JSON pandas metadata field (if it exists)

remove(self, int i)¶

Remove the field at index i from the schema.

Parameters:	i (int) – Index of the field to remove.
Returns:	schema (Schema)

remove_metadata(self)¶

Create new schema without metadata, if any

Returns:	schema (pyarrow.Schema)

serialize(self, memory_pool=None)¶

Write Schema to Buffer as encapsulated IPC message

Parameters:	memory_pool (MemoryPool, default None) – Uses default memory pool if not specified
Returns:	serialized (Buffer)

set(self, int i, Field field)¶

Replace a field at position i in the schema.

Parameters:	i (int) – Position of the field to replace.
	field (Field) – Replacement field.
Returns:	schema (Schema)

types¶

The schema’s field types.

Returns:	list of DataType