pyarrow.Schema
-
class pyarrow.Schema
Bases: object
-
__init__()
Initialize self. See help(type(self)) for accurate signature.
Methods
add_metadata(self, metadata) – Add metadata as dict of string keys and values to Schema
append(self, Field field) – Append a field at the end of the schema.
empty_table(self) – Provide an empty table according to the schema.
equals(self, other, bool check_metadata=True) – Test if this schema is equal to the other
field_by_name(self, name) – Access a field by its name rather than the column index.
from_pandas(type cls, df, …) – Returns implied schema from dataframe
get_field_index(self, name)
insert(self, int i, Field field) – Add a field at position i to the schema.
remove(self, int i) – Remove the field at index i from the schema.
remove_metadata(self) – Create new schema without metadata, if any
serialize(self[, memory_pool]) – Write Schema to Buffer as encapsulated IPC message
set(self, int i, Field field) – Replace a field at position i in the schema.
Attributes
metadata
names – The schema’s field names.
pandas_metadata – Return deserialized-from-JSON pandas metadata field (if it exists)
types – The schema’s field types.
-
add_metadata(self, metadata)
Add metadata as a dict of string keys and values to the schema.
Parameters: metadata (dict) – Keys and values must be string-like / coercible to bytes
Returns: schema (pyarrow.Schema)
-
append(self, Field field)
Append a field at the end of the schema.
Parameters: field (Field)
Returns: schema (Schema)
-
empty_table(self)
Provide an empty table according to the schema.
Returns: table (pyarrow.Table)
-
equals(self, other, bool check_metadata=True)
Test if this schema is equal to the other.
Parameters:
- other (pyarrow.Schema)
- check_metadata (bool, default True) – Key/value metadata must be equal too
Returns: is_equal (boolean)
-
field_by_name(self, name)
Access a field by its name rather than the column index.
Parameters: name (str)
Returns: field (pyarrow.Field)
-
from_pandas(type cls, df, bool preserve_index=True)
Return the schema implied by a pandas DataFrame.
Parameters:
- df (pandas.DataFrame)
- preserve_index (bool, default True) – Whether to store the index as an additional column (or columns, for MultiIndex) in the resulting Table.
Returns: pyarrow.Schema
Examples
>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({
...     'int': [1, 2],
...     'str': ['a', 'b']
... })
>>> pa.Schema.from_pandas(df)
int: int64
str: string
__index_level_0__: int64
-
get_field_index(self, name)
-
insert(self, int i, Field field)
Add a field at position i to the schema.
Parameters:
- i (int)
- field (Field)
Returns: schema (Schema)
-
metadata
-
names
The schema’s field names.
Returns: list of str
-
pandas_metadata
Return the pandas metadata field, deserialized from JSON (if it exists).
-
remove(self, int i)
Remove the field at index i from the schema.
Parameters: i (int)
Returns: schema (Schema)
-
remove_metadata(self)
Create a new schema without metadata, if any.
Returns: schema (pyarrow.Schema)
-
serialize(self, memory_pool=None)
Write Schema to Buffer as encapsulated IPC message.
Parameters: memory_pool (MemoryPool, default None) – Uses default memory pool if not specified
Returns: serialized (Buffer)
-
set(self, int i, Field field)
Replace a field at position i in the schema.
Parameters:
- i (int)
- field (Field)
Returns: schema (Schema)
-
types
The schema’s field types.
Returns: list of DataType
-