pyarrow.Column

class pyarrow.Column

Bases: pyarrow.lib._PandasConvertible

Named vector of elements of equal type.

Warning

Do not call this class’s constructor directly.

__init__(): Initialize self. See help(type(self)) for accurate signature.

Methods

`cast`(self, target_type, bool safe=True)	Cast column values to another data type
`dictionary_encode`(self)	Compute dictionary-encoded representation of array
`equals`(self, Column other)	Check if contents of two columns are equal
`flatten`(self, MemoryPool memory_pool=None)	Flatten this Column.
`from_array`(*args)
`length`(self)
`to_pandas`(self[, categories])	Convert to a pandas-compatible NumPy array or DataFrame, as appropriate
`to_pylist`(self)	Convert to a list of native Python objects.
`unique`(self)	Compute distinct elements in array

Attributes

`data`	The underlying data
`field`
`name`	Label of the column
`null_count`	Number of null entries
`shape`	Dimensions of this column
`type`	Type information for this column

cast(self, target_type, bool safe=True)

Cast column values to another data type

Parameters:

target_type (DataType) – Type to cast to
safe (boolean, default True) – Check for overflows or other unsafe conversions
Returns:	casted (Column)

data

The underlying data

Returns:	pyarrow.ChunkedArray

dictionary_encode(self)

Compute dictionary-encoded representation of array

Returns:	pyarrow.Column – Same chunking as the input, all chunks share a common dictionary.

equals(self, Column other)

Check if contents of two columns are equal

Parameters:	other (pyarrow.Column) – Column to compare against
Returns:	are_equal (boolean)

field

flatten(self, MemoryPool memory_pool=None)

Flatten this Column. If it has a struct type, the column is flattened into one column per struct field.

Parameters:	memory_pool (MemoryPool, default None) – For memory allocations, if required, otherwise use default pool
Returns:	result (List[Column])

static from_array(*args)

length(self)

name

Label of the column

Returns:	str

null_count

Number of null entries

Returns:	int

shape

Dimensions of this column

Returns:	(int,)

to_pandas(self, categories=None, bool strings_to_categorical=False, bool zero_copy_only=False, bool integer_object_nulls=False, bool date_as_object=True, bool use_threads=True, bool deduplicate_objects=True, bool ignore_metadata=False)

Convert to a pandas-compatible NumPy array or DataFrame, as appropriate

Parameters:

strings_to_categorical (boolean, default False) – Encode string (UTF8) and binary types to pandas.Categorical
categories (list, default empty) – List of fields that should be returned as pandas.Categorical. Only applies to table-like data structures
zero_copy_only (boolean, default False) – Raise an ArrowException if this function call would require copying the underlying data
integer_object_nulls (boolean, default False) – Cast integers with nulls to objects
date_as_object (boolean, default True) – Cast dates to objects
use_threads (boolean, default True) – Whether to parallelize the conversion using multiple threads
deduplicate_objects (boolean, default True) – Do not create multiple copies of Python objects when created, to save on memory use. Conversion will be slower
ignore_metadata (boolean, default False) – If True, do not use the ‘pandas’ metadata to reconstruct the DataFrame index, if present

Returns:

NumPy array or DataFrame depending on type of object

to_pylist(self): Convert to a list of native Python objects.

type

Type information for this column

Returns:	pyarrow.DataType

unique(self)

Compute distinct elements in array

Returns:	pyarrow.Array