Data Types¶
- 
enum 
arrow::Type::type¶ Main data type enumeration.
This enumeration provides a quick way to interrogate the category of a DataType instance.
Values:
- 
NA¶ A NULL type having no physical storage.
- 
BOOL¶ Boolean as 1 bit, LSB bit-packed ordering.
- 
UINT8¶ Unsigned 8-bit little-endian integer.
- 
INT8¶ Signed 8-bit little-endian integer.
- 
UINT16¶ Unsigned 16-bit little-endian integer.
- 
INT16¶ Signed 16-bit little-endian integer.
- 
UINT32¶ Unsigned 32-bit little-endian integer.
- 
INT32¶ Signed 32-bit little-endian integer.
- 
UINT64¶ Unsigned 64-bit little-endian integer.
- 
INT64¶ Signed 64-bit little-endian integer.
- 
HALF_FLOAT¶ 2-byte floating point value
- 
FLOAT¶ 4-byte floating point value
- 
DOUBLE¶ 8-byte floating point value
- 
STRING¶ UTF8 variable-length string as List<Char>
- 
BINARY¶ Variable-length bytes (no guarantee of UTF8-ness)
- 
FIXED_SIZE_BINARY¶ Fixed-size binary. Each value occupies the same number of bytes.
- 
DATE32¶ int32_t days since the UNIX epoch
- 
DATE64¶ int64_t milliseconds since the UNIX epoch
- 
TIMESTAMP¶ Exact timestamp encoded as an int64 count since the UNIX epoch. The default unit is millisecond.
- 
TIME32¶ Time as signed 32-bit integer, representing either seconds or milliseconds since midnight.
- 
TIME64¶ Time as signed 64-bit integer, representing either microseconds or nanoseconds since midnight.
- 
INTERVAL¶ YEAR_MONTH or DAY_TIME interval in SQL style.
- 
DECIMAL¶ Precision- and scale-based decimal type.
Storage type depends on the parameters.
- 
LIST¶ A list of some logical data type.
- 
STRUCT¶ Struct of logical types.
- 
UNION¶ Unions of logical types.
- 
DICTIONARY¶ Dictionary aka Category type.
- 
MAP¶ Map, a repeated struct logical type.
- 
EXTENSION¶ Custom data type, implemented by user.
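Several of the entries above describe a physical encoding rather than just a name. The following is a minimal stdlib-only sketch of two of them: LSB bit-packed booleans (BOOL) and the relationship between DATE32 days and DATE64 milliseconds. The helpers `get_bit`, `set_bit` and `date32_to_date64` are hypothetical names introduced for illustration; they are not Arrow functions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: with LSB bit-packed ordering, value i lives in
// bit (i % 8) of byte (i / 8), counting from the least significant bit.
constexpr bool get_bit(const std::uint8_t* bits, std::size_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical helper: set or clear value i in a bit-packed buffer.
inline void set_bit(std::uint8_t* bits, std::size_t i, bool value) {
  const std::uint8_t mask = static_cast<std::uint8_t>(1u << (i % 8));
  if (value) {
    bits[i / 8] = static_cast<std::uint8_t>(bits[i / 8] | mask);
  } else {
    bits[i / 8] = static_cast<std::uint8_t>(bits[i / 8] & ~mask);
  }
}

// DATE32 counts days and DATE64 counts milliseconds since the UNIX epoch,
// so a whole day converts with a factor of 86,400,000 ms.
constexpr std::int64_t date32_to_date64(std::int32_t days) {
  return static_cast<std::int64_t>(days) * 86400000LL;
}
```

Under this layout, eight boolean values share one byte, which is why BOOL is described as occupying 1 bit per value.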
- 
class 
DataType¶ Base class for all data types.
Data types in this library are all logical. They can be expressed as either a primitive physical type (bytes or bits of some fixed size), a nested type consisting of other data types, or another data type (e.g. a timestamp encoded as an int64).
Simple datatypes may be entirely described by their Type::type id, but complex datatypes are usually parametric.
Subclassed by arrow::BinaryType, arrow::ExtensionType, arrow::FixedWidthType, arrow::NestedType, arrow::NullType
Public Functions
- 
bool 
Equals(const DataType &other, bool check_metadata = true) const¶ Return whether the types are equal.
Types that are logically convertible from one to another (e.g. List<UInt8> and Binary) are NOT equal.
- 
virtual std::string 
ToString() const = 0¶ A string representation of the type, including any children.
- 
virtual std::string 
name() const = 0¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
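The distinction between simple types (fully described by their id) and parametric types, and the rule that logically convertible types are not equal, can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTypeId`, `ToyType` and its members are hypothetical and much simpler than the real class hierarchy.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: these names are hypothetical and are not Arrow classes.
enum class ToyTypeId { UINT8, BINARY, FIXED_SIZE_BINARY, LIST };

struct ToyType {
  ToyTypeId id;
  std::vector<std::int32_t> params;  // e.g. the byte width of FIXED_SIZE_BINARY
  std::vector<ToyType> children;     // e.g. the item type of a LIST

  // Equal only when the id, every parameter and every child type match.
  bool Equals(const ToyType& other) const {
    if (id != other.id || params != other.params ||
        children.size() != other.children.size()) {
      return false;
    }
    for (std::size_t i = 0; i < children.size(); ++i) {
      if (!children[i].Equals(other.children[i])) return false;
    }
    return true;
  }
};
```

In this model a list of UINT8 and a BINARY type compare unequal even though their values could be converted into each other, and two FIXED_SIZE_BINARY types with different widths share an id but are still different types.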
 
Factory functions¶
These functions are recommended for creating data types. They may return new objects or existing singletons, depending on the type requested.
- 
std::shared_ptr<DataType> 
arrow::fixed_size_binary(int32_t byte_width)¶ Create a FixedSizeBinaryType instance.
- 
std::shared_ptr<DataType> 
arrow::decimal(int32_t precision, int32_t scale)¶ Create a Decimal128Type instance.
- 
std::shared_ptr<DataType> 
arrow::timestamp(TimeUnit::type unit)¶ Create a TimestampType instance from its unit.
- 
std::shared_ptr<DataType> 
arrow::timestamp(TimeUnit::type unit, const std::string &timezone)¶ Create a TimestampType instance from its unit and timezone.
- 
std::shared_ptr<DataType> 
arrow::time32(TimeUnit::type unit)¶ Create a 32-bit time type instance.
Unit can be either SECOND or MILLI
- 
std::shared_ptr<DataType> 
arrow::time64(TimeUnit::type unit)¶ Create a 64-bit time type instance.
Unit can be either MICRO or NANO
Create a StructType instance.
Create a UnionType instance.
Create a UnionType instance.
Create a DictionaryType instance.
- 
std::shared_ptr<DataType> 
arrow::boolean()¶ Return a BooleanType instance.
- 
std::shared_ptr<DataType> 
arrow::uint16()¶ Return a UInt16Type instance.
- 
std::shared_ptr<DataType> 
arrow::uint32()¶ Return a UInt32Type instance.
- 
std::shared_ptr<DataType> 
arrow::uint64()¶ Return a UInt64Type instance.
- 
std::shared_ptr<DataType> 
arrow::float16()¶ Return a HalfFloatType instance.
- 
std::shared_ptr<DataType> 
arrow::float64()¶ Return a DoubleType instance.
- 
std::shared_ptr<DataType> 
arrow::utf8()¶ Return a StringType instance.
- 
std::shared_ptr<DataType> 
arrow::binary()¶ Return a BinaryType instance.
- 
std::shared_ptr<DataType> 
arrow::date32()¶ Return a Date32Type instance.
- 
std::shared_ptr<DataType> 
arrow::date64()¶ Return a Date64Type instance.
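The unit restrictions on time32 (SECOND or MILLI) and time64 (MICRO or NANO) listed above line up with simple arithmetic on the number of ticks in a day. The sketch below checks that arithmetic; the `Unit` enum and all function names are hypothetical stand-ins, not the Arrow declarations, and the rationale is one plausible reading rather than a statement from the library authors.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a time unit enumeration.
enum class Unit { SECOND, MILLI, MICRO, NANO };

// Number of ticks of the given unit in one second.
constexpr std::int64_t ticks_per_second(Unit u) {
  return u == Unit::SECOND ? 1
       : u == Unit::MILLI  ? 1000
       : u == Unit::MICRO  ? 1000000
                           : 1000000000;
}

// Largest time-of-day value is bounded by the ticks in a 24-hour day.
constexpr std::int64_t ticks_per_day(Unit u) {
  return 86400 * ticks_per_second(u);
}

constexpr bool fits_in_int32(Unit u) {
  return ticks_per_day(u) <= std::numeric_limits<std::int32_t>::max();
}

// The unit rules as documented for the two factory functions.
constexpr bool valid_for_time32(Unit u) {
  return u == Unit::SECOND || u == Unit::MILLI;
}
constexpr bool valid_for_time64(Unit u) {
  return u == Unit::MICRO || u == Unit::NANO;
}

// A day of milliseconds fits in 32 bits; a day of microseconds does not.
static_assert(fits_in_int32(Unit::MILLI) && !fits_in_int32(Unit::MICRO),
              "32-bit storage covers seconds and milliseconds only");
```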
Concrete type subclasses¶
Primitive¶
- 
class 
NullType: public arrow::DataType, public arrow::NoExtraMeta¶ Concrete type class for always-null data.
- 
class 
BooleanType: public arrow::FixedWidthType, public arrow::NoExtraMeta¶ Concrete type class for boolean data.
- 
class 
Int8Type: public arrow::detail::IntegerTypeImpl<Int8Type, Type::INT8, int8_t>¶ Concrete type class for signed 8-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
Int16Type: public arrow::detail::IntegerTypeImpl<Int16Type, Type::INT16, int16_t>¶ Concrete type class for signed 16-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
Int32Type: public arrow::detail::IntegerTypeImpl<Int32Type, Type::INT32, int32_t>¶ Concrete type class for signed 32-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
Int64Type: public arrow::detail::IntegerTypeImpl<Int64Type, Type::INT64, int64_t>¶ Concrete type class for signed 64-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
UInt8Type: public arrow::detail::IntegerTypeImpl<UInt8Type, Type::UINT8, uint8_t>¶ Concrete type class for unsigned 8-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
UInt16Type: public arrow::detail::IntegerTypeImpl<UInt16Type, Type::UINT16, uint16_t>¶ Concrete type class for unsigned 16-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
UInt32Type: public arrow::detail::IntegerTypeImpl<UInt32Type, Type::UINT32, uint32_t>¶ Concrete type class for unsigned 32-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
UInt64Type: public arrow::detail::IntegerTypeImpl<UInt64Type, Type::UINT64, uint64_t>¶ Concrete type class for unsigned 64-bit integer data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
HalfFloatType: public arrow::detail::CTypeImpl<HalfFloatType, FloatingPoint, Type::HALF_FLOAT, uint16_t>¶ Concrete type class for 16-bit floating-point data.
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
FloatType: public arrow::detail::CTypeImpl<FloatType, FloatingPoint, Type::FLOAT, float>¶ Concrete type class for 32-bit floating-point data (C “float”)
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
class 
DoubleType: public arrow::detail::CTypeImpl<DoubleType, FloatingPoint, Type::DOUBLE, double>¶ Concrete type class for 64-bit floating-point data (C “double”)
Public Functions
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
Binary-like¶
- 
class 
BinaryType: public arrow::DataType, public arrow::NoExtraMeta¶ Concrete type class for variable-size binary data.
Subclassed by arrow::StringType
- 
class 
StringType: public arrow::BinaryType¶ Concrete type class for variable-size string data, utf8-encoded.
- 
class 
FixedSizeBinaryType: public arrow::FixedWidthType, public arrow::ParametricType¶ Concrete type class for fixed-size binary data.
Subclassed by arrow::DecimalType
- 
class 
Decimal128Type: public arrow::DecimalType¶ Concrete type class for 128-bit decimal data.
Nested¶
- 
class 
ListType: public arrow::NestedType¶ Concrete type class for list data.
List data is nested data where each value is a variable number of child items. Lists can be recursively nested, for example list(list(int32)).
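To make "a variable number of child items" concrete, the sketch below stores a list of int32 lists as one flat values vector plus an offsets vector. This follows the offsets-based layout used for list data in the Arrow columnar format, but the struct itself (`ListOfInt32Sketch` and its methods) is a hypothetical illustration, not an Arrow class.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for illustration only.
struct ListOfInt32Sketch {
  std::vector<std::int32_t> offsets{0};  // offsets[i]..offsets[i + 1] bound list i
  std::vector<std::int32_t> values;      // all child items, concatenated

  // Append one list value; an empty list just repeats the last offset.
  void Append(const std::vector<std::int32_t>& items) {
    values.insert(values.end(), items.begin(), items.end());
    offsets.push_back(static_cast<std::int32_t>(values.size()));
  }

  std::size_t length() const { return offsets.size() - 1; }

  // Copy out the child items of list i.
  std::vector<std::int32_t> Get(std::size_t i) const {
    return std::vector<std::int32_t>(values.begin() + offsets[i],
                                     values.begin() + offsets[i + 1]);
  }
};
```

Nesting one level deeper, as in list(list(int32)), would replace the flat `values` vector with another offsets-plus-values pair.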
- 
class 
StructType: public arrow::NestedType¶ Concrete type class for struct data.
Public Functions
- 
std::string 
ToString() const¶ A string representation of the type, including any children.
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
- 
std::shared_ptr<Field> 
GetFieldByName(const std::string &name) const¶ Returns null if name not found.
- 
std::vector<std::shared_ptr<Field>> 
GetAllFieldsByName(const std::string &name) const¶ Return all fields having this name.
- 
int 
GetFieldIndex(const std::string &name) const¶ Returns -1 if name not found or if there are multiple fields having the same name.
- 
std::vector<int> 
GetAllFieldIndices(const std::string &name) const¶ Return the indices of all fields having this name.
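The difference between the single-result and all-results lookups matters when field names repeat. The sketch below mirrors the documented return values over a plain list of names; `all_field_indices` and `field_index` are hypothetical free functions and not the library's implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical: indices of every field whose name matches.
inline std::vector<int> all_field_indices(const std::vector<std::string>& names,
                                          const std::string& name) {
  std::vector<int> out;
  for (int i = 0; i < static_cast<int>(names.size()); ++i) {
    if (names[i] == name) out.push_back(i);
  }
  return out;
}

// Hypothetical: -1 when the name is missing or ambiguous, otherwise the
// unique index, matching the behavior documented for GetFieldIndex above.
inline int field_index(const std::vector<std::string>& names,
                       const std::string& name) {
  const std::vector<int> hits = all_field_indices(names, name);
  return hits.size() == 1 ? hits[0] : -1;
}
```

With names {"a", "b", "a"}, looking up "b" gives 1, while "a" gives -1 from the single-result lookup and {0, 2} from the all-results lookup.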
- 
class 
UnionType: public arrow::NestedType¶ Concrete type class for union data.
Dictionary-encoded¶
- 
class 
DictionaryType: public arrow::FixedWidthType¶ Concrete type class for dictionary data.
Public Functions
- 
std::string 
ToString() const¶ A string representation of the type, including any children.
- 
std::string 
name() const¶ A string name of the type, omitting any child fields.
- Note
 - Experimental API
 - Since
 - 0.7.0
 
Public Static Functions
Unify several dictionary types.
Compute a resulting dictionary that will allow the union of values of all input dictionary types. The input types must all have the same value type.
- Parameters
 - pool: Memory pool to allocate dictionary values from
 - types: A sequence of input dictionary types
 - out_type: The unified dictionary type
 - out_transpose_maps: (optional) A sequence of integer vectors, one per input type. Each integer vector represents the transposition of input type indices into unified type indices.
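The role of the transpose maps is easiest to see on a small case. The sketch below unifies string dictionaries with plain standard containers: it builds one combined dictionary and, for each input, a vector mapping old indices to unified indices. `Dictionary`, `UnifyResult` and `unify` are hypothetical names, and the first-seen ordering of unified values is an assumption of this sketch, not a documented guarantee.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using Dictionary = std::vector<std::string>;  // hypothetical alias

struct UnifyResult {
  Dictionary unified;
  // transpose_maps[k][i] is the unified index of value i of input k.
  std::vector<std::vector<std::int32_t>> transpose_maps;
};

// Hypothetical helper: union of all input values, in first-seen order.
inline UnifyResult unify(const std::vector<Dictionary>& inputs) {
  UnifyResult result;
  std::unordered_map<std::string, std::int32_t> position;
  for (const Dictionary& dict : inputs) {
    std::vector<std::int32_t> map;
    for (const std::string& value : dict) {
      auto it = position.find(value);
      if (it == position.end()) {
        const auto index = static_cast<std::int32_t>(result.unified.size());
        it = position.emplace(value, index).first;
        result.unified.push_back(value);
      }
      map.push_back(it->second);
    }
    result.transpose_maps.push_back(map);
  }
  return result;
}
```

Data encoded against an input dictionary can then be re-encoded against the unified one by replacing each index `i` with `transpose_maps[k][i]`.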
Fields and Schemas¶
Create a Field instance.
- Parameters
 - name: the field name
 - type: the field value type
 - nullable: whether the values are nullable, default true
 - metadata: any custom key-value metadata, default null
Create a Schema instance.
- Return
 - schema shared_ptr to Schema
 - Parameters
 - fields: the schema’s fields
 - metadata: any custom key-value metadata, default null
Create a Schema instance.
- Return
 - schema shared_ptr to Schema
 - Parameters
 - fields: the schema’s fields (rvalue reference)
 - metadata: any custom key-value metadata, default null
- 
class 
Field¶ The combination of a field name and data type, with optional metadata.
Fields are used to describe the individual constituents of a nested DataType or a Schema.
A field’s metadata is represented by a KeyValueMetadata instance, which holds arbitrary key-value pairs.
Public Functions
- 
std::shared_ptr<const KeyValueMetadata> 
metadata() const¶ Return the field’s attached metadata.
- 
bool 
HasMetadata() const¶ Return whether the field has non-empty metadata.
Return a copy of this field with the given metadata attached to it.
- 
std::shared_ptr<Field> 
RemoveMetadata() const¶ Return a copy of this field without any metadata attached to it.
Return a copy of this field with the replaced type.
- 
std::string 
ToString() const¶ Return a string representation of the field.
- 
const std::string &
name() const¶ Return the field name.
- 
bool 
nullable() const¶ Return whether the field is nullable.
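The metadata functions above return copies rather than modifying the field in place. A minimal sketch of that copy-on-change style is shown below with standard containers only. `FieldSketch` and `Metadata` are hypothetical; the sketch omits the data type and is not how the library implements Field.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

using Metadata = std::map<std::string, std::string>;  // hypothetical alias

class FieldSketch {
 public:
  explicit FieldSketch(std::string name, bool nullable = true,
                       std::shared_ptr<const Metadata> metadata = nullptr)
      : name_(std::move(name)),
        nullable_(nullable),
        metadata_(std::move(metadata)) {}

  const std::string& name() const { return name_; }
  bool nullable() const { return nullable_; }

  // True only for attached, non-empty metadata.
  bool HasMetadata() const {
    return metadata_ != nullptr && !metadata_->empty();
  }

  // Both methods leave *this untouched and return a modified copy.
  FieldSketch WithMetadata(std::shared_ptr<const Metadata> metadata) const {
    return FieldSketch(name_, nullable_, std::move(metadata));
  }
  FieldSketch RemoveMetadata() const { return FieldSketch(name_, nullable_); }

 private:
  std::string name_;
  bool nullable_;
  std::shared_ptr<const Metadata> metadata_;
};
```

Because the original object never changes, a field shared between several schemas can be annotated for one of them without affecting the others.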
- 
class 
Schema¶ Sequence of arrow::Field objects describing the columns of a record batch or table data structure.
Public Functions
- 
bool 
Equals(const Schema &other, bool check_metadata = true) const¶ Returns true if all of the schema fields are equal.
- 
std::shared_ptr<Field> 
GetFieldByName(const std::string &name) const¶ Returns null if name not found.
- 
std::vector<std::shared_ptr<Field>> 
GetAllFieldsByName(const std::string &name) const¶ Return all fields having this name.
- 
int 
GetFieldIndex(const std::string &name) const¶ Returns -1 if name not found.
- 
std::vector<int> 
GetAllFieldIndices(const std::string &name) const¶ Return the indices of all fields having this name.
- 
std::shared_ptr<const KeyValueMetadata> 
metadata() const¶ The custom key-value metadata, if any.
- Return
 - metadata may be null
 
- 
std::string 
ToString() const¶ Render a string representation of the schema suitable for debugging.
Replace key-value metadata with new metadata.
- Return
 - new Schema
 - Parameters
 metadata: new KeyValueMetadata
- 
int 
num_fields() const¶ Return the number of fields (columns) in the schema.