Memory (management)
Buffers
-
class
Buffer
¶ Object containing a pointer to a piece of contiguous memory with a particular size.
Buffers have two related notions of length: size and capacity. Size is the number of bytes that might have valid data. Capacity is the number of bytes that were allocated for the buffer in total.
The Buffer base class does not own its memory, but subclasses often do.
The following invariant is always true: Size <= Capacity
Subclassed by arrow::cuda::CudaBuffer, arrow::MutableBuffer, arrow::py::NumPyBuffer, arrow::py::PyBuffer, arrow::py::PyForeignBuffer
Public Functions
-
Buffer
(const uint8_t *data, int64_t size)¶ Construct from buffer and size without copying memory.
- Note
- The passed memory must be kept alive through some other means
- Parameters
data
: a memory buffer
size
: buffer size
-
Buffer
(const std::string &data)¶ Construct from std::string without copying memory.
- Note
- The std::string must stay alive for the lifetime of the Buffer, so temporary rvalue strings must be stored in an lvalue somewhere
- Parameters
data
: a std::string object
Construct a buffer at an offset into data that is owned by another buffer, retaining a valid pointer to that data even after all other shared_ptrs to the parent buffer have been destroyed.
This method makes no assertions about the alignment or padding of the buffer, but in general buffers are expected to be aligned and padded to 64 bytes. In the future, utility methods might be added to help determine whether a buffer satisfies this contract.
-
bool
Equals
(const Buffer &other, int64_t nbytes) const¶ Return true if both buffers are the same size and contain the same bytes up to the number of compared bytes.
-
bool
Equals
(const Buffer &other) const¶ Return true if both buffers are the same size and contain the same bytes.
Copy a section of the buffer into a new Buffer.
Copy a section of the buffer using the default memory pool into a new Buffer.
-
void
ZeroPadding
()¶ Zero bytes in padding, i.e. bytes between size_ and capacity_.
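The relationship between size, capacity, and padding can be illustrated with a small standalone model. The sketch below does not use Arrow: `ToyBuffer` and `MakeToyBuffer` are hypothetical names invented for illustration, and a `std::vector` stands in for the allocated region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a buffer: `storage` holds `capacity` bytes,
// of which only the first `size` may contain valid data.
struct ToyBuffer {
  std::vector<uint8_t> storage;  // storage.size() plays the role of capacity
  int64_t size = 0;

  int64_t capacity() const { return static_cast<int64_t>(storage.size()); }

  // Zero the bytes between size and capacity.
  void ZeroPadding() {
    assert(size >= 0 && size <= capacity());  // invariant: size <= capacity
    if (capacity() > size) {
      std::memset(storage.data() + size, 0,
                  static_cast<std::size_t>(capacity() - size));
    }
  }
};

// Build a toy buffer whose bytes (valid data and padding alike) start as 0xFF.
inline ToyBuffer MakeToyBuffer(int64_t size, int64_t capacity) {
  ToyBuffer buf;
  buf.storage.assign(static_cast<std::size_t>(capacity), 0xFF);
  buf.size = size;
  return buf;
}
```

Calling `ZeroPadding()` on `MakeToyBuffer(5, 64)` leaves the first five bytes untouched and clears the remaining 59.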
-
std::string
ToString
() const¶ Copy buffer contents into a new std::string.
- Return
- std::string
- Note
- Can throw std::bad_alloc if buffer is large
-
const uint8_t *
data
() const¶ Return a pointer to the buffer’s data.
-
uint8_t *
mutable_data
()¶ Return a writable pointer to the buffer’s data.
The buffer has to be mutable. Otherwise, an assertion may fail or a null pointer may be returned.
-
int64_t
size
() const¶ Return the buffer’s size in bytes.
-
int64_t
capacity
() const¶ Return the buffer’s capacity (number of allocated bytes)
Public Static Functions
Construct a new buffer that owns its memory from a std::string.
- Return
- Status message
- Parameters
data
: a std::string object
pool
: a memory pool
out
: the created buffer
Construct a new buffer that owns its memory from a std::string using the default memory pool.
-
static std::shared_ptr<Buffer>
FromString
(std::string &&data)¶ Construct an immutable buffer that takes ownership of the contents of an std::string.
- Return
- a new Buffer instance
- Parameters
data
: an rvalue-reference of a string
-
template <typename T, typename SizeType = int64_t>
static std::shared_ptr<Buffer>
Wrap
(const T *data, SizeType length)¶ Create buffer referencing typed memory with some length without copying.
- Return
- a new shared_ptr<Buffer>
- Parameters
data
: the typed memory as C array
length
: the number of values in the array
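To make the "number of values" versus "number of bytes" distinction concrete, here is a minimal standalone sketch of wrapping typed memory without copying. It does not call Arrow; `ByteView` and `WrapBytes` are hypothetical names, and the sketch assumes the byte size is the value count multiplied by `sizeof(T)`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical non-owning view over raw bytes; it never frees `data`,
// so the wrapped array must outlive the view.
struct ByteView {
  const uint8_t* data;
  int64_t size;  // size in bytes, not in values
};

// Reinterpret `length` values of type T as bytes, without copying.
template <typename T, typename SizeType = int64_t>
ByteView WrapBytes(const T* data, SizeType length) {
  assert(length >= 0);
  return ByteView{reinterpret_cast<const uint8_t*>(data),
                  static_cast<int64_t>(sizeof(T)) * static_cast<int64_t>(length)};
}
```

Wrapping an array of four `int32_t` values yields a view of 16 bytes that points at the original array.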
-
-
class
MutableBuffer
: public arrow::Buffer¶ A Buffer whose contents can be mutated.
May or may not own its data.
Subclassed by arrow::cuda::CudaHostBuffer, arrow::ResizableBuffer
Public Static Functions
-
class
ResizableBuffer
: public arrow::MutableBuffer¶ A mutable buffer that can be resized.
Public Functions
-
virtual Status
Resize
(const int64_t new_size, bool shrink_to_fit = true) = 0¶ Change buffer reported size to indicated size, allocating memory if necessary.
This will ensure that the capacity of the buffer is a multiple of 64 bytes as defined in Layout.md. Consider using ZeroPadding afterwards, to conform to the Arrow layout specification.
- Parameters
new_size
: The new size for the buffer.
shrink_to_fit
: Whether to shrink the capacity if new size < current size
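The rounding of capacity to a multiple of 64 bytes can be sketched in a few lines. This is a standalone illustration, not Arrow's implementation: `RoundUpToMultipleOf64` and `NewCapacity` are hypothetical helpers, and the shrink policy shown is an assumption based on the parameter description above.

```cpp
#include <cassert>
#include <cstdint>

// Round `num_bytes` up to the next multiple of 64. Because 64 is a power of
// two, adding 63 and clearing the low six bits is sufficient.
inline int64_t RoundUpToMultipleOf64(int64_t num_bytes) {
  assert(num_bytes >= 0);
  return (num_bytes + 63) & ~static_cast<int64_t>(63);
}

// Hypothetical capacity policy: always grow when needed, shrink only on request.
inline int64_t NewCapacity(int64_t current_capacity, int64_t new_size,
                           bool shrink_to_fit) {
  const int64_t wanted = RoundUpToMultipleOf64(new_size);
  if (wanted > current_capacity) return wanted;      // must grow
  return shrink_to_fit ? wanted : current_capacity;  // may keep the larger block
}
```

Under this policy a request for 65 bytes yields a 128-byte capacity, and shrinking a 128-byte buffer to 10 bytes keeps 128 bytes of capacity unless `shrink_to_fit` is true.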
-
virtual Status
Memory Pools¶
-
MemoryPool *
arrow
::
default_memory_pool
()¶ Return the process-wide default memory pool.
-
class
MemoryPool
¶ Base class for memory allocation.
Besides tracking the number of allocated bytes, the allocator should also take care of the required 64-byte alignment.
Subclassed by arrow::LoggingMemoryPool, arrow::ProxyMemoryPool, arrow::STLMemoryPool< Allocator >
Public Functions
-
virtual Status
Allocate
(int64_t size, uint8_t **out) = 0¶ Allocate a new memory region of at least size bytes.
The allocated region shall be 64-byte aligned.
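One portable way to obtain a 64-byte aligned region is to over-allocate and then align inside the block. The sketch below illustrates that idea with `std::align` from the standard library; it is not how any Arrow pool is implemented, and `AlignedRegion`, `AllocateAligned64`, and `IsAligned64` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical holder: `backing` owns the over-sized allocation and
// `aligned` points at the first 64-byte boundary inside it.
struct AlignedRegion {
  std::unique_ptr<uint8_t[]> backing;
  uint8_t* aligned = nullptr;
};

// Over-allocate by 63 bytes so that a 64-byte boundary with `size` usable
// bytes after it is guaranteed to exist, then let std::align find it.
inline AlignedRegion AllocateAligned64(std::size_t size) {
  AlignedRegion region;
  std::size_t space = size + 63;
  region.backing.reset(new uint8_t[space]);
  void* ptr = region.backing.get();
  region.aligned = static_cast<uint8_t*>(std::align(64, size, ptr, space));
  assert(region.aligned != nullptr);
  return region;
}

// True if `p` is a multiple of 64.
inline bool IsAligned64(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}
```

The cost of this approach is up to 63 wasted bytes per allocation; platform-specific aligned allocators avoid that overhead.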
-
virtual Status
Reallocate
(int64_t old_size, int64_t new_size, uint8_t **ptr) = 0¶ Resize an already allocated memory section.
Because most default platform allocators do not support aligned reallocation, this function can involve a copy of the underlying data.
-
virtual void
Free
(uint8_t *buffer, int64_t size) = 0¶ Free an allocated region.
- Parameters
buffer
: Pointer to the start of the allocated memory region
size
: Allocated size located at buffer. An allocator implementation may use this for tracking the amount of allocated bytes as well as for faster deallocation if supported by its backend.
-
virtual int64_t
bytes_allocated
() const = 0
The number of bytes that were allocated and not yet freed through this allocator.
-
virtual int64_t
max_memory
() const¶ Return peak memory allocation in this memory pool.
- Return
- Maximum bytes allocated. If not known (or not implemented), returns -1
Public Static Functions
-
static std::unique_ptr<MemoryPool>
CreateDefault
()¶ EXPERIMENTAL. Create a new instance of the default MemoryPool.
-
virtual Status
-
class
LoggingMemoryPool
: public arrow::MemoryPool
Public Functions
-
Status
Allocate
(int64_t size, uint8_t **out)¶ Allocate a new memory region of at least size bytes.
The allocated region shall be 64-byte aligned.
-
Status
Reallocate
(int64_t old_size, int64_t new_size, uint8_t **ptr)¶ Resize an already allocated memory section.
Because most default platform allocators do not support aligned reallocation, this function can involve a copy of the underlying data.
-
void
Free
(uint8_t *buffer, int64_t size)¶ Free an allocated region.
- Parameters
buffer
: Pointer to the start of the allocated memory region
size
: Allocated size located at buffer. An allocator implementation may use this for tracking the amount of allocated bytes as well as for faster deallocation if supported by its backend.
-
int64_t
bytes_allocated
() const
The number of bytes that were allocated and not yet freed through this allocator.
-
int64_t
max_memory
() const¶ Return peak memory allocation in this memory pool.
- Return
- Maximum bytes allocated. If not known (or not implemented), returns -1
-
Status
-
class
ProxyMemoryPool
: public arrow::MemoryPool¶ Derived class for memory allocation.
Tracks the number of bytes and maximum memory allocated through its direct calls. Actual allocation is delegated to MemoryPool class.
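The delegation-plus-tracking pattern described here can be sketched without Arrow. In the standalone example below, `ToyPool`, `MallocPool`, and `TrackingPool` are hypothetical classes, the return type is a plain `bool` rather than a `Status`, and the parent pool uses `malloc` with no 64-byte alignment; only the bookkeeping is the point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical minimal pool interface, loosely modeled on the one above.
class ToyPool {
 public:
  virtual ~ToyPool() = default;
  virtual bool Allocate(int64_t size, uint8_t** out) = 0;
  virtual void Free(uint8_t* buffer, int64_t size) = 0;
};

// Parent pool backed by malloc/free (no alignment guarantee in this sketch).
class MallocPool : public ToyPool {
 public:
  bool Allocate(int64_t size, uint8_t** out) override {
    *out = static_cast<uint8_t*>(std::malloc(static_cast<std::size_t>(size)));
    return *out != nullptr;
  }
  void Free(uint8_t* buffer, int64_t /*size*/) override { std::free(buffer); }
};

// Proxy that delegates to a parent and counts only its own direct calls.
class TrackingPool : public ToyPool {
 public:
  explicit TrackingPool(ToyPool* parent) : parent_(parent) {}

  bool Allocate(int64_t size, uint8_t** out) override {
    if (!parent_->Allocate(size, out)) return false;
    bytes_allocated_ += size;
    max_memory_ = std::max(max_memory_, bytes_allocated_);  // record the peak
    return true;
  }
  void Free(uint8_t* buffer, int64_t size) override {
    parent_->Free(buffer, size);
    bytes_allocated_ -= size;  // the caller-supplied size drives the accounting
  }
  int64_t bytes_allocated() const { return bytes_allocated_; }
  int64_t max_memory() const { return max_memory_; }

 private:
  ToyPool* parent_;
  int64_t bytes_allocated_ = 0;
  int64_t max_memory_ = 0;
};
```

Because the proxy keeps its own counters, several proxies can share one parent pool and each reports only the memory requested through it.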
Public Functions
-
Status
Allocate
(int64_t size, uint8_t **out)¶ Allocate a new memory region of at least size bytes.
The allocated region shall be 64-byte aligned.
-
Status
Reallocate
(int64_t old_size, int64_t new_size, uint8_t **ptr)¶ Resize an already allocated memory section.
Because most default platform allocators do not support aligned reallocation, this function can involve a copy of the underlying data.
-
void
Free
(uint8_t *buffer, int64_t size)¶ Free an allocated region.
- Parameters
buffer
: Pointer to the start of the allocated memory region
size
: Allocated size located at buffer. An allocator implementation may use this for tracking the amount of allocated bytes as well as for faster deallocation if supported by its backend.
-
int64_t
bytes_allocated
() const
The number of bytes that were allocated and not yet freed through this allocator.
-
int64_t
max_memory
() const¶ Return peak memory allocation in this memory pool.
- Return
- Maximum bytes allocated. If not known (or not implemented), returns -1
-
Status
Allocation Functions¶
These functions allocate a buffer from a particular memory pool.
Allocate a fixed size mutable buffer from a memory pool, zero its padding.
- Return
- Status message
- Parameters
pool
: a memory pool
size
: size of buffer to allocate
out
: the allocated buffer (contains padding)
-
Status
arrow
::
AllocateBuffer
(MemoryPool *pool, const int64_t size, std::unique_ptr<Buffer> *out)¶ Allocate a fixed size mutable buffer from a memory pool, zero its padding.
- Return
- Status message
- Parameters
pool
: a memory pool
size
: size of buffer to allocate
out
: the allocated buffer (contains padding)
Allocate a fixed-size mutable buffer from the default memory pool.
- Return
- Status message
- Parameters
size
: size of buffer to allocate
out
: the allocated buffer (contains padding)
-
Status
arrow
::
AllocateBuffer
(const int64_t size, std::unique_ptr<Buffer> *out)¶ Allocate a fixed-size mutable buffer from the default memory pool.
- Return
- Status message
- Parameters
size
: size of buffer to allocate
out
: the allocated buffer (contains padding)
Allocate a resizable buffer from a memory pool, zero its padding.
- Return
- Status message
- Parameters
pool
: a memory pool
size
: size of buffer to allocate
out
: the allocated buffer
-
Status
arrow
::
AllocateResizableBuffer
(MemoryPool *pool, const int64_t size, std::unique_ptr<ResizableBuffer> *out)
Allocate a resizable buffer from a memory pool, zero its padding.
- Return
- Status message
- Parameters
pool
: a memory pool
size
: size of buffer to allocate
out
: the allocated buffer
Allocate a resizable buffer from the default memory pool.
- Return
- Status message
- Parameters
size
: size of buffer to allocate
out
: the allocated buffer
-
Status
arrow
::
AllocateResizableBuffer
(const int64_t size, std::unique_ptr<ResizableBuffer> *out)
Allocate a resizable buffer from the default memory pool.
- Return
- Status message
- Parameters
size
: size of buffer to allocate
out
: the allocated buffer
Allocate a bitmap buffer from a memory pool. No guarantee is provided on the initial values.
- Return
- Status message
- Parameters
pool
: memory pool to allocate memory from
length
: size in bits of bitmap to allocate
out
: the resulting buffer
Allocate a zero-initialized bitmap buffer from a memory pool.
- Return
- Status message
- Parameters
pool
: memory pool to allocate memory from
length
: size in bits of bitmap to allocate
out
: the resulting buffer (zero-initialized).
Allocate a zero-initialized bitmap buffer from the default memory pool.
- Return
- Status message
- Parameters
length
: size in bits of bitmap to allocate
out
: the resulting buffer
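The bitmap allocation functions above take a length in bits, so the byte count has to be derived from it. The standalone sketch below shows that arithmetic with a `std::vector` as storage. All names are hypothetical, and the bit order (bit i stored in byte i / 8 at position i % 8, least-significant bit first) is an assumption of the example rather than something stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes needed to hold `num_bits` bits (rounds up).
inline int64_t BytesForBits(int64_t num_bits) {
  assert(num_bits >= 0);
  return (num_bits + 7) / 8;
}

// Zero-initialized bitmap large enough for `num_bits` bits.
inline std::vector<uint8_t> MakeZeroedBitmap(int64_t num_bits) {
  return std::vector<uint8_t>(static_cast<std::size_t>(BytesForBits(num_bits)), 0);
}

// Assumed layout: bit i lives in byte i / 8 at position i % 8 (LSB first).
inline void SetBit(std::vector<uint8_t>& bitmap, int64_t i) {
  bitmap[static_cast<std::size_t>(i / 8)] |= static_cast<uint8_t>(1u << (i % 8));
}

inline bool GetBit(const std::vector<uint8_t>& bitmap, int64_t i) {
  return ((bitmap[static_cast<std::size_t>(i / 8)] >> (i % 8)) & 1) != 0;
}
```

A 10-bit bitmap therefore occupies two bytes, and the six unused bits of the second byte are part of the allocation. That is why zero-initialization matters when the whole buffer is later compared or serialized.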
Slicing¶
Construct a view on a buffer at the given offset and length.
This function cannot fail and does not check for errors (except in debug builds)
Construct a view on a buffer at the given offset, up to the buffer’s end.
This function cannot fail and does not check for errors (except in debug builds)
Like SliceBuffer, but construct a mutable buffer slice.
If the parent buffer is not mutable, behavior is undefined (it may abort in debug builds).
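The slicing functions above create views rather than copies. A minimal standalone model of such a view is sketched below. `ToySlice` and `SliceBytes` are hypothetical names, a `std::vector` stands in for the parent buffer, and bounds are checked only through `assert`, which mirrors the "except in debug builds" wording.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical slice: a pointer into the parent's bytes plus a shared_ptr
// that keeps the parent alive for as long as the slice exists.
struct ToySlice {
  std::shared_ptr<std::vector<uint8_t>> parent;
  const uint8_t* data;
  int64_t size;
};

// View of `length` bytes starting at `offset`; no bytes are copied.
inline ToySlice SliceBytes(const std::shared_ptr<std::vector<uint8_t>>& parent,
                           int64_t offset, int64_t length) {
  assert(offset >= 0 && length >= 0);
  assert(offset + length <= static_cast<int64_t>(parent->size()));
  return ToySlice{parent, parent->data() + offset, length};
}

// View from `offset` up to the end of the parent.
inline ToySlice SliceBytes(const std::shared_ptr<std::vector<uint8_t>>& parent,
                           int64_t offset) {
  return SliceBytes(parent, offset,
                    static_cast<int64_t>(parent->size()) - offset);
}
```

Because the slice holds its own `shared_ptr` to the parent, its data pointer stays valid even after the caller drops every other reference to the parent.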
Buffer Builders¶
-
class
BufferBuilder
¶ A class for incrementally building a contiguous chunk of in-memory data.
Public Functions
-
Status
Resize
(const int64_t new_capacity, bool shrink_to_fit = true)¶ Resize the buffer to the nearest multiple of 64 bytes.
- Return
- Status
- Parameters
new_capacity
: the new capacity of the builder. Will be rounded up to a multiple of 64 bytes for padding
shrink_to_fit
: if new capacity is smaller than the existing size, reallocate internal buffer. Set to false to avoid reallocations when shrinking the builder.
-
Status
Reserve
(const int64_t additional_bytes, bool grow_by_factor = false)¶ Ensure that builder can accommodate the additional number of bytes without the need to perform allocations.
- Return
- Status
- Parameters
additional_bytes
: number of additional bytes to make space for
grow_by_factor
: if true, round up allocations using the strategy in BufferBuilder::GrowByFactor
-
Status
Append
(const void *data, const int64_t length)¶ Append the given data to the buffer.
The buffer is automatically expanded if necessary.
-
Status
Append
(const int64_t num_copies, uint8_t value)¶ Append copies of a value to the buffer.
The buffer is automatically expanded if necessary.
-
template <size_t NBYTES>
Status
Append
(const std::array<uint8_t, NBYTES> &data)¶ Append the given data to the buffer.
The buffer is automatically expanded if necessary.
Return result of builder as a Buffer object.
The builder is reset and can be reused afterwards.
Public Static Functions
-
static int64_t
GrowByFactor
(const int64_t min_capacity)¶ Return a capacity expanded by a growth factor of 2.
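To show why growing by a factor of 2 is useful for incremental building, the standalone sketch below counts reallocations under a doubling policy. It is only an illustration: `GrowByDoubling` and `CountReallocations` are hypothetical, the starting capacity of 64 bytes is an assumption, and the exact rounding used by BufferBuilder::GrowByFactor may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical growth policy: double from 64 until `min_capacity` fits.
// (Overflow for extremely large requests is not handled in this sketch.)
inline int64_t GrowByDoubling(int64_t min_capacity) {
  assert(min_capacity >= 0);
  int64_t capacity = 64;
  while (capacity < min_capacity) capacity *= 2;
  return capacity;
}

// Count how many reallocations a run of one-byte appends would trigger
// when the capacity is grown with GrowByDoubling.
inline int CountReallocations(int64_t num_appends) {
  int64_t capacity = 0;
  int reallocations = 0;
  for (int64_t size = 1; size <= num_appends; ++size) {
    if (size > capacity) {
      capacity = GrowByDoubling(size);
      ++reallocations;
    }
  }
  return reallocations;
}
```

With this policy, 1000 one-byte appends trigger 5 reallocations (capacities 64, 128, 256, 512, 1024), whereas growing to exactly the required size each time would reallocate on every append.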
-
Status
-
template <typename T, typename Enable = void>
class
TypedBufferBuilder
STL Integration¶
-
template <class T>
class
stl_allocator
An STL allocator delegating allocations to an Arrow MemoryPool.
Public Functions
-
stl_allocator
()¶ Construct an allocator from the default MemoryPool.
-
stl_allocator
(MemoryPool *pool)¶ Construct an allocator from the given MemoryPool.
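The shape of an STL allocator that forwards to a pool can be sketched with the standard library alone. In the example below, `CountingPool` and `PoolAllocator` are hypothetical stand-ins (the "pool" only counts live bytes and the memory comes from `malloc`), so it shows the allocator plumbing rather than Arrow's own class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for a memory pool: it only counts live bytes.
struct CountingPool {
  int64_t bytes_allocated = 0;
};

// Minimal STL-compatible allocator that reports every allocation to a pool.
template <class T>
class PoolAllocator {
 public:
  using value_type = T;

  explicit PoolAllocator(CountingPool* pool) : pool_(pool) {}

  // Converting constructor, needed when a container rebinds the allocator.
  template <class U>
  PoolAllocator(const PoolAllocator<U>& other) : pool_(other.pool()) {}

  T* allocate(std::size_t n) {
    void* p = std::malloc(n * sizeof(T));
    if (p == nullptr) throw std::bad_alloc();
    pool_->bytes_allocated += static_cast<int64_t>(n * sizeof(T));
    return static_cast<T*>(p);
  }
  void deallocate(T* p, std::size_t n) {
    std::free(p);
    pool_->bytes_allocated -= static_cast<int64_t>(n * sizeof(T));
  }
  CountingPool* pool() const { return pool_; }

 private:
  CountingPool* pool_;
};

// Allocators compare equal when they share a pool.
template <class T, class U>
bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) {
  return a.pool() == b.pool();
}
template <class T, class U>
bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) {
  return !(a == b);
}
```

A `std::vector<int32_t, PoolAllocator<int32_t>>` constructed with such an allocator charges its storage to the pool and returns it when the vector is destroyed.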
-
-
template <typename Allocator = std::allocator<uint8_t>>
class
STLMemoryPool
: public arrow::MemoryPool
A MemoryPool implementation delegating allocations to an STL allocator.
Note that STL allocators don’t provide a resizing operation, and therefore any buffer resizes will do a full reallocation and copy.
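The "full reallocation and copy" mentioned above follows from the STL allocator interface, which offers only `allocate` and `deallocate`. A minimal sketch of emulating a resize on top of that interface is shown below; `ReallocateWithCopy` is a hypothetical helper, not part of Arrow, and it assumes the allocator's pointer type is a plain `uint8_t*`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: emulate "reallocate" with an STL allocator.
template <typename Allocator = std::allocator<uint8_t>>
uint8_t* ReallocateWithCopy(Allocator& alloc, uint8_t* old_ptr,
                            std::size_t old_size, std::size_t new_size) {
  assert(new_size > 0);
  uint8_t* new_ptr = alloc.allocate(new_size);      // 1. allocate a fresh block
  std::copy(old_ptr, old_ptr + std::min(old_size, new_size),
            new_ptr);                               // 2. copy the surviving bytes
  alloc.deallocate(old_ptr, old_size);              // 3. release the old block
  return new_ptr;
}
```

Every resize therefore costs time proportional to the amount of data kept, which is worth bearing in mind when pairing such a pool with frequently resized buffers.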
Public Functions
-
STLMemoryPool
(const Allocator &alloc)¶ Construct a memory pool from the given allocator.
-
Status
Allocate
(int64_t size, uint8_t **out)¶ Allocate a new memory region of at least size bytes.
The allocated region shall be 64-byte aligned.
-
Status
Reallocate
(int64_t old_size, int64_t new_size, uint8_t **ptr)¶ Resize an already allocated memory section.
Because most default platform allocators do not support aligned reallocation, this function can involve a copy of the underlying data.
-
void
Free
(uint8_t *buffer, int64_t size)¶ Free an allocated region.
- Parameters
buffer
: Pointer to the start of the allocated memory region
size
: Allocated size located at buffer. An allocator implementation may use this for tracking the amount of allocated bytes as well as for faster deallocation if supported by its backend.
-
int64_t
bytes_allocated
() const
The number of bytes that were allocated and not yet freed through this allocator.
-
int64_t
max_memory
() const¶ Return peak memory allocation in this memory pool.
- Return
- Maximum bytes allocated. If not known (or not implemented), returns -1
-