pyarrow.plasma.PlasmaClient
class pyarrow.plasma.PlasmaClient
Bases: object
The PlasmaClient is used to interface with a plasma store and manager.
The PlasmaClient can ask the PlasmaStore to allocate a new buffer, seal a buffer, and get a buffer. Buffers are referred to by object IDs, which are strings.
-
__init__()
Initialize self. See help(type(self)) for accurate signature.
Methods
- contains(self, ObjectID object_id) – Check if the object is present and sealed in the PlasmaStore.
- create(self, ObjectID object_id, …) – Create a new buffer in the PlasmaStore for a particular object ID.
- create_and_seal(self, ObjectID object_id, …) – Store a new object in the PlasmaStore for a particular object ID.
- decode_notification(self, const uint8_t *buf) – Get the notification from the buffer.
- delete(self, object_ids) – Delete the objects with the given IDs from the object store.
- disconnect(self) – Disconnect this client from the Plasma store.
- evict(self, int64_t num_bytes) – Evict objects from the store to recover some bytes.
- get(self, object_ids, int timeout_ms=-1[, …]) – Get one or more Python values from the object store.
- get_buffers(self, object_ids[, timeout_ms, …]) – Returns data buffer from the PlasmaStore based on object ID.
- get_metadata(self, object_ids[, timeout_ms]) – Returns metadata buffer from the PlasmaStore based on object ID.
- get_next_notification(self) – Get the next notification from the notification socket.
- get_notification_socket(self) – Get the notification socket.
- hash(self, ObjectID object_id) – Compute the checksum of an object in the object store.
- list(self) – Experimental: List the objects in the store.
- put(self, value, ObjectID object_id=None, …) – Store a Python value into the object store.
- put_raw_buffer(self, value, …) – Store a Python buffer into the object store.
- seal(self, ObjectID object_id) – Seal the buffer in the PlasmaStore for a particular object ID.
- store_capacity(self) – Get the memory capacity of the store.
- subscribe(self) – Subscribe to notifications about sealed objects.
- to_capsule(self)
Attributes
- store_socket_name
-
contains(self, ObjectID object_id)
Check if the object is present and sealed in the PlasmaStore.
Parameters: object_id (ObjectID) – A string used to identify an object.
-
create(self, ObjectID object_id, int64_t data_size, string metadata=b'')
Create a new buffer in the PlasmaStore for a particular object ID.
The returned buffer is mutable until seal is called.
Parameters: - object_id (ObjectID) – The object ID used to identify an object.
- data_size (int) – The size in bytes of the created buffer.
- metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
Raises: - PlasmaObjectExists – This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.
- PlasmaStoreFull – This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
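The create, write, seal sequence described above can be sketched as a small helper. This is a minimal illustration, not the library's own code: the name write_object is hypothetical, and the helper assumes the buffer returned by create supports the Python buffer protocol so that it can be wrapped in a memoryview. The connection call in the trailing comment (plasma.connect with a socket path, and a 20-byte ObjectID) is likewise an assumption about a typical setup.

```python
def write_object(client, object_id, payload, metadata=b""):
    """Create a buffer sized for `payload`, copy the bytes in, then seal it.

    `client` is expected to behave like a PlasmaClient (create and seal).
    """
    # create() returns a buffer that stays mutable until seal() is called.
    buf = client.create(object_id, len(payload), metadata)
    # Cast to unsigned bytes so the slice assignment accepts a bytes payload.
    view = memoryview(buf).cast("B")
    view[:] = payload
    # After sealing, the buffer is immutable.
    client.seal(object_id)
    return object_id


# Hypothetical usage against a running store (socket path is an assumption):
#   from pyarrow import plasma
#   client = plasma.connect("/tmp/plasma")
#   write_object(client, plasma.ObjectID(20 * b"a"), b"hello")
```

If the store already holds an object with the same ID, or cannot make room, the call to create would raise the exceptions listed above before any bytes are written.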
-
create_and_seal(self, ObjectID object_id, string data, string metadata=b'')
Store a new object in the PlasmaStore for a particular object ID.
Parameters: - object_id (ObjectID) – The object ID used to identify an object.
- data (bytes) – The object to store.
- metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
Raises: - PlasmaObjectExists – This exception is raised if the object could not be created because there already is an object with the same ID in the plasma store.
- PlasmaStoreFull – This exception is raised if the object could not be created because the plasma store is unable to evict enough objects to create room for it.
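One way to avoid the common case of PlasmaObjectExists is to check contains before storing. The helper below is a sketch under that assumption; store_if_absent is a hypothetical name, not part of pyarrow.

```python
def store_if_absent(client, object_id, data, metadata=b""):
    """Store `data` under `object_id` unless a sealed object already exists.

    Returns True if the object was stored by this call, False otherwise.
    `client` is expected to behave like a PlasmaClient.
    """
    # contains() reports only objects that are present *and* sealed.
    if client.contains(object_id):
        return False
    # Not atomic with the check above: another client may create the same
    # ID in between, in which case create_and_seal can still raise.
    client.create_and_seal(object_id, data, metadata)
    return True
```

Because the check and the store are two separate calls, callers that share IDs across processes would still need to handle PlasmaObjectExists.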
-
decode_notification(self, const uint8_t *buf)
Get the notification from the buffer.
Returns: - ObjectID – The object ID of the object that was stored.
- int – The data size of the object that was stored.
- int – The metadata size of the object that was stored.
-
delete(self, object_ids)
Delete the objects with the given IDs from the object store.
Parameters: object_ids (list) – A list of strings used to identify the objects.
-
disconnect(self)
Disconnect this client from the Plasma store.
-
evict(self, int64_t num_bytes)
Evict objects from the store to recover some bytes.
Recover at least num_bytes bytes if possible.
Parameters: num_bytes (int) – The number of bytes to attempt to recover.
-
get(self, object_ids, int timeout_ms=-1, serialization_context=None)
Get one or more Python values from the object store.
Parameters: - object_ids (list or ObjectID) – Object ID or list of object IDs associated with the values to get from the store.
- timeout_ms (int, default -1) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.
- serialization_context (pyarrow.SerializationContext, default None) – Custom serialization and deserialization context.
Returns: list or object – Python value or list of Python values for the data associated with the object_ids, with ObjectNotAvailable in place of any object that was not available.
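Since get signals a missing object by returning ObjectNotAvailable rather than raising, callers usually need to test for it. The sketch below passes the sentinel in explicitly so the helper has no import-time dependency; get_or_default is a hypothetical name, and the assumption that the sentinel is reachable as pyarrow.plasma.ObjectNotAvailable and can be compared by identity is not confirmed by this page.

```python
def get_or_default(client, object_id, not_available, default=None, timeout_ms=0):
    """Fetch one value, returning `default` if the store reports it missing.

    `not_available` is the sentinel that get() returns for a missing object
    (assumed here to be pyarrow.plasma.ObjectNotAvailable).
    """
    # timeout_ms=0 returns immediately; -1 would block until available.
    value = client.get(object_id, timeout_ms=timeout_ms)
    if value is not_available:
        return default
    return value


# Hypothetical usage (names outside this helper are assumptions):
#   from pyarrow import plasma
#   value = get_or_default(client, object_id, plasma.ObjectNotAvailable)
```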
-
get_buffers(self, object_ids, timeout_ms=-1, with_meta=False)
Returns data buffer from the PlasmaStore based on object ID.
If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.
Parameters: - object_ids (list) – A list of ObjectIDs used to identify some objects.
- timeout_ms (int) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.
- with_meta (bool) – Whether to return each object's metadata along with its data buffer (see Returns).
Returns: list – If with_meta=False, this is a list of PlasmaBuffers for the data associated with the object_ids and None if the object was not available. If with_meta=True, this is a list of tuples of PlasmaBuffer and metadata bytes.
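With with_meta left at its default, a missing object shows up as None in the returned list. A minimal sketch of reading one object's raw bytes on that basis follows; read_bytes is a hypothetical name, and the helper assumes the returned buffer supports the Python buffer protocol.

```python
def read_bytes(client, object_id, timeout_ms=0):
    """Return the raw bytes of one object, or None if it is not available.

    `client` is expected to behave like a PlasmaClient.
    """
    # get_buffers() takes a list of IDs and returns a list of the same length.
    (buf,) = client.get_buffers([object_id], timeout_ms=timeout_ms)
    if buf is None:
        return None
    # Copy out of the store's buffer into an ordinary bytes object.
    return bytes(memoryview(buf))
```

Copying into bytes gives up the zero-copy benefit of the returned buffer; code that only needs to read the data in place could keep the memoryview instead.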
-
get_metadata(self, object_ids, timeout_ms=-1)
Returns metadata buffer from the PlasmaStore based on object ID.
If the object has not been sealed yet, this call will block. The retrieved buffer is immutable.
Parameters: - object_ids (list) – A list of ObjectIDs used to identify some objects.
- timeout_ms (int) – The number of milliseconds that the get call should block before timing out and returning. Pass -1 if the call should block and 0 if the call should return immediately.
Returns: list – List of PlasmaBuffers for the metadata associated with the object_ids and None if the object was not available.
-
get_next_notification(self)
Get the next notification from the notification socket.
Returns: - ObjectID – The object ID of the object that was stored.
- int – The data size of the object that was stored.
- int – The metadata size of the object that was stored.
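A sketch of how subscribe and get_next_notification might be combined to wait for a known set of objects is shown below. The helper name wait_for_sealed and the max_notifications bound are hypothetical; the bound limits how many notifications are read, not how long the call may block.

```python
def wait_for_sealed(client, wanted_ids, max_notifications=1000):
    """Read notifications until every ID in `wanted_ids` has been seen.

    Returns a dict mapping each seen object ID to (data_size, metadata_size).
    `client` is expected to behave like a PlasmaClient.
    """
    # Notifications are only delivered after subscribing.
    client.subscribe()
    pending = set(wanted_ids)
    seen = {}
    for _ in range(max_notifications):
        if not pending:
            break
        # Each notification is an (object ID, data size, metadata size) triple.
        object_id, data_size, metadata_size = client.get_next_notification()
        if object_id in pending:
            pending.discard(object_id)
            seen[object_id] = (data_size, metadata_size)
    return seen
```

Notifications for objects outside wanted_ids are read and discarded, so this helper suits a client that does not need them for anything else.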
-
get_notification_socket(self)
Get the notification socket.
-
hash(self, ObjectID object_id)
Compute the checksum of an object in the object store.
Parameters: object_id (ObjectID) – A string used to identify an object.
Returns: bytes – A digest string of the object’s hash. If the object isn’t in the object store, the string will have length zero.
-
list(self)
Experimental: List the objects in the store.
Returns: dict – Dictionary from ObjectIDs to an “info” dictionary describing the object. The “info” dictionary has the following entries:
- data_size – size of the object in bytes
- metadata_size – size of the object metadata in bytes
- ref_count – number of clients referencing the object buffer
- create_time – Unix timestamp of the creation of the object
- construct_duration – time the creation of the object took in seconds
- state – “created” if the object is still being created and “sealed” if it is already sealed
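As an illustration of consuming the "info" dictionaries, the sketch below totals the bytes held by sealed objects. The helper name sealed_bytes is hypothetical; the keys it reads are the ones listed above.

```python
def sealed_bytes(client):
    """Sum data and metadata sizes over all sealed objects in the store.

    `client` is expected to behave like a PlasmaClient.
    """
    total = 0
    for info in client.list().values():
        # Objects still being created report state "created"; skip those.
        if info["state"] == "sealed":
            total += info["data_size"] + info["metadata_size"]
    return total
```

Comparing this total with the value returned by store_capacity gives a rough picture of how full the store is, keeping in mind that list is marked experimental.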
-
put(self, value, ObjectID object_id=None, int memcopy_threads=6, serialization_context=None)
Store a Python value into the object store.
Parameters: - value (object) – A Python object to store.
- object_id (ObjectID, default None) – If this is provided, the specified object ID will be used to refer to the object.
- memcopy_threads (int, default 6) – The number of threads to use to write the serialized object into the object store for large objects.
- serialization_context (pyarrow.SerializationContext, default None) – Custom serialization and deserialization context.
Returns: The object ID associated to the Python object.
-
put_raw_buffer(self, value, ObjectID object_id=None, string metadata=b'', int memcopy_threads=6)
Store a Python buffer into the object store.
Parameters: - value (Python object that implements the buffer protocol) – A Python buffer object to store.
- object_id (ObjectID, default None) – If this is provided, the specified object ID will be used to refer to the object.
- metadata (bytes) – An optional string of bytes encoding whatever metadata the user wishes to encode.
- memcopy_threads (int, default 6) – The number of threads to use to write the serialized object into the object store for large objects.
Returns: The object ID associated to the Python buffer object.
-
seal(self, ObjectID object_id)
Seal the buffer in the PlasmaStore for a particular object ID.
Once a buffer has been sealed, the buffer is immutable and can only be accessed through get.
Parameters: object_id (ObjectID) – A string used to identify an object.
-
store_capacity(self)
Get the memory capacity of the store.
Returns: int – The memory capacity of the store in bytes.
-
store_socket_name
-
subscribe(self)
Subscribe to notifications about sealed objects.
-
to_capsule(self)
-