ParquetFormatter
- class lsst.daf.butler.formatters.parquet.ParquetFormatter(file_descriptor: FileDescriptor, *, ref: DatasetRef, write_parameters: Mapping[str, Any] | None = None, write_recipes: Mapping[str, Any] | None = None, **kwargs: Any)
Bases: FormatterV2
Interface for reading and writing Arrow Table objects to and from Parquet files.
Attributes Summary
can_read_from_local_file
Declare whether read_from_local_file is available to this formatter.
default_extension
Default extension to use when writing a file.
Methods Summary
can_accept(in_memory_dataset)
Indicate whether this formatter can accept the specified storage class directly.
read_from_local_file(path[, component, ...])
Read a dataset from a URI guaranteed to refer to the local file system.
write_local_file(in_memory_dataset, uri)
Serialize the in-memory dataset to a local file.
Attributes Documentation
- can_read_from_local_file: ClassVar[bool] = True
Declare whether read_from_local_file is available to this formatter.
- default_extension: ClassVar[str | None] = '.parq'
Default extension to use when writing a file.
Can be None if the extension is determined dynamically. Use the get_write_extension method to get the actual extension to use.
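The distinction between the class-level default and the extension actually used can be sketched as follows. This is a minimal illustration with hypothetical stand-in classes, not the real FormatterV2 or ParquetFormatter. Only the names default_extension and get_write_extension come from this page; the `compressed` flag, the `.parq.gz` suffix, and the error raised when no default exists are assumptions made for the example.

```python
from typing import ClassVar, Optional


class SketchFormatter:
    """Hypothetical stand-in; not part of lsst.daf.butler."""

    default_extension: ClassVar[Optional[str]] = ".parq"

    def get_write_extension(self) -> str:
        # Assumed behavior: use the class default whenever one is declared.
        if self.default_extension is None:
            raise NotImplementedError("Subclass must choose an extension dynamically.")
        return self.default_extension


class DynamicSketchFormatter(SketchFormatter):
    """Hypothetical subclass whose extension depends on runtime state."""

    # None signals that the extension is only known at write time.
    default_extension: ClassVar[Optional[str]] = None

    def __init__(self, compressed: bool) -> None:
        self.compressed = compressed

    def get_write_extension(self) -> str:
        # The ".parq.gz" suffix is invented purely for illustration.
        return ".parq.gz" if self.compressed else ".parq"
```

Callers that need a file name would therefore ask the instance for its extension rather than reading the class attribute directly.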
Methods Documentation
- can_accept(in_memory_dataset: Any) → bool
Indicate whether this formatter can accept the specified storage class directly.
- Parameters:
- in_memory_dataset (object)
The dataset that is to be written.
- Returns:
Notes
The base class always returns False even if the given type is an instance of the storage class type. This will result in a storage class conversion no-op but also allows mocks with mocked storage classes to work properly.
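How a caller might use the result of can_accept can be sketched as below. The classes and the `prepare_for_write` helper are hypothetical stand-ins written for this example, not Butler code. The sketch assumes only what the notes state: the base implementation returns False, and a False result leads to a conversion step before writing.

```python
from typing import Any, Callable


class BaseSketchFormatter:
    """Hypothetical stand-in for a formatter base class."""

    def can_accept(self, in_memory_dataset: Any) -> bool:
        # Mirrors the documented base behavior: never accept directly.
        return False


class ListSketchFormatter(BaseSketchFormatter):
    """Hypothetical subclass that can write lists without conversion."""

    def can_accept(self, in_memory_dataset: Any) -> bool:
        return isinstance(in_memory_dataset, list)


def prepare_for_write(
    formatter: BaseSketchFormatter,
    dataset: Any,
    convert: Callable[[Any], Any],
) -> Any:
    """Convert the dataset only when the formatter cannot take it as-is."""
    if formatter.can_accept(dataset):
        return dataset
    return convert(dataset)
```

With the base class the conversion callable always runs, which matches the "conversion no-op" case in the notes when the dataset is already the right type.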
- read_from_local_file(path: str, component: str | None = None, expected_size: int = -1) → Any
Read a dataset from a URI guaranteed to refer to the local file system.
- Parameters:
- Returns:
- in_memory_dataset (object or NotImplemented)
The Python object read from the resource or NotImplemented.
- Raises:
- FormatterNotImplementedError
Raised if there is no implementation written to read data from a local file.
Notes
This method will only be called if the class property can_read_from_local_file is True and other options were not used.
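The interplay between the can_read_from_local_file flag and a NotImplemented return value can be sketched as follows. `LocalReadSketch` and `read_dataset` are hypothetical names, and the caller logic is an assumption about how such a flag might be consumed, not the actual Butler datastore code. Plain text files stand in for Parquet so that the example needs only the standard library, and the "length" component is invented.

```python
from typing import Any, Optional


class LocalReadSketch:
    """Hypothetical stand-in; not the real ParquetFormatter."""

    can_read_from_local_file = True

    def read_from_local_file(
        self, path: str, component: Optional[str] = None, expected_size: int = -1
    ) -> Any:
        # Signal "cannot handle this" rather than raising.
        if not path.endswith(".txt"):
            return NotImplemented
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        # "length" is an invented component used only for this example.
        if component == "length":
            return len(text)
        return text


def read_dataset(formatter: Any, path: str, component: Optional[str] = None) -> Any:
    """Assumed caller logic: consult the class flag, then check the result."""
    if formatter.can_read_from_local_file:
        result = formatter.read_from_local_file(path, component=component)
        if result is not NotImplemented:
            return result
    raise RuntimeError(f"No reader could handle {path}")
```

Returning NotImplemented lets the caller fall through to another strategy, whereas an exception would abort the read outright.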
- write_local_file(in_memory_dataset: Any, uri: ResourcePath) → None
Serialize the in-memory dataset to a local file.
- Parameters:
- in_memory_dataset (object)
The Python object to serialize.
- uri (ResourcePath)
The URI to use when writing the file.
- Raises:
- FormatterNotImplementedError
Raised if the formatter subclass has not implemented this method or has failed to implement the to_bytes method.
Notes
By default this method will attempt to call to_bytes and then write these bytes to the file.
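The default behavior described in the notes, serialize with to_bytes and then write the bytes, can be sketched like this. `BytesWriterSketch` is a hypothetical stand-in, `pathlib.Path` replaces ResourcePath so the example runs without Butler installed, and JSON is used as an assumed serialization purely for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


class BytesWriterSketch:
    """Hypothetical stand-in showing the to_bytes fallback pattern."""

    def to_bytes(self, in_memory_dataset: Any) -> bytes:
        # JSON is an assumption for this sketch, not what ParquetFormatter does.
        return json.dumps(in_memory_dataset, sort_keys=True).encode("utf-8")

    def write_local_file(self, in_memory_dataset: Any, path: Path) -> None:
        # Default strategy: serialize in memory, then write in one call.
        payload = self.to_bytes(in_memory_dataset)
        path.write_bytes(payload)


def round_trip(dataset: Any) -> Any:
    """Write the dataset to a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "dataset.json"
        BytesWriterSketch().write_local_file(dataset, target)
        return json.loads(target.read_bytes())
```

A formatter that can stream directly to disk would override write_local_file instead, avoiding the intermediate in-memory copy that this fallback requires.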