ParquetFormatter
class lsst.daf.butler.formatters.parquet.ParquetFormatter(fileDescriptor: FileDescriptor, dataId: DataCoordinate, writeParameters: Optional[Dict[str, Any]] = None, writeRecipes: Optional[Dict[str, Any]] = None)
Bases: lsst.daf.butler.Formatter
Interface for reading and writing Pandas DataFrames to and from Parquet files.
This formatter is for the DataFrame StorageClass.
Attributes Summary
- dataId : Return Data ID associated with this formatter (DataCoordinate).
- extension
- fileDescriptor : File descriptor associated with this formatter (FileDescriptor).
- supportedExtensions
- supportedWriteParameters
- unsupportedParameters
- writeParameters : Parameters to use when writing out datasets.
- writeRecipes : Detailed write Recipes indexed by recipe name.
Methods Summary
- can_read_bytes() : Indicate if this formatter can format from bytes.
- fromBytes(serializedDataset, component, …) : Read serialized data into a Dataset or its component.
- makeUpdatedLocation(location) : Return a new Location updated with this formatter’s extension.
- name() : Return the fully qualified name of the formatter.
- predictPath() : Return the path that would be returned by write.
- read(component) : Read a Dataset.
- segregateParameters(parameters) : Segregate the supplied parameters.
- toBytes(inMemoryDataset) : Serialize the Dataset to bytes based on formatter.
- validateExtension(location) : Check the extension of the provided location for compatibility.
- validateWriteRecipes(recipes) : Validate supplied recipes for this formatter.
- write(inMemoryDataset) : Write a Dataset.
Attributes Documentation
dataId
  Return Data ID associated with this formatter (DataCoordinate).
extension = '.parq'
fileDescriptor
  File descriptor associated with this formatter (FileDescriptor). Read-only property.
supportedExtensions = frozenset()
supportedWriteParameters = None
unsupportedParameters = frozenset()
writeParameters
  Parameters to use when writing out datasets.
writeRecipes
  Detailed write Recipes indexed by recipe name.
Methods Documentation
classmethod can_read_bytes() → bool
Indicate if this formatter can format from bytes.
fromBytes(serializedDataset: bytes, component: Optional[str] = None) → object
Read serialized data into a Dataset or its component.
Returns:
- inMemoryDataset : object
  The requested data as a Python object. The type of object is controlled by the specific formatter.
makeUpdatedLocation(location: lsst.daf.butler.core.location.Location) → lsst.daf.butler.core.location.Location
Return a new Location updated with this formatter’s extension.
Parameters:
- location : Location
  The location to update.
Returns:
- updated : Location
  A new Location with a new file extension applied.
Raises:
- NotImplementedError
  Raised if there is no extension attribute associated with this formatter.
Notes
This method is available to all Formatters but might not be implemented by all formatters. It requires that a formatter set an extension attribute containing the file extension used when writing files. If extension is None, the supplied file will not be updated. Not all formatters write files, so this is not defined in the base class.
classmethod name() → str
Return the fully qualified name of the formatter.
Returns:
- name : str
  Fully-qualified name of formatter class.
predictPath() → str
Return the path that would be returned by write.
Does not write any data file.
Uses the FileDescriptor associated with the instance.
Returns:
- path : str
  Path within datastore that would be associated with the location stored in this Formatter.
read(component: Optional[str] = None) → Any
Read a Dataset.
Parameters:
- component : str, optional
  Component to read from the file. Only used if the StorageClass for reading differs from the StorageClass used to write the file.
Returns:
- inMemoryDataset : object
  The requested Dataset.
segregateParameters(parameters: Optional[Dict[str, Any]] = None) → Tuple[Dict, Dict]
Segregate the supplied parameters.
This splits the parameters into those understood by the formatter and those not understood by the formatter.
Any unsupported parameters are assumed to be usable by associated assemblers.
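The split described above can be sketched in a few lines. This is an illustration, not the Butler implementation: `segregate` is a hypothetical free function, and it assumes the split is driven by a set of parameter names the formatter cannot handle (similar in spirit to the unsupportedParameters attribute), which the text above does not state explicitly.

```python
from typing import Any, Dict, FrozenSet, Optional, Tuple


def segregate(
    parameters: Optional[Dict[str, Any]],
    unsupported_names: FrozenSet[str],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split ``parameters`` into (supported, unsupported) dicts.

    Hypothetical helper: a parameter counts as unsupported when its name
    appears in ``unsupported_names``; everything else is treated as
    understood by the formatter.
    """
    if parameters is None:
        # Nothing supplied: both halves are empty.
        return {}, {}
    supported = {k: v for k, v in parameters.items() if k not in unsupported_names}
    unsupported = {k: v for k, v in parameters.items() if k in unsupported_names}
    return supported, unsupported


supported, leftover = segregate(
    {"columns": ["ra", "dec"], "bbox": (0, 0, 10, 10)},
    frozenset({"bbox"}),
)
```

In this example `columns` stays with the formatter and `bbox` is handed on, matching the idea that unsupported parameters are left for an assembler to use. The parameter names are invented for the example.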
toBytes(inMemoryDataset: Any) → bytes
Serialize the Dataset to bytes based on formatter.
Parameters:
- inMemoryDataset : object
  The Python object to serialize.
Returns:
- serializedDataset : bytes
  Bytes representing the serialized dataset.
classmethod validateExtension(location: lsst.daf.butler.core.location.Location) → None
Check the extension of the provided location for compatibility.
Parameters:
- location : Location
  Location from which to extract a file extension.
Raises:
- NotImplementedError
  Raised if file extensions are a concept not understood by this formatter.
- ValueError
  Raised if the formatter does not understand this extension.
Notes
This method is available to all Formatters but might not be implemented by all formatters. It requires that a formatter set an extension attribute containing the file extension used when writing files. If extension is None, only the set of supported extensions will be examined.
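A minimal sketch of the check described in the Notes follows. It is an assumption-laden illustration rather than the Butler code: `check_extension` is a hypothetical function working on path strings, and it extracts only the final suffix via `pathlib`, so compound extensions are not handled. The NotImplementedError case is omitted.

```python
from pathlib import PurePosixPath
from typing import FrozenSet, Optional


def check_extension(
    path: str,
    extension: Optional[str],
    supported: FrozenSet[str] = frozenset(),
) -> None:
    """Raise ValueError if the suffix of ``path`` is not acceptable.

    Hypothetical helper: the suffix is accepted when it equals the default
    ``extension`` or is a member of ``supported``. When ``extension`` is
    None only ``supported`` is consulted.
    """
    suffix = PurePosixPath(path).suffix  # final suffix only, e.g. ".parq"
    if extension is not None and suffix == extension:
        return
    if suffix in supported:
        return
    raise ValueError(f"Extension {suffix!r} of {path!r} is not supported")


# Accepted: matches the default extension.
check_extension("run1/table.parq", ".parq")

# Rejected: neither the default nor in the supported set.
try:
    check_extension("run1/table.csv", ".parq")
    rejected = False
except ValueError:
    rejected = True
```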
classmethod validateWriteRecipes(recipes: Optional[collections.abc.Mapping[str, Any]]) → Optional[collections.abc.Mapping[str, Any]]
Validate supplied recipes for this formatter.
The recipes are supplemented with default values where appropriate.
Parameters:
- recipes : dict
  Recipes to validate.
Returns:
- validated : dict
  Validated recipes.
Raises:
- RuntimeError
  Raised if validation fails. The default implementation raises if any recipes are given.
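The default behaviour stated under Raises ("raises if any recipes are given") can be pictured with the sketch below. `validate_write_recipes` is a hypothetical standalone function written for illustration; the real method is a classmethod, and its exact error message and handling of empty mappings are assumptions here.

```python
from collections.abc import Mapping
from typing import Any, Optional


def validate_write_recipes(
    recipes: Optional[Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Accept only an absent or empty set of recipes (hypothetical sketch).

    A formatter that defines no write recipes has nothing to validate
    against, so any non-empty mapping is treated as an error.
    """
    if not recipes:
        # None or an empty mapping: nothing to validate, pass it through.
        return recipes
    names = ", ".join(sorted(recipes))
    raise RuntimeError(f"This formatter does not understand write recipes: {names}")


passthrough = validate_write_recipes(None)
```

A formatter that does support recipes would presumably override this to check each recipe and fill in defaults, as the description above suggests.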