ParquetFormatter
-
class lsst.daf.butler.formatters.parquet.ParquetFormatter(fileDescriptor: FileDescriptor, dataId: Optional[DataCoordinate] = None, writeParameters: Optional[Dict[str, Any]] = None, writeRecipes: Optional[Dict[str, Any]] = None)
Bases: lsst.daf.butler.Formatter
Interface for reading and writing Pandas DataFrames to and from Parquet files.
This formatter is for the DataFrame StorageClass.
Attributes Summary
- dataId: DataId associated with this formatter (DataCoordinate)
- extension
- fileDescriptor: FileDescriptor associated with this formatter (FileDescriptor, read-only)
- supportedExtensions
- supportedWriteParameters
- unsupportedParameters
- writeParameters: Parameters to use when writing out datasets.
- writeRecipes: Detailed write Recipes indexed by recipe name.
Methods Summary
- can_read_bytes(): Indicate if this formatter can format from bytes.
- fromBytes(serializedDataset, component): Reads serialized data into a Dataset or its component.
- makeUpdatedLocation(location): Return a new Location instance updated with this formatter’s extension.
- name(): Returns the fully qualified name of the formatter.
- predictPath(): Return the path that would be returned by write, without actually writing.
- read(component): Read a Dataset.
- segregateParameters(parameters): Segregate the supplied parameters into those understood by the formatter and those not understood by the formatter.
- toBytes(inMemoryDataset): Serialize the Dataset to bytes based on formatter.
- validateExtension(location): Check that the provided location refers to a file extension that is understood by this formatter.
- validateWriteRecipes(recipes): Validate supplied recipes for this formatter.
- write(inMemoryDataset): Write a Dataset.
Attributes Documentation
-
dataId
DataId associated with this formatter (DataCoordinate).
-
extension = '.parq'
-
fileDescriptor
FileDescriptor associated with this formatter (FileDescriptor, read-only).
-
supportedExtensions = frozenset()
-
supportedWriteParameters = None
-
unsupportedParameters = frozenset()
-
writeParameters
Parameters to use when writing out datasets.
-
writeRecipes
Detailed write Recipes indexed by recipe name.
Methods Documentation
-
classmethod can_read_bytes() → bool
Indicate if this formatter can format from bytes.
Returns:
-
fromBytes(serializedDataset: bytes, component: Optional[str] = None) → object
Reads serialized data into a Dataset or its component.
Parameters:
- serializedDataset : bytes
- component : str, optional
Returns:
- inMemoryDataset : object
  The requested data as a Python object. The type of object is controlled by the specific formatter.
-
makeUpdatedLocation(location: lsst.daf.butler.core.location.Location) → lsst.daf.butler.core.location.Location
Return a new Location instance updated with this formatter’s extension.
Parameters:
- location : Location
  The location to update.
Returns:
- updated : Location
  A new Location with a new file extension applied.
Raises:
- NotImplementedError
  Raised if there is no extension attribute associated with this formatter.
Notes
This method is available to all Formatters but might not be implemented by all formatters. It requires that a formatter set an extension attribute containing the file extension used when writing files. If extension is None, the supplied file will not be updated. Not all formatters write files, so this is not defined in the base class.
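The extension handling described in these notes can be sketched roughly as follows. This is an illustration only, not the butler implementation: SketchFormatter, NoExtensionFormatter, and make_updated_path are hypothetical names, and the location is modelled as a plain path string rather than a Location object.

```python
from pathlib import PurePosixPath
from typing import Optional


class SketchFormatter:
    """Hypothetical stand-in for a formatter; not the butler class."""

    # File extension used when writing; None means "leave the path alone".
    extension: Optional[str] = ".parq"

    def make_updated_path(self, path: str) -> str:
        """Return a new path string with this formatter's extension applied."""
        if not hasattr(self, "extension"):
            # Mirrors the documented NotImplementedError case.
            raise NotImplementedError("No extension attribute on this formatter")
        if self.extension is None:
            # Documented behaviour: the supplied file is not updated.
            return path
        return str(PurePosixPath(path).with_suffix(self.extension))


class NoExtensionFormatter(SketchFormatter):
    """Hypothetical formatter whose extension is None."""

    extension = None


updated = SketchFormatter().make_updated_path("run1/catalog.tmp")
unchanged = NoExtensionFormatter().make_updated_path("run1/catalog.tmp")
```

The sketch returns a new string in both cases and never mutates its input, matching the "return a new Location" wording above.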
-
classmethod name() → str
Returns the fully qualified name of the formatter.
Returns:
- name : str
  Fully-qualified name of formatter class.
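As a rough illustration of what a fully qualified class name looks like, the sketch below builds one from the module path and the class name. SketchFormatter is a hypothetical class, and the assumption that "fully qualified" means module plus qualified class name is ours; the butler code may construct the string differently.

```python
class SketchFormatter:
    """Hypothetical class used only to illustrate the naming scheme."""

    @classmethod
    def name(cls) -> str:
        # Assumption: "fully qualified" means module path plus class name.
        return f"{cls.__module__}.{cls.__qualname__}"


full_name = SketchFormatter.name()
```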
-
predictPath() → str
Return the path that would be returned by write, without actually writing.
Uses the FileDescriptor associated with the instance.
Returns:
- path : str
  Path within datastore that would be associated with the location stored in this Formatter.
-
read(component: Optional[str] = None) → Any
Read a Dataset.
Parameters:
- component : str, optional
  Component to read from the file. Only used if the StorageClass for reading differed from the StorageClass used to write the file.
Returns:
- inMemoryDataset : object
  The requested Dataset.
-
segregateParameters(parameters: Optional[Dict[str, Any]] = None) → Tuple[Dict[KT, VT], Dict[KT, VT]]
Segregate the supplied parameters into those understood by the formatter and those not understood by the formatter.
Any unsupported parameters are assumed to be usable by associated assemblers.
Parameters:
- parameters : dict, optional
Returns:
- tuple of two dict objects
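The idea of segregating parameters can be sketched with a small standalone function. Everything here is an assumption made for illustration: the function name segregate_parameters, the use of a set of unsupported names to drive the split, the example parameter names, and the (supported, unsupported) ordering of the returned tuple.

```python
from typing import Any, Dict, FrozenSet, Optional, Tuple


def segregate_parameters(
    parameters: Optional[Dict[str, Any]],
    unsupported_names: FrozenSet[str],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split parameters into (supported, unsupported) dictionaries."""
    if parameters is None:
        return {}, {}
    supported = {k: v for k, v in parameters.items() if k not in unsupported_names}
    unsupported = {k: v for k, v in parameters.items() if k in unsupported_names}
    return supported, unsupported


# Hypothetical parameter names, chosen only for the example.
supported, unsupported = segregate_parameters(
    {"columns": ["ra", "dec"], "bbox": None}, frozenset({"bbox"})
)
```

With an empty set of unsupported names, as in the unsupportedParameters value shown for this class, the sketch would place every parameter in the first dictionary.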
-
toBytes(inMemoryDataset: Any) → bytes
Serialize the Dataset to bytes based on formatter.
Parameters:
- inMemoryDataset : object
  The Python object to serialize.
Returns:
- serializedDataset : bytes
  Bytes representing the serialized dataset.
-
classmethod validateExtension(location: lsst.daf.butler.core.location.Location) → None
Check that the provided location refers to a file extension that is understood by this formatter.
Parameters:
- location : Location
  Location from which to extract a file extension.
Raises:
- NotImplementedError
  Raised if file extensions are a concept not understood by this formatter.
- ValueError
  Raised if the formatter does not understand this extension.
Notes
This method is available to all Formatters but might not be implemented by all formatters. It requires that a formatter set an extension attribute containing the file extension used when writing files. If extension is None, only the set of supported extensions will be examined.
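A minimal sketch of the check described above, assuming the location is a plain path string and that only the final suffix is compared. The function validate_extension and its arguments are hypothetical and are not part of the butler API.

```python
from pathlib import PurePosixPath
from typing import FrozenSet, Optional


def validate_extension(
    path: str,
    extension: Optional[str],
    supported_extensions: FrozenSet[str] = frozenset(),
) -> None:
    """Raise ValueError if the path's extension is not understood."""
    allowed = set(supported_extensions)
    if extension is not None:
        # The primary extension is accepted alongside the supported set.
        allowed.add(extension)
    # Simplification: only the last suffix is examined, so compound
    # extensions such as ".fits.gz" are not handled by this sketch.
    suffix = PurePosixPath(path).suffix
    if suffix not in allowed:
        raise ValueError(f"Extension {suffix!r} not in {sorted(allowed)}")


validate_extension("run1/catalog.parq", ".parq")  # passes silently

try:
    validate_extension("run1/catalog.csv", ".parq")
except ValueError:
    rejected = True
else:
    rejected = False
```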
-
classmethod validateWriteRecipes(recipes: Optional[Mapping[str, Any]]) → Optional[Mapping[str, Any]]
Validate supplied recipes for this formatter.
The recipes are supplemented with default values where appropriate.
Parameters:
- recipes : dict
  Recipes to validate.
Returns:
- validated : dict
  Validated recipes.
Raises:
- RuntimeError
  Raised if validation fails. The default implementation raises if any recipes are given.
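The documented default behaviour, raising whenever any recipes are supplied, might look something like the sketch below. The function validate_write_recipes and the example recipe contents are hypothetical; a formatter that supports recipes would instead check them and fill in defaults.

```python
from typing import Any, Mapping, Optional


def validate_write_recipes(
    recipes: Optional[Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Accept only an empty or missing set of recipes."""
    if not recipes:
        # Nothing to validate; hand back whatever was supplied.
        return recipes
    raise RuntimeError(
        f"This formatter does not accept write recipes: {list(recipes)}"
    )


no_recipes = validate_write_recipes(None)

try:
    # Hypothetical recipe, used only to trigger the error path.
    validate_write_recipes({"lossless": {"compression": "gzip"}})
except RuntimeError:
    raised = True
else:
    raised = False
```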
-