DatasetType

class lsst.daf.butler.DatasetType(name: str, dimensions: Union[DimensionGraph, Iterable[Dimension]], storageClass: Union[StorageClass, str], parentStorageClass: Optional[Union[StorageClass, str]] = None, *, universe: Optional[DimensionUniverse] = None, isCalibration: bool = False)

Bases: object

A named category of Datasets that defines how they are organized, related, and stored.

A concrete, final class whose instances represent DatasetTypes. DatasetType instances may be constructed without a Registry, but they must be registered via Registry.registerDatasetType() before corresponding Datasets may be added. DatasetType instances are immutable.

Parameters:
name : str

A string name for the Dataset; must correspond to the same DatasetType across all Registries. Names must start with an uppercase or lowercase letter and may contain only letters, numbers, and underscores. The name of a component dataset type should use a single period to separate the base dataset type name from the component name; components may be nested, so a name may contain more than one period (for example, a.b.c).

dimensions : DimensionGraph or iterable of Dimension

Dimensions used to label and relate instances of this DatasetType. If not a DimensionGraph, universe must be provided as well.

storageClass : StorageClass or str

Instance of a StorageClass or name of StorageClass that defines how this DatasetType is persisted.

parentStorageClass : StorageClass or str, optional

Instance of a StorageClass or name of StorageClass that defines how the composite parent is persisted. Must be None if this is not a component. Mandatory if it is a component but can be the special temporary placeholder (DatasetType.PlaceholderParentStorageClass) to allow construction with an intent to finalize later.

universe : DimensionUniverse, optional

Set of all known dimensions, used to normalize dimensions if it is not already a DimensionGraph.

isCalibration : bool, optional

If True, this dataset type may be included in CALIBRATION collections.
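
As a rough illustration of how these parameters fit together, the sketch below constructs a DatasetType without a Registry. It rests on several assumptions: that the LSST stack is importable, that DimensionUniverse() can be built from its default configuration, that dimension names are accepted in place of Dimension instances, and that the names instrument, visit, and detector exist in that universe. The dataset type name calexp and the storage class name ExposureF are illustrative only, and build_example_dataset_type is a hypothetical helper, not part of the API.

```python
def build_example_dataset_type():
    """Build an illustrative DatasetType (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the stack.
    from lsst.daf.butler import DatasetType, DimensionUniverse

    # Assumption: the default dimension configuration is available.
    universe = DimensionUniverse()

    # dimensions is a plain iterable here rather than a DimensionGraph,
    # so universe must be supplied as well.
    return DatasetType(
        "calexp",
        dimensions=("instrument", "visit", "detector"),
        storageClass="ExposureF",
        universe=universe,
    )
```

The returned instance would still have to be passed to Registry.registerDatasetType() before corresponding Datasets could be added.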

Attributes Summary

PlaceholderParentStorageClass Placeholder StorageClass that can be used temporarily for a component.
VALID_NAME_REGEX
dimensions The Dimensions that label and relate instances of this DatasetType (DimensionGraph).
name A string name for the Dataset; must correspond to the same DatasetType across all Registries.
parentStorageClass StorageClass instance that defines how the composite associated with this DatasetType is persisted.
storageClass StorageClass instance that defines how this DatasetType is persisted.

Methods Summary

component() Component name (if defined).
componentTypeName(component) Given a component name, derive the datasetTypeName of that component.
finalizeParentStorageClass(newParent) Replace the current placeholder parent storage class with the real parent.
from_json(json_str, universe, registry) Convert a JSON string created by to_json and return something of the supplied class.
from_simple(simple, universe, registry) Construct a new object from the data returned from the to_simple method.
isCalibration() Return whether datasets of this type may be included in calibration collections.
isComponent() Boolean indicating whether this DatasetType refers to a component of a composite.
isComposite() Boolean indicating whether this DatasetType is a composite type.
makeAllComponentDatasetTypes() Return all the component dataset types associated with this dataset type.
makeComponentDatasetType(component) Return a DatasetType suitable for the given component, assuming the same dimensions as the parent.
makeCompositeDatasetType() Return a DatasetType suitable for the composite version of this component dataset type.
nameAndComponent() Return the root name of this dataset type and the component name (if defined).
nameWithComponent(datasetTypeName, componentName) Form a valid DatasetTypeName from a parent and component.
splitDatasetTypeName(datasetTypeName) Given a dataset type name, return the root name and the component name.
to_json(minimal) Convert this class to JSON form.
to_simple(minimal) Convert this class to a simple Python type suitable for serialization.

Attributes Documentation

PlaceholderParentStorageClass = StorageClass('PlaceHolder')

Placeholder StorageClass that can be used temporarily for a component.

This can be useful in pipeline construction where we are creating dataset types without a registry.

VALID_NAME_REGEX = re.compile('^[a-zA-Z][a-zA-Z0-9_]*(\\.[a-zA-Z][a-zA-Z0-9_]*)*$')
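
To make the naming rule concrete, the following sketch applies the same pattern to a few candidate names using only the standard library. The helper is_valid_name and the sample names are hypothetical and chosen for illustration; the pattern itself is copied from the value shown above.

```python
import re

# Pattern copied from VALID_NAME_REGEX above, written as a raw string.
VALID_NAME_REGEX = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]*(\.[a-zA-Z][a-zA-Z0-9_]*)*$")


def is_valid_name(name: str) -> bool:
    """Return True if name matches the pattern (hypothetical helper)."""
    return VALID_NAME_REGEX.match(name) is not None


# Illustrative names only: the first three should match, the rest should not.
candidates = ["calexp", "calexp.wcs", "a.b.c", "1calexp", "calexp.", "_raw", "deep-coadd"]
results = {name: is_valid_name(name) for name in candidates}
```

Names that start with a digit or underscore, end with a period, or contain characters such as a hyphen do not match.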
dimensions

The Dimensions that label and relate instances of this DatasetType (DimensionGraph).

name

A string name for the Dataset; must correspond to the same DatasetType across all Registries.

parentStorageClass

StorageClass instance that defines how the composite associated with this DatasetType is persisted.

If the DatasetType was constructed with the name of a StorageClass, Butler must be initialized before this property is used. Can be None if this is not a component of a composite. Must be defined if this is a component.

storageClass

StorageClass instance that defines how this DatasetType is persisted. If the DatasetType was constructed with the name of a StorageClass, Butler must be initialized before this property is used.

Methods Documentation

component() → Optional[str]

Component name (if defined).

Returns:
comp : str or None

Name of the component part of the DatasetType name. None if this DatasetType is not associated with a component.

componentTypeName(component: str) → str

Given a component name, derive the datasetTypeName of that component.

Parameters:
component : str

Name of component.

Returns:
derived : str

Compound name of this DatasetType and the component.

Raises:
KeyError

Requested component is not supported by this DatasetType.

finalizeParentStorageClass(newParent: lsst.daf.butler.core.storageClass.StorageClass) → None

Replace the current placeholder parent storage class with the real parent.

Parameters:
newParent : StorageClass

The new parent to be associated with this composite dataset type. This replaces the temporary placeholder parent that was specified during construction.

Raises:
ValueError

Raised if this dataset type is not a component of a composite, if a StorageClass is not given, or if the parent currently associated with the dataset type is not a placeholder.

classmethod from_json(json_str: str, universe: Optional[DimensionUniverse] = None, registry: Optional[Registry] = None) → SupportsSimple

Convert a JSON string created by to_json and return something of the supplied class.

Parameters:
json_str : str

Representation of the dimensions in JSON format as created by to_json().

universe : DimensionUniverse, optional

The special graph of all known dimensions. Passed directly to from_simple().

registry : lsst.daf.butler.Registry, optional

Registry to use to convert simple name of a DatasetType to a full DatasetType. Passed directly to from_simple().

Returns:
constructed : Any

Newly-constructed object.

classmethod from_simple(simple: Union[Dict, str], universe: Optional[DimensionUniverse] = None, registry: Optional[Registry] = None) → DatasetType

Construct a new object from the data returned from the to_simple method.

Parameters:
simple : dict of [str, Any] or str

The value returned by to_simple().

universe : DimensionUniverse

The special graph of all known dimensions of which this graph will be a subset. Can be None if a registry is provided.

registry : lsst.daf.butler.Registry, optional

Registry to use to convert simple name of a DatasetType to a full DatasetType. Can be None if a full description of the type is provided along with a universe.

Returns:
datasetType : DatasetType

Newly-constructed object.

isCalibration() → bool

Return whether datasets of this type may be included in calibration collections.

Returns:
flag : bool

True if datasets of this type may be included in calibration collections.

isComponent() → bool

Boolean indicating whether this DatasetType refers to a component of a composite.

Returns:
isComponent : bool

True if this DatasetType is a component, False otherwise.

isComposite() → bool

Boolean indicating whether this DatasetType is a composite type.

Returns:
isComposite : bool

True if this DatasetType is a composite type, False otherwise.

makeAllComponentDatasetTypes() → List[lsst.daf.butler.core.datasets.type.DatasetType]

Return all the component dataset types associated with this dataset type.

Returns:
all : list of DatasetType

All the component dataset types. If this is not a composite then returns an empty list.

makeComponentDatasetType(component: str) → lsst.daf.butler.core.datasets.type.DatasetType

Return a DatasetType suitable for the given component, assuming the same dimensions as the parent.

Parameters:
component : str

Name of component.

Returns:
datasetType : DatasetType

A new DatasetType instance.

makeCompositeDatasetType() → lsst.daf.butler.core.datasets.type.DatasetType

Return a DatasetType suitable for the composite version of this component dataset type.

Returns:
composite : DatasetType

The composite dataset type.

Raises:
RuntimeError

Raised if this dataset type is not a component dataset type.

nameAndComponent() → Tuple[str, Optional[str]]

Return the root name of this dataset type and the component name (if defined).

Returns:
rootName : str

Root name for this DatasetType without any components.

componentName : str or None

The component if it has been specified, else None.

static nameWithComponent(datasetTypeName: str, componentName: str) → str

Form a valid DatasetTypeName from a parent and component.

No validation is performed.

Parameters:
datasetTypeName : str

Base type name.

componentName : str

Name of component.

Returns:
compTypeName : str

Name to use for component DatasetType.
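
A minimal sketch of the joining rule, assuming the period separator described for the name parameter. The function name_with_component is a hypothetical stand-in written for illustration and is not the library implementation; like the documented method, it does not check its inputs.

```python
def name_with_component(dataset_type_name: str, component_name: str) -> str:
    """Join a parent name and a component name with a period.

    Hypothetical stand-in for DatasetType.nameWithComponent; it performs
    no validation of either argument.
    """
    return f"{dataset_type_name}.{component_name}"


# Illustrative names only.
component_type_name = name_with_component("calexp", "wcs")

# Because nothing is validated, an invalid parent name passes straight through.
unchecked_name = name_with_component("1bad", "wcs")
```

Since no validation is performed, a caller that needs a guaranteed-valid name would have to check the result separately, for example against VALID_NAME_REGEX.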

static splitDatasetTypeName(datasetTypeName: str) → Tuple[str, Optional[str]]

Given a dataset type name, return the root name and the component name.

Parameters:
datasetTypeName : str

The name of the dataset type, can include a component using a “.”-separator.

Returns:
rootName : str

Root name without any components.

componentName : str or None

The component if it has been specified, else None.

Notes

If the dataset type name is a.b.c this method will return a root name of a and a component name of b.c.
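
The behavior described in the note can be sketched in plain Python as a split at the first period only. The function split_dataset_type_name is a hypothetical stand-in for illustration, not the library implementation.

```python
from typing import Optional, Tuple


def split_dataset_type_name(dataset_type_name: str) -> Tuple[str, Optional[str]]:
    """Split a dataset type name into root and component (hypothetical stand-in)."""
    # Split at the first period only; everything after it is the component.
    root, separator, component = dataset_type_name.partition(".")
    return root, (component if separator else None)


nested = split_dataset_type_name("a.b.c")
plain = split_dataset_type_name("calexp")
```

Here nested is ("a", "b.c") and plain is ("calexp", None), matching the note and the documented return values.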

to_json(minimal: bool = False) → str

Convert this class to JSON form.

The class type is not recorded in the JSON so the JSON decoder must know which class is represented.

Parameters:
minimal : bool, optional

Use minimal serialization. Requires Registry to convert back to a full type.

Returns:
json : str

The class in JSON string format.

to_simple(minimal: bool = False) → Union[Dict, str]

Convert this class to a simple Python type suitable for serialization.

Parameters:
minimal : bool, optional

Use minimal serialization. Requires Registry to convert back to a full type.

Returns:
simple : dict or str

The object converted to a dictionary or a simple string.
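
The relationship between to_simple, to_json, from_simple, and from_json can be sketched with a toy class. Everything below is hypothetical: ToyDatasetType is not the lsst.daf.butler class, the dictionary keys are invented for illustration, and a plain dict stands in for a Registry. It only shows the pattern described above, in which the JSON carries no class information and the minimal form needs a lookup to be expanded again.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class ToyDatasetType:
    """Hypothetical stand-in; not the lsst.daf.butler class."""

    name: str
    storageClass: str

    def to_simple(self, minimal: bool = False) -> Union[Dict[str, str], str]:
        if minimal:
            # Minimal form: only the name, so a lookup is needed later.
            return self.name
        return {"name": self.name, "storageClass": self.storageClass}

    def to_json(self, minimal: bool = False) -> str:
        # The class type is not written to the JSON.
        return json.dumps(self.to_simple(minimal=minimal))

    @classmethod
    def from_simple(cls, simple, registry: Optional[Dict[str, "ToyDatasetType"]] = None):
        if isinstance(simple, str):
            if registry is None:
                raise ValueError("A registry is required to expand a bare name.")
            return registry[simple]
        return cls(**simple)

    @classmethod
    def from_json(cls, json_str: str, registry=None):
        # The caller chooses the class; the JSON alone does not identify it.
        return cls.from_simple(json.loads(json_str), registry=registry)


calexp = ToyDatasetType("calexp", "ExposureF")
toy_registry = {calexp.name: calexp}  # stand-in for a Registry

full_round_trip = ToyDatasetType.from_json(calexp.to_json())
minimal_json = calexp.to_json(minimal=True)
minimal_round_trip = ToyDatasetType.from_json(minimal_json, registry=toy_registry)
```

In this toy, the full form round-trips on its own, while the minimal form round-trips only when the lookup table is supplied.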