Butler
- class lsst.daf.butler.Butler(config: Config | str | ParseResult | ResourcePath | Path | None = None, *, collections: Any = None, run: str | None = None, searchPaths: Sequence[str | ParseResult | ResourcePath | Path] | None = None, writeable: bool | None = None, inferDefaults: bool = True, without_datastore: bool = False, **kwargs: Any)
- Bases: LimitedButler
- Interface for data butler and factory for Butler instances.
- Parameters:
- config : ButlerConfig, Config or str, optional
- Configuration. Anything acceptable to the ButlerConfig constructor. If a directory path is given, the configuration will be read from a butler.yaml file in that location. If None is given, default values will be used. If config contains a “cls” key, its value is used as the name of the butler class, which must be a subclass of this class; otherwise DirectButler is instantiated.
- collections : str or Iterable[str], optional
- An expression specifying the collections to be searched (in order) when reading datasets. This may be a str collection name or an iterable thereof. See Collection expressions for more information. These collections are not registered automatically and must be manually registered before they are used by any method, but they may be manually registered after the Butler is initialized.
- run : str, optional
- Name of the RUN collection new datasets should be inserted into. If collections is None and run is not None, collections will be set to [run]. If not None, this collection will automatically be registered. If this is not set (and writeable is not set either), a read-only butler will be created.
- searchPaths : list of str, optional
- Directory paths to search when calculating the full Butler configuration. Not used if the supplied config is already a ButlerConfig.
- writeable : bool, optional
- Explicitly sets whether the butler supports write operations. If not provided, a read-write butler is created if any of run, tags, or chains is non-empty.
- inferDefaults : bool, optional
- If True (default), infer default data ID values from the values present in the datasets in collections: if all collections have the same value (or no value) for a governor dimension, that value will be the default for that dimension. Nonexistent collections are ignored. If a default value is provided explicitly for a governor dimension via **kwargs, no default will be inferred for that dimension.
- without_datastore : bool, optional
- If True, do not attach a datastore to this butler. Any attempts to use a datastore will fail.
- **kwargs : Any
- Additional keyword arguments passed to the constructor of the actual butler class.
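The interplay between collections, run and writeable described in the parameter list can be sketched without importing lsst.daf.butler. The helper below, resolve_defaults, is hypothetical and only mirrors the documented rules; it is a simplification (it looks at run alone when guessing writeability) and is not the library's implementation.

```python
from __future__ import annotations

from collections.abc import Iterable


def resolve_defaults(
    collections: str | Iterable[str] | None = None,
    run: str | None = None,
    writeable: bool | None = None,
) -> tuple[tuple[str, ...], str | None, bool]:
    """Hypothetical helper mirroring the documented defaulting rules."""
    if collections is None:
        # Documented rule: collections falls back to [run] when only run is given.
        resolved = (run,) if run is not None else ()
    elif isinstance(collections, str):
        # A single collection name is treated as a one-element search path.
        resolved = (collections,)
    else:
        resolved = tuple(collections)
    if writeable is None:
        # Simplifying assumption: a run implies read-write, otherwise read-only.
        writeable = run is not None
    return resolved, run, writeable


print(resolve_defaults(run="u/alice/DM-50000/a"))
print(resolve_defaults(collections="HSC/defaults"))
```

Under these assumptions, passing only run yields a read-write result that also searches that run, while passing only collections yields a read-only result.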
 
- Notes
- The preferred way to instantiate Butler is via the from_config method. The call to Butler(...) is equivalent to Butler.from_config(...), but mypy will complain about the former.
- Attributes Summary
- collection_chains: Object with methods for modifying collection chains (ButlerCollections).
- collections: The collections to search by default, in order (Sequence[str]).
- registry: The object that manages dataset metadata and relationships (Registry).
- run: Name of the run this butler writes outputs to by default (str or None).
- Methods Summary
- exists(dataset_ref_or_type, /[, data_id, ...]): Indicate whether a dataset is known to Butler registry and datastore.
- export(*[, directory, filename, format, ...]): Export datasets from the repository represented by this Butler.
- find_dataset(dataset_type[, data_id, ...]): Find a dataset given its DatasetType and data ID.
- from_config([config, collections, run, ...]): Create butler instance from configuration.
- get(datasetRefOrType, /[, dataId, ...]): Retrieve a stored dataset.
- getDeferred(datasetRefOrType, /[, dataId, ...]): Create a DeferredDatasetHandle which can later retrieve a dataset, after an immediate registry lookup.
- getURI(datasetRefOrType, /[, dataId, ...]): Return the URI to the Dataset.
- getURIs(datasetRefOrType, /[, dataId, ...]): Return the URIs associated with the dataset.
- get_dataset(id, *[, storage_class, ...]): Retrieve a Dataset entry.
- get_dataset_type(name): Get the DatasetType.
- get_known_repos(): Retrieve the list of known repository labels.
- get_repo_uri(label[, return_label]): Look up the label in a butler repository index.
- import_(*[, directory, filename, format, ...]): Import datasets into this repository that were exported from a different butler repository via export.
- ingest(*datasets[, transfer, ...]): Store and register one or more datasets that already exist on disk.
- makeRepo(root[, config, dimensionConfig, ...]): Create an empty data repository by adding a butler.yaml config to a repository root directory.
- put(obj, datasetRefOrType, /[, dataId, run]): Store and register a dataset.
- removeRuns(names[, unstore]): Remove one or more RUN collections and the datasets within them.
- retrieveArtifacts(refs, destination[, ...]): Retrieve the artifacts associated with the supplied refs.
- Context manager supporting Butler transactions.
- Transfer dimension records to this Butler from another Butler.
- transfer_from(source_butler, source_refs[, ...]): Transfer datasets to this Butler from a run in another Butler.
- validateConfiguration([logFailures, ...]): Validate butler configuration.
- Attributes Documentation
- collection_chains
- Object with methods for modifying collection chains (ButlerCollections). Use of this object is preferred over registry wherever possible.
- registry
- The object that manages dataset metadata and relationships (Registry). Many operations that don’t involve reading or writing butler datasets are accessible only via Registry methods. Eventually these methods will be replaced by equivalent Butler methods.
- Methods Documentation
- abstract exists(dataset_ref_or_type: DatasetRef | DatasetType | str, /, data_id: DataId | None = None, *, full_check: bool = True, collections: Any = None, **kwargs: Any) -> DatasetExistence
- Indicate whether a dataset is known to Butler registry and datastore.
- Parameters:
- dataset_ref_or_type : DatasetRef, DatasetType, or str
- When a DatasetRef is given, data_id should be None. Otherwise the DatasetType or name thereof.
- data_id : dict or DataCoordinate
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- full_check : bool, optional
- If True, a check will be made for the actual existence of a dataset artifact. This will involve additional overhead due to the need to query an external system. If False, this check will be omitted: the registry and datastore will only be asked whether they know about the dataset, and no direct check for the artifact will be performed.
- collections : Any, optional
- Collections to be searched, overriding self.collections. Can be any of the types supported by the collections argument to butler construction.
- **kwargs
- Additional keyword arguments used to augment or construct a DataCoordinate. See DataCoordinate.standardize parameters.
- Returns:
- existence : DatasetExistence
- Object indicating whether the dataset is known to registry and datastore. Evaluates to True if the dataset is present and known to both.
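The truthiness rule for the returned object ("True if the dataset is present and known to both") can be illustrated with a small flag enumeration. The Existence class below is a hypothetical stand-in written for this sketch; its member names and its __bool__ rule are assumptions and do not reproduce the real DatasetExistence class.

```python
import enum


class Existence(enum.Flag):
    """Hypothetical stand-in for DatasetExistence (not the real class)."""

    UNRECOGNIZED = 0
    RECORDED = enum.auto()   # the registry knows the dataset
    DATASTORE = enum.auto()  # the datastore knows the dataset

    def __bool__(self) -> bool:
        # Truthy only when both the registry and the datastore know the dataset.
        both = Existence.RECORDED.value | Existence.DATASTORE.value
        return (self.value & both) == both


registry_only = Existence.RECORDED
known_to_both = Existence.RECORDED | Existence.DATASTORE
print(bool(registry_only), bool(known_to_both))
```

A flag-style return value lets callers use a plain `if butler.exists(...)` test while still being able to inspect which system knows the dataset when the check fails.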
 
 
- abstract export(*, directory: str | None = None, filename: str | None = None, format: str | None = None, transfer: str | None = None) -> AbstractContextManager[RepoExportContext]
- Export datasets from the repository represented by this Butler.
- This method is a context manager that returns a helper object (RepoExportContext) that is used to indicate what information from the repository should be exported.
- Parameters:
- directory : str, optional
- Directory dataset files should be written to if transfer is not None.
- filename : str, optional
- Name for the file that will include database information associated with the exported datasets. If this is not an absolute path and directory is not None, it will be written to directory instead of the current working directory. Defaults to “export.{format}”.
- format : str, optional
- File format for the database information file. If None, the extension of filename will be used.
- transfer : str, optional
- Transfer mode passed to Datastore.export.
 
- Raises:
- TypeError
- Raised if the set of arguments passed is inconsistent. 
 
- Examples
- Typically the Registry.queryDataIds and Registry.queryDatasets methods are used to provide the iterables over data IDs and/or datasets to be exported:

    # All arguments to export() are keyword-only, so filename is passed by name.
    with butler.export(filename="exports.yaml") as export:
        # Export all flats, but none of the dimension element rows
        # (i.e. data ID information) associated with them.
        export.saveDatasets(butler.registry.queryDatasets("flat"), elements=())
        # Export all datasets that start with "deepCoadd_" and all of
        # their associated data ID information.
        export.saveDatasets(butler.registry.queryDatasets("deepCoadd_*"))
- abstract find_dataset(dataset_type: DatasetType | str, data_id: DataId | None = None, *, collections: str | Sequence[str] | None = None, timespan: Timespan | None = None, storage_class: str | StorageClass | None = None, dimension_records: bool = False, datastore_records: bool = False, **kwargs: Any) -> DatasetRef | None
- Find a dataset given its DatasetType and data ID.
- This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore. If the dataset is a component and cannot be found using the provided dataset type, a dataset ref for the parent will be returned instead but with the correct dataset type.
- Parameters:
- dataset_type : DatasetType or str
- A DatasetType or the name of one. If this is a DatasetType instance, its storage class will be respected and propagated to the output, even if it differs from the dataset type definition in the registry, as long as the storage classes are convertible.
- data_id : dict or DataCoordinate, optional
- A dict-like object containing the Dimension links that identify the dataset within a collection. If it is a dict, the data ID can include dimension record values such as day_obs and seq_num or full_name that can be used to derive the primary dimension.
- collections : str or list[str], optional
- An ordered list of collections to search for the dataset. Defaults to self.defaults.collections.
- timespan : Timespan, optional
- A timespan that the validity range of the dataset must overlap. If not provided, any CALIBRATION collections matched by the collections argument will not be searched.
- storage_class : str or StorageClass or None
- A storage class to use when creating the returned entry. If given, it must be compatible with the default storage class.
- dimension_records : bool, optional
- If True, the ref will be expanded and contain dimension records.
- datastore_records : bool, optional
- If True, the ref will contain associated datastore records.
- **kwargs
- Additional keyword arguments passed to DataCoordinate.standardize to convert data_id to a true DataCoordinate or augment an existing one. This can also include dimension record metadata that can be used to derive a primary dimension value.
 
- Returns:
- ref : DatasetRef or None
- A reference to the dataset, or None if no matching Dataset was found.
- Raises:
- lsst.daf.butler.NoDefaultCollectionError
- LookupError
- Raised if one or more data ID keys are missing. 
- lsst.daf.butler.MissingDatasetTypeError
- Raised if the dataset type does not exist. 
- lsst.daf.butler.MissingCollectionError
- Raised if any of collections does not exist in the registry.
 
- Notes
- This method simply returns None and does not raise an exception even when the set of collections searched is intrinsically incompatible with the dataset type, e.g. if datasetType.isCalibration() is False, but only CALIBRATION collections are being searched. This may make it harder to debug some lookup failures, but the behavior is intentional; we consider it more important that failed searches are reported consistently, regardless of the reason, and that adding additional collections that do not contain a match to the search path never changes the behavior.
- This method handles component dataset types automatically, though most other query operations do not.
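The ordered-search behavior described in these notes can be sketched with plain dictionaries. The function find_first and the data layout below are hypothetical and stand in for the registry; they only illustrate that the first matching collection wins, that a failed search yields None, and that appending a collection without a match does not change the result.

```python
from __future__ import annotations


def find_first(
    collections: list[str],
    contents: dict[str, dict[tuple, str]],
    key: tuple,
) -> str | None:
    """Return the first match for key, searching collections in order."""
    for name in collections:
        # contents[name] raises KeyError for an unknown collection, loosely
        # analogous to the documented MissingCollectionError.
        match = contents[name].get(key)
        if match is not None:
            return match
    # Every failed search is reported the same way, whatever the reason.
    return None


contents = {
    "run/a": {("flat", 1): "ref-a"},
    "run/b": {("flat", 1): "ref-b"},
    "calib": {},
}
print(find_first(["run/a", "run/b"], contents, ("flat", 1)))
print(find_first(["calib"], contents, ("flat", 1)))
```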
- classmethod from_config(config: Config | str | ParseResult | ResourcePath | Path | None = None, *, collections: Any = None, run: str | None = None, searchPaths: Sequence[str | ParseResult | ResourcePath | Path] | None = None, writeable: bool | None = None, inferDefaults: bool = True, without_datastore: bool = False, **kwargs: Any) -> Butler
- Create butler instance from configuration.
- Parameters:
- config : ButlerConfig, Config or str, optional
- Configuration. Anything acceptable to the ButlerConfig constructor. If a directory path is given, the configuration will be read from a butler.yaml file in that location. If None is given, default values will be used. If config contains a “cls” key, its value is used as the name of the butler class, which must be a subclass of this class; otherwise DirectButler is instantiated.
- collections : str or Iterable[str], optional
- An expression specifying the collections to be searched (in order) when reading datasets. This may be a str collection name or an iterable thereof. See Collection expressions for more information. These collections are not registered automatically and must be manually registered before they are used by any method, but they may be manually registered after the Butler is initialized.
- run : str, optional
- Name of the RUN collection new datasets should be inserted into. If collections is None and run is not None, collections will be set to [run]. If not None, this collection will automatically be registered. If this is not set (and writeable is not set either), a read-only butler will be created.
- searchPaths : list of str, optional
- Directory paths to search when calculating the full Butler configuration. Not used if the supplied config is already a ButlerConfig.
- writeable : bool, optional
- Explicitly sets whether the butler supports write operations. If not provided, a read-write butler is created if any of run, tags, or chains is non-empty.
- inferDefaults : bool, optional
- If True (default), infer default data ID values from the values present in the datasets in collections: if all collections have the same value (or no value) for a governor dimension, that value will be the default for that dimension. Nonexistent collections are ignored. If a default value is provided explicitly for a governor dimension via **kwargs, no default will be inferred for that dimension.
- without_datastore : bool, optional
- If True, do not attach a datastore to this butler. Any attempts to use a datastore will fail.
- **kwargs : Any
- Default data ID key-value pairs. These may only identify “governor” dimensions like instrument and skymap.
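The inferDefaults rule for governor dimensions can be read as "use the value only if it is unambiguous across the searched collections". The helper below is a hypothetical sketch of that reading, using plain sets of values per collection; treating a collection with no value as contributing nothing is an interpretation of the text, not a statement about the library's code.

```python
from __future__ import annotations


def infer_governor_default(values_by_collection: dict[str, set[str]]) -> str | None:
    """Hypothetical helper: return the single shared value, or None if ambiguous."""
    seen: set[str] = set()
    for values in values_by_collection.values():
        # A collection with no value for the dimension contributes nothing.
        seen |= values
    return next(iter(seen)) if len(seen) == 1 else None


print(infer_governor_default({"a": {"HSC"}, "b": {"HSC"}, "c": set()}))
print(infer_governor_default({"a": {"HSC"}, "b": {"LSSTCam"}}))
```

With a single shared instrument the sketch returns that instrument; with two different instruments no default is inferred, and one would have to be supplied explicitly through **kwargs.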
 
- Returns:
- Notes
- Calling this factory method is identical to calling Butler(config, ...). Its only raison d’être is that mypy complains about a Butler() call.
- Examples
- While there are many ways to control exactly how a Butler interacts with the collections in its Registry, the most common cases are still simple.
- For a read-only Butler that searches one collection, do:

    butler = Butler.from_config(
        "/path/to/repo", collections=["u/alice/DM-50000"]
    )

- For a read-write Butler that writes to and reads from a RUN collection:

    butler = Butler.from_config(
        "/path/to/repo", run="u/alice/DM-50000/a"
    )

- The Butler passed to a PipelineTask is often much more complex, because we want to write to one RUN collection but read from several others as well:

    butler = Butler.from_config(
        "/path/to/repo",
        run="u/alice/DM-50000/a",
        collections=[
            "u/alice/DM-50000/a", "u/bob/DM-49998", "HSC/defaults"
        ]
    )

- This butler will put new datasets to the run u/alice/DM-50000/a. Datasets will be read first from that run (since it appears first in the chain), and then from u/bob/DM-49998 and finally HSC/defaults.
- Finally, one can always create a Butler with no collections:

    butler = Butler.from_config("/path/to/repo", writeable=True)

- This can be extremely useful when you just want to use butler.registry, e.g. for inserting dimension data or managing collections, or when the collections you want to use with the butler are not consistent. Passing writeable explicitly here is only necessary if you want to be able to make changes to the repo. Usually the value for writeable can be guessed from the collection arguments provided, but it defaults to False when there are no collection arguments.
- abstract get(datasetRefOrType: DatasetRef | DatasetType | str, /, dataId: DataId | None = None, *, parameters: dict[str, Any] | None = None, collections: Any = None, storageClass: StorageClass | str | None = None, timespan: Timespan | None = None, **kwargs: Any) -> Any
- Retrieve a stored dataset.
- Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
- When a DatasetRef is given, dataId should be None. Otherwise the DatasetType or name thereof. If a resolved DatasetRef, the associated dataset is returned directly without additional querying.
- dataId : dict or DataCoordinate
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- parameters : dict
- Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset.
- collections : Any, optional
- Collections to be searched, overriding self.collections. Can be any of the types supported by the collections argument to butler construction.
- storageClass : StorageClass or str, optional
- The storage class to be used to override the Python type returned by this method. By default the returned type matches the dataset type definition for this dataset. Specifying a read StorageClass can force a different type to be returned. This type must be compatible with the original type.
- timespan : Timespan or None, optional
- A timespan that the validity range of the dataset must overlap. If not provided and this is a calibration dataset type, an attempt will be made to find the timespan from any temporal coordinate in the data ID.
- **kwargs
- Additional keyword arguments used to augment or construct a DataCoordinate. See DataCoordinate.standardize parameters.
 
- Returns:
- obj : object
- The dataset.
- Raises:
- LookupError
- Raised if no matching dataset exists in the Registry.
- TypeError
- Raised if no collections were provided. 
 
- Notes
- When looking up datasets in a CALIBRATION collection, this method requires that the given data ID include temporal dimensions beyond the dimensions of the dataset type itself, in order to find the dataset with the appropriate validity range. For example, a “bias” dataset with native dimensions {instrument, detector} could be fetched with a {instrument, detector, exposure} data ID, because exposure is a temporal dimension.
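The role of the extra temporal dimension can be sketched as a validity-range lookup. Everything in the block below is hypothetical (the tables, the numeric times and the find_bias helper); it only illustrates why an exposure in the data ID is enough to select one bias among several certified for the same detector.

```python
from __future__ import annotations

# Hypothetical certified biases: (detector, valid_from, valid_until, name).
# Plain numbers stand in for real timestamps.
CERTIFIED_BIASES = [
    (10, 0.0, 100.0, "bias_early"),
    (10, 100.0, 200.0, "bias_late"),
]
# Hypothetical exposure records mapping exposure ID to observation time.
EXPOSURE_TIMES = {903334: 42.0, 903999: 150.0}


def find_bias(data_id: dict) -> str | None:
    """Pick the bias whose validity range contains the exposure's time."""
    when = EXPOSURE_TIMES[data_id["exposure"]]
    for detector, begin, end, name in CERTIFIED_BIASES:
        if detector == data_id["detector"] and begin <= when < end:
            return name
    return None


print(find_bias({"instrument": "HSC", "detector": 10, "exposure": 903334}))
```

Without the exposure key there would be no time to compare against the validity ranges, which is why the data ID has to carry a temporal dimension (or an explicit timespan has to be given).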
- abstract getDeferred(datasetRefOrType: DatasetRef | DatasetType | str, /, dataId: DataId | None = None, *, parameters: dict | None = None, collections: Any = None, storageClass: str | StorageClass | None = None, timespan: Timespan | None = None, **kwargs: Any) -> DeferredDatasetHandle
- Create a DeferredDatasetHandle which can later retrieve a dataset, after an immediate registry lookup.
- Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
- When a DatasetRef is given, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataCoordinate, optional
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- parameters : dict
- Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset.
- collections : Any, optional
- Collections to be searched, overriding self.collections. Can be any of the types supported by the collections argument to butler construction.
- storageClass : StorageClass or str, optional
- The storage class to be used to override the Python type returned by this method. By default the returned type matches the dataset type definition for this dataset. Specifying a read StorageClass can force a different type to be returned. This type must be compatible with the original type.
- timespan : Timespan or None, optional
- A timespan that the validity range of the dataset must overlap. If not provided and this is a calibration dataset type, an attempt will be made to find the timespan from any temporal coordinate in the data ID.
- **kwargs
- Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
 
- Returns:
- obj : DeferredDatasetHandle
- A handle which can be used to retrieve a dataset at a later time.
- Raises:
- LookupError
- Raised if no matching dataset exists in the Registry or datastore.
- ValueError
- Raised if a resolved DatasetRef was passed as an input, but it differs from the one found in the registry.
- TypeError
- Raised if no collections were provided. 
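The split between an immediate lookup and a deferred read can be sketched with a tiny handle class. DeferredHandle, get_deferred and fake_reader below are hypothetical stand-ins written for this illustration; they do not reproduce DeferredDatasetHandle, only the pattern of failing early on an unknown dataset while postponing the read itself.

```python
from __future__ import annotations

from typing import Any, Callable


class DeferredHandle:
    """Hypothetical stand-in for a deferred dataset handle."""

    def __init__(self, ref: str, reader: Callable[[str], Any]) -> None:
        self.ref = ref
        self._reader = reader

    def get(self) -> Any:
        # The (possibly expensive) read happens only here.
        return self._reader(self.ref)


def get_deferred(
    registry: dict[tuple, str], key: tuple, reader: Callable[[str], Any]
) -> DeferredHandle:
    # Immediate registry lookup: fail now if the dataset is unknown.
    if key not in registry:
        raise LookupError(f"No dataset found for {key!r}")
    return DeferredHandle(registry[key], reader)


reads: list[str] = []


def fake_reader(ref: str) -> str:
    # Records each read so the deferral is observable.
    reads.append(ref)
    return f"contents of {ref}"


handle = get_deferred({("calexp", 903334): "ref-1"}, ("calexp", 903334), fake_reader)
print(reads)  # nothing has been read yet
```

Calling handle.get() afterwards performs the read; this pattern is useful when many handles are created but only some datasets end up being loaded.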
 
 
- getURI(datasetRefOrType: DatasetRef | DatasetType | str, /, dataId: DataId | None = None, *, predict: bool = False, collections: Any = None, run: str | None = None, **kwargs: Any) -> ResourcePath
- Return the URI to the Dataset.
- Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
- When a DatasetRef is given, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataCoordinate
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- predict : bool
- If True, allow URIs to be returned for datasets that have not been written.
- collections : Any, optional
- Collections to be searched, overriding self.collections. Can be any of the types supported by the collections argument to butler construction.
- run : str, optional
- Run to use for predictions, overriding self.run.
- **kwargs
- Additional keyword arguments used to augment or construct a DataCoordinate. See DataCoordinate.standardize parameters.
 
- Returns:
- uri : lsst.resources.ResourcePath
- URI pointing to the Dataset within the datastore. If the Dataset does not exist in the datastore, and if predict is True, the URI will be a prediction and will include a URI fragment “#predicted”. If the datastore does not have entities that relate well to the concept of a URI, the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable.
 
- uri
- Raises:
- LookupError
- A URI has been requested for a dataset that does not exist and guessing is not allowed. 
- ValueError
- Raised if a resolved DatasetRef was passed as an input, but it differs from the one found in the registry.
- TypeError
- Raised if no collections were provided. 
- RuntimeError
- Raised if a URI is requested for a dataset that consists of multiple artifacts. 
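Because a predicted location is marked only by the “#predicted” fragment, calling code may want to detect and strip it before using the URI. The sketch below works on plain URI strings with the standard library; the helper names and the example path are hypothetical, and in real use the string would come from the returned ResourcePath.

```python
from urllib.parse import urlsplit


def is_predicted(uri: str) -> bool:
    """Return True if the URI carries the 'predicted' fragment."""
    return urlsplit(uri).fragment == "predicted"


def without_fragment(uri: str) -> str:
    """Drop everything from the first '#' onwards."""
    return uri.partition("#")[0]


# Hypothetical example URI, as it might look for a dataset not yet written.
example = "file:///repo/u/alice/DM-50000/a/flat_1.fits#predicted"
print(is_predicted(example), without_fragment(example))
```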
 
 
- abstract getURIs(datasetRefOrType: DatasetRef | DatasetType | str, /, dataId: DataId | None = None, *, predict: bool = False, collections: Any = None, run: str | None = None, **kwargs: Any) -> DatasetRefURIs
- Return the URIs associated with the dataset.
- Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
- When a DatasetRef is given, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataCoordinate
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- predict : bool
- If True, allow URIs to be returned for datasets that have not been written.
- collections : Any, optional
- Collections to be searched, overriding self.collections. Can be any of the types supported by the collections argument to butler construction.
- run : str, optional
- Run to use for predictions, overriding self.run.
- **kwargs
- Additional keyword arguments used to augment or construct a DataCoordinate. See DataCoordinate.standardize parameters.
 
- Returns:
- uris : DatasetRefURIs
- The URI to the primary artifact associated with this dataset (if the dataset was disassembled within the datastore this may be None), and the URIs to any components associated with the dataset artifact (can be empty if there are no components).
 
- abstract get_dataset(id: DatasetId, *, storage_class: str | StorageClass | None = None, dimension_records: bool = False, datastore_records: bool = False) -> DatasetRef | None
- Retrieve a Dataset entry.
- Parameters:
- id : DatasetId
- The unique identifier for the dataset.
- storage_class : str or StorageClass or None
- A storage class to use when creating the returned entry. If given, it must be compatible with the default storage class.
- dimension_records : bool, optional
- If True, the ref will be expanded and contain dimension records.
- datastore_records : bool, optional
- If True, the ref will contain associated datastore records.
 
- Returns:
- ref : DatasetRef or None
- A ref to the Dataset, or None if no matching Dataset was found.
 
- abstract get_dataset_type(name: str) -> DatasetType
- Get the DatasetType.
- Parameters:
- name : str
- Name of the type.
 
- Returns:
- type : DatasetType
- The DatasetType associated with the given name.
- Raises:
- lsst.daf.butler.MissingDatasetTypeError
- Raised if the requested dataset type has not been registered. 
 
- Notes
- This method handles component dataset types automatically, though most other operations do not.
- classmethod get_known_repos() -> set[str]
- Retrieve the set of known repository labels.
- Notes
- See ButlerRepoIndex for details on how the information is discovered.
- classmethod get_repo_uri(label: str, return_label: bool = False) -> ResourcePath
- Look up the label in a butler repository index.
- Parameters:
- label : str
- Label of the Butler repository to look up.
- return_label : bool, optional
- If label cannot be found in the repository index (either because the index is not defined or label is not in the index) and return_label is True, then return ResourcePath(label). If return_label is False (default), an exception will be raised instead.
 
- Returns:
- uri : lsst.resources.ResourcePath
- URI to the Butler repository associated with the given label or default value if it is provided.
- Raises:
- KeyError
- Raised if the label is not found in the index, or if an index is not defined, and return_label is False.
 
- Notes
- See ButlerRepoIndex for details on how the information is discovered.
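The fallback controlled by return_label can be sketched with a dictionary standing in for the repository index. The function lookup_repo and the index contents are hypothetical, and plain strings are returned where the real method returns a ResourcePath.

```python
from __future__ import annotations


def lookup_repo(index: dict[str, str] | None, label: str, return_label: bool = False) -> str:
    """Hypothetical sketch of the documented label lookup."""
    if index is not None and label in index:
        return index[label]
    if return_label:
        # Treat the label itself as the repository location.
        return label
    raise KeyError(f"No butler repository known for label {label!r}")


index = {"main": "/repo/main"}  # hypothetical index contents
print(lookup_repo(index, "main"))
print(lookup_repo(index, "/some/path/to/repo", return_label=True))
```

Passing return_label=True lets the same argument accept either a registered label or a direct path, which is convenient for command-line tools.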
- abstract import_(*, directory: str | ParseResult | ResourcePath | Path | None = None, filename: str | ParseResult | ResourcePath | Path | TextIO | None = None, format: str | None = None, transfer: str | None = None, skip_dimensions: set | None = None) -> None
- Import datasets into this repository that were exported from a different butler repository via export.
- Parameters:
- directory : ResourcePathExpression, optional
- Directory containing dataset files to import from. If None, filename and all dataset file paths specified therein must be absolute.
- filename : ResourcePathExpression or TextIO
- A stream or name of file that contains database information associated with the exported datasets, typically generated by export. If this is a string (name) or ResourcePath and is not an absolute path, it will first be looked for relative to directory and, if not found there, it will be looked for in the current working directory. Defaults to “export.{format}”.
- format : str, optional
- File format for filename. If None, the extension of filename will be used.
- transfer : str, optional
- Transfer mode passed to ingest.
- skip_dimensions : set, optional
- Names of dimensions that should be skipped and not imported.
 
- Raises:
- TypeError
- Raised if the set of arguments passed is inconsistent, or if the butler is read-only. 
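The lookup order described for a relative filename (first relative to directory, then the current working directory) can be sketched with pathlib. The helper resolve_export_file is hypothetical and takes the working directory as an argument so the example is reproducible; it is an illustration of the documented order, not the library's code.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_export_file(filename: str, directory: str | None, cwd: str) -> Path:
    """Hypothetical helper mirroring the documented lookup order."""
    path = Path(filename)
    if path.is_absolute():
        return path
    if directory is not None:
        candidate = Path(directory) / path
        if candidate.exists():
            return candidate
    # Fall back to the working directory (passed in for reproducibility).
    return Path(cwd) / path


with tempfile.TemporaryDirectory() as export_dir, tempfile.TemporaryDirectory() as cwd:
    (Path(export_dir) / "export.yaml").write_text("description: demo\n")
    found = resolve_export_file("export.yaml", export_dir, cwd)
    fallback = resolve_export_file("other.yaml", export_dir, cwd)
    found_in_directory = found.parent == Path(export_dir)
    fell_back_to_cwd = fallback.parent == Path(cwd)

print(found_in_directory, fell_back_to_cwd)
```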
 
 
- abstract ingest(*datasets: FileDataset, transfer: str | None = 'auto', record_validation_info: bool = True) -> None
- Store and register one or more datasets that already exist on disk.
- Parameters:
- *datasets : FileDataset
- Each positional argument is a struct containing information about a file to be ingested, including its URI (either absolute or relative to the datastore root, if applicable), a resolved DatasetRef, and optionally a formatter class or its fully-qualified string name. If a formatter is not provided, the formatter that would be used for put is assumed. On successful ingest all FileDataset.formatter attributes will be set to the formatter class used. FileDataset.path attributes may be modified to put paths in whatever the datastore considers a standardized form.
- transfer : str, optional
- If not None, must be one of ‘auto’, ‘move’, ‘copy’, ‘direct’, ‘split’, ‘hardlink’, ‘relsymlink’ or ‘symlink’, indicating how to transfer the file.
- record_validation_info : bool, optional
- If True, the default, the datastore can record validation information associated with the file. If False, the datastore will not attempt to track any information such as checksums or file sizes. This can be useful if such information is tracked in an external system or if the file is to be compressed in place. It is up to the datastore whether this parameter is relevant.
 
- Raises:
- TypeError
- Raised if the butler is read-only or if no run was provided. 
- NotImplementedError
- Raised if the Datastore does not support the given transfer mode.
- DatasetTypeNotSupportedError
- Raised if one or more files to be ingested have a dataset type that is not supported by the Datastore.
- FileNotFoundError
- Raised if one of the given files does not exist.
- FileExistsError
- Raised if transfer is not None but the (internal) location the file would be moved to is already occupied.
 
- Notes
- This operation is not fully exception safe: if a database operation fails, the given FileDataset instances may be only partially updated.
- It is atomic in terms of database operations (they will either all succeed or all fail) provided the database engine implements transactions correctly. It will attempt to be atomic in terms of filesystem operations as well, but this cannot be implemented rigorously for most datastores.
- static makeRepo(root: str | ParseResult | ResourcePath | Path, config: Config | str | None = None, dimensionConfig: Config | str | None = None, standalone: bool = False, searchPaths: list[str] | None = None, forceConfigRoot: bool = True, outfile: str | ParseResult | ResourcePath | Path | None = None, overwrite: bool = False) -> Config
- Create an empty data repository by adding a butler.yaml config to a repository root directory.
- Parameters:
- root : lsst.resources.ResourcePathExpression
- Path or URI to the root location of the new repository. Will be created if it does not exist.
- config : Config or str, optional
- Configuration to write to the repository, after setting any root-dependent Registry or Datastore config options. Cannot be a ButlerConfig or a ConfigSubset. If None, default configuration will be used. Root-dependent config options specified in this config are overwritten if forceConfigRoot is True.
- dimensionConfig : Config or str, optional
- Configuration for dimensions, will be used to initialize registry database.
- standalone : bool
- If True, write all expanded defaults, not just customized or repository-specific settings. This (mostly) decouples the repository from the default configuration, insulating it from changes to the defaults (which may be good or bad, depending on the nature of the changes). Future additions to the defaults will still be picked up when initializing Butlers to repos created with standalone=True.
- searchPaths : list of str, optional
- Directory paths to search when calculating the full butler configuration.
- forceConfigRoot : bool, optional
- If False, any values present in the supplied config that would normally be reset are not overridden and will appear directly in the output config. This allows non-standard overrides of the root directory for a datastore or registry to be given. If this parameter is True, the values for root will be forced into the resulting config if appropriate.
- outfile : lsst.resources.ResourcePathExpression, optional
- If not None, the output configuration will be written to this location rather than into the repository itself. Can be a URI string. Can refer to a directory that will be used to write butler.yaml.
- overwrite : bool, optional
- Create a new configuration file even if one already exists in the specified output location. Default is to raise an exception.
 
- Returns:
- Raises:
- ValueError
- Raised if a ButlerConfig or ConfigSubset is passed instead of a regular Config (as these subclasses would make it impossible to support standalone=False).
- FileExistsError
- Raised if the output config file already exists. 
- os.error
- Raised if the directory does not exist, exists but is not a directory, or cannot be created. 
 
 - Notes - When standalone=False (the default), the configuration search path (see ConfigSubset.defaultSearchPaths) that was used to construct the repository should also be used to construct any Butlers to avoid configuration inconsistencies.
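A minimal sketch of creating a repository only when no configuration is present yet, relying on the FileExistsError documented above. The helper name ensure_repo is hypothetical and not part of the library. The butler class is passed in as an argument (it is expected to be lsst.daf.butler.Butler) so that the sketch does not assume the LSST stack is importable.

```python
def ensure_repo(butler_cls, root, *, standalone=False):
    """Create a repository at ``root`` unless a config already exists.

    Hypothetical helper. ``butler_cls`` is assumed to be
    ``lsst.daf.butler.Butler`` (or a subclass exposing ``makeRepo``).
    Returns True if a new config was written, False if one was present.
    """
    try:
        # overwrite=False (the default) turns an existing config into an error.
        butler_cls.makeRepo(root, standalone=standalone, overwrite=False)
    except FileExistsError:
        return False
    return True
```

With the stack installed, this would be called as ensure_repo(Butler, root), where root is any path or URI accepted by makeRepo.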
 - abstract put(obj: Any, datasetRefOrType: DatasetRef | DatasetType | str, /, dataId: DataId | None = None, *, run: str | None = None, **kwargs: Any) DatasetRef¶
- Store and register a dataset. - Parameters:
- obj : object
- The dataset.
- datasetRefOrType : DatasetRef, DatasetType, or str
- When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof. If a fully resolved DatasetRef is given, the run and ID are used directly.
- dataId : dict or DataCoordinate
- A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the second argument.
- run : str, optional
- The name of the run the dataset should be added to, overriding self.run. Not used if a resolved DatasetRef is provided.
- **kwargs
- Additional keyword arguments used to augment or construct a DataCoordinate. See DataCoordinate.standardize parameters. Not used if a resolved DatasetRef is provided.
 
- Returns:
- ref : DatasetRef
- A reference to the stored dataset, updated with the correct id if given. 
 
- Raises:
- TypeError
- Raised if the butler is read-only or if no run has been provided. 
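A minimal sketch of the calling convention implied by the signature above: the object and the dataset type (or ref) are positional-only, the data ID may be given positionally or as dataId, and run is keyword-only. The helper name put_in_run is hypothetical, and butler is assumed to be a writeable Butler.

```python
def put_in_run(butler, obj, dataset_type_name, data_id, run):
    """Store ``obj`` under a dataset type name and data ID in ``run``.

    Hypothetical helper. ``butler`` is assumed to be a writeable Butler;
    per the documentation above, a read-only butler or a missing run
    results in TypeError.
    """
    # obj and the dataset type are positional-only; run must be a keyword.
    return butler.put(obj, dataset_type_name, data_id, run=run)
```

The data ID is a plain dict of dimension names to values; which keys are required depends on the dataset type and is not assumed here.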
 
 
 - abstract removeRuns(names: Iterable[str], unstore: bool = True) None¶
- Remove one or more RUN collections and the datasets within them. - Parameters:
- names : Iterable[str]
- The names of the collections to remove.
- unstore : bool, optional
- If True (default), delete datasets from all datastores in which they are present, and attempt to roll back the registry deletions if datastore deletions fail (which may not always be possible). If False, datastore records for these datasets are still removed, but any artifacts (e.g. files) will not be.
 
- Raises:
- TypeError
- Raised if one or more collections are not of type RUN.
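A minimal sketch of removing runs while leaving the stored artifacts in place (unstore=False), and of reporting the documented TypeError instead of propagating it. The helper name remove_runs_keep_artifacts is hypothetical, and the sketch makes no claim about what state is left behind when the TypeError is raised.

```python
def remove_runs_keep_artifacts(butler, names):
    """Remove RUN collections but leave their artifacts (e.g. files) alone.

    Hypothetical helper. Returns True on success and False if the butler
    reports that at least one named collection is not of type RUN.
    """
    names = list(names)  # materialize so a generator is only consumed once
    try:
        butler.removeRuns(names, unstore=False)
    except TypeError:
        return False
    return True
```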
 
 
 - abstract retrieveArtifacts(refs: Iterable[DatasetRef], destination: ResourcePathExpression, transfer: str = 'auto', preserve_path: bool = True, overwrite: bool = False) list[ResourcePath]¶
- Retrieve the artifacts associated with the supplied refs. - Parameters:
- refs : iterable of DatasetRef
- The datasets for which artifacts are to be retrieved. A single ref can result in multiple artifacts. The refs must be resolved.
- destination : lsst.resources.ResourcePath or str
- Location to write the artifacts.
- transfer : str, optional
- Method to use to transfer the artifacts. Must be one of the options supported by transfer_from(). “move” is not allowed.
- preserve_path : bool, optional
- If True, the full path of the artifact within the datastore is preserved. If False, only the final file component of the path is used.
- overwrite : bool, optional
- If True, allow transfers to overwrite existing files at the destination.
 
- Returns:
- targets : list of lsst.resources.ResourcePath
- URIs of file artifacts in destination location. Order is not preserved. 
 
 - Notes - For non-file datastores the artifacts written to the destination may not match the representation inside the datastore. For example a hierarchical data structure in a NoSQL database may well be stored as a JSON file. 
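A minimal sketch of retrieving artifacts into a flat destination directory. Because the returned order is not preserved, the result is sorted before use. The helper name retrieve_flat is hypothetical, and the "copy" transfer mode is an assumption about which modes are supported; only "move" is documented here as disallowed.

```python
def retrieve_flat(butler, refs, destination):
    """Copy artifacts for ``refs`` into ``destination`` without subpaths.

    Hypothetical helper. ``transfer="copy"`` is assumed to be a supported
    mode. Returns the target URIs as a sorted list of strings.
    """
    targets = butler.retrieveArtifacts(
        list(refs),
        destination,
        transfer="copy",
        preserve_path=False,  # keep only the final file component
        overwrite=False,      # do not overwrite existing files
    )
    # The documented order is not preserved, so sort for stable reporting.
    return sorted(str(uri) for uri in targets)
```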
 - abstract transaction() AbstractContextManager[None]¶
- Context manager supporting Butler transactions. Transactions can be nested.
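A minimal sketch of grouping several put calls inside one transaction. The helper name put_many is hypothetical. The sketch assumes that an exception escaping the with block causes the transaction to be rolled back, which the description above implies but does not spell out.

```python
def put_many(butler, items, run):
    """Store several datasets inside a single Butler transaction.

    Hypothetical helper. ``items`` is an iterable of
    ``(obj, dataset_type_name, data_id)`` tuples. Any exception raised by
    ``put`` propagates out of the ``with`` block to the caller.
    """
    refs = []
    with butler.transaction():
        for obj, dataset_type_name, data_id in items:
            refs.append(butler.put(obj, dataset_type_name, data_id, run=run))
    return refs
```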
 - abstract transfer_dimension_records_from(source_butler: LimitedButler | Butler, source_refs: Iterable[DatasetRef]) None¶
- Transfer dimension records to this Butler from another Butler. - Parameters:
- source_butler : LimitedButler or Butler
- Butler from which the records are to be transferred. If data IDs in source_refs are not expanded then this has to be a full Butler whose registry will be used to expand data IDs. If the source refs contain coordinates that are used to populate other records then this will also need to be a full Butler.
- source_refs : iterable of DatasetRef
- Datasets defined in the source butler whose dimension records should be transferred to this butler. In most circumstances, transfer is faster if the dataset refs are expanded.
 
 
 - abstract transfer_from(source_butler: LimitedButler, source_refs: Iterable[DatasetRef], transfer: str = 'auto', skip_missing: bool = True, register_dataset_types: bool = False, transfer_dimensions: bool = False, dry_run: bool = False) Collection[DatasetRef]¶
- Transfer datasets to this Butler from a run in another Butler. - Parameters:
- source_butler : LimitedButler
- Butler from which the datasets are to be transferred. If data IDs in source_refs are not expanded then this has to be a full Butler whose registry will be used to expand data IDs.
- source_refs : iterable of DatasetRef
- Datasets defined in the source butler that should be transferred to this butler. In most circumstances, transfer_from is faster if the dataset refs are expanded.
- transfer : str, optional
- Transfer mode passed to transfer_from.
- skip_missing : bool
- If True, datasets with no datastore artifact associated with them are not transferred. If False, a registry entry will be created even if no datastore record is created (and so will look equivalent to the dataset being unstored).
- register_dataset_types : bool
- If True, any missing dataset types are registered. Otherwise an exception is raised.
- transfer_dimensions : bool, optional
- If True, dimension record data associated with the new datasets will be transferred.
- dry_run : bool, optional
- If True, the transfer will be processed without any modifications made to the target butler and as if the target butler did not have any of the datasets.
 
- Returns:
- refs : list of DatasetRef
- The refs added to this Butler. 
 
 - Notes - The datastore artifact has to exist for a transfer to be made but non-existence is not an error. - Datasets that already exist in this run will be skipped. - The datasets are imported as part of a transaction, although dataset types are registered before the transaction is started. This means that it is possible for a dataset type to be registered even though transfer has failed. 
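A minimal sketch of previewing a transfer with dry_run=True before performing it. The helper name transfer_with_preview is hypothetical, and the "copy" transfer mode is an assumption about which modes are supported. Since a dry run behaves as if the target held none of the datasets, the preview may list datasets that the real transfer then skips.

```python
def transfer_with_preview(target_butler, source_butler, refs):
    """Run a dry-run transfer first, then the real one if it found anything.

    Hypothetical helper. ``transfer="copy"`` is assumed to be supported.
    Returns the refs reported by the real transfer, or an empty list if
    the dry run reported nothing to transfer.
    """
    refs = list(refs)  # both calls need to iterate over the same refs
    # dry_run=True makes no modifications to the target butler.
    preview = target_butler.transfer_from(source_butler, refs, dry_run=True)
    if len(preview) == 0:
        return []
    transferred = target_butler.transfer_from(
        source_butler,
        refs,
        transfer="copy",
        skip_missing=True,
        register_dataset_types=True,  # registered outside the transaction
        transfer_dimensions=True,
    )
    return list(transferred)
```

Registering dataset types happens before the transaction starts, so a failed transfer can still leave new dataset types registered in the target.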
 - abstract validateConfiguration(logFailures: bool = False, datasetTypeNames: Iterable[str] | None = None, ignore: Iterable[str] | None = None) None¶
- Validate butler configuration. Checks that each DatasetType can be stored in the Datastore. - Parameters:
- logFailures : bool, optional
- If True, output a log message for every validation error detected.
- datasetTypeNames : iterable of str, optional
- The DatasetType names that should be checked. This allows only a subset to be selected.
- ignore : iterable of str, optional
- Names of DatasetTypes to skip over. This can be used to skip known problems. If a named DatasetType corresponds to a composite, all components of that DatasetType will also be ignored.
 
- Raises:
- ButlerValidationError
- Raised if there is some inconsistency with how this Butler is configured.
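A minimal sketch of turning the validation exception into a boolean result while logging each failure. The helper name config_is_valid is hypothetical. The exception class is passed in as an argument (it is expected to be the ButlerValidationError named above) because its import location is not stated in this section.

```python
def config_is_valid(butler, validation_error, ignore=()):
    """Return True if the butler configuration validates, else False.

    Hypothetical helper. ``validation_error`` is assumed to be the
    ButlerValidationError class; ``ignore`` lists dataset type names to skip.
    """
    try:
        # logFailures=True asks for a log message per validation error.
        butler.validateConfiguration(logFailures=True, ignore=list(ignore))
    except validation_error:
        return False
    return True
```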