Note
The butler subcommands documented here are only those defined in the daf_butler package itself; downstream packages can implement additional subcommands via the plugin system described at The Butler Command. The best way to get a complete list of subcommands is to use butler --help.
butler
butler [OPTIONS] COMMAND [ARGS]...
Options
  --log-level <LEVEL|COMPONENT=LEVEL>
      The logging level. Supported levels are [CRITICAL|ERROR|WARNING|INFO|DEBUG].
  --long-log
      Make log messages appear in long format.
  --progress, --no-progress
      Show a progress bar for slow operations when possible.
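As a small illustration of the usage line above, the following sketch assembles a butler invocation in Python, placing the global options before the subcommand name. The helper name butler_command and the repository path /path/to/repo are hypothetical, and the command is only printed, not executed.

```python
import shlex


def butler_command(subcommand, *args, log_level=None, long_log=False):
    """Build an argv list following `butler [OPTIONS] COMMAND [ARGS]...`."""
    argv = ["butler"]
    # Global options come before the subcommand name.
    if log_level is not None:
        argv += ["--log-level", log_level]
    if long_log:
        argv.append("--long-log")
    argv.append(subcommand)
    argv += list(args)
    return argv


# "/path/to/repo" is a hypothetical repository location.
argv = butler_command("query-collections", "/path/to/repo",
                      log_level="DEBUG", long_log=True)
print(shlex.join(argv))
```

Keeping the command as a list and joining it only for display avoids quoting problems if it is later passed to a process launcher.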
associate
Add existing datasets to a tagged collection. The command searches for datasets matching the given options and adds them to the named COLLECTION.
COLLECTION is the collection the datasets should be associated with.
butler associate [OPTIONS] REPO COLLECTION
Options
  -d, --dataset-type <dataset_type>
      One or more glob-style expressions that fully or partially identify the dataset type names to be queried.
  --collections <collections>
      One or more expressions that fully or partially identify the collections to search for datasets. If not provided, all datasets are returned.
  --where <where>
      A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or a dimension name as a shortcut for the primary key column of a dimension table.
  --find-first
      For each result data ID, only yield one DatasetRef of each DatasetType, from the first collection in which a dataset of that dataset type appears (according to the order of ‘collections’ passed in). If used, ‘collections’ must specify at least one expression and must not contain wildcards.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  COLLECTION
      Required argument
See ‘butler --help’ for more options.
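Glob expressions and WHERE clauses contain characters that a shell would otherwise interpret, so they need quoting. The sketch below shows one way to build a safely quoted associate command line; every value (repository path, collection names, dataset type glob, and the instrument name in the WHERE expression) is an illustrative placeholder rather than something taken from a real repository.

```python
import shlex

# All values below are illustrative placeholders.
repo = "/path/to/repo"
target = "u/someone/tagged"

argv = [
    "butler", "associate", repo, target,
    "--dataset-type", "calexp*",          # glob-style dataset type expression
    "--collections", "HSC/runs/example",  # collection to search (no wildcards)
    "--where", "instrument = 'HSC'",      # SQL-like expression over dimensions
    "--find-first",
]

# shlex.join quotes the glob and the WHERE expression for a POSIX shell.
command = shlex.join(argv)
print(command)
```

Splitting the printed string again with shlex.split recovers the original argument list, which is a quick check that the quoting is lossless.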
certify-calibrations
Certify calibrations in a repository.
REPO is the URI or path to an existing data repository root or configuration file.
butler certify-calibrations [OPTIONS] REPO INPUT_COLLECTION OUTPUT_COLLECTION DATASET_TYPE_NAME
Options
  --begin-date <begin_date>
      ISO-8601 datetime (TAI) of the beginning of the validity range for the certified calibrations.
  --end-date <end_date>
      ISO-8601 datetime (TAI) of the end of the validity range for the certified calibrations.
  --search-all-inputs
      Search all children of INPUT_COLLECTION if it is a CHAINED collection, instead of just the most recent one.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  INPUT_COLLECTION
      Required argument
  OUTPUT_COLLECTION
      Required argument
  DATASET_TYPE_NAME
      Required argument
See ‘butler --help’ for more options.
config-dump
Dump either a subset or full Butler configuration to standard output.
REPO is the URI or path to an existing data repository root or configuration file.
butler config-dump [OPTIONS] REPO
Options
  -s, --subset <subset>
      Subset of a configuration to report. This can be any key in the hierarchy such as ‘.datastore.root’, where the leading ‘.’ specifies the delimiter for the hierarchy.
  -p, --searchpath <TEXT ...>
      Additional search paths to use for configuration overrides.
  --file <outfile>
      Print the (possibly-expanded) configuration for a repository to a file, or to stdout by default.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
See ‘butler --help’ for more options.
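To make the --subset key convention concrete, here is a minimal sketch of a delimiter-prefixed lookup on a plain nested dictionary. It is not the daf_butler implementation; the function name lookup_subset and the configuration contents are hypothetical, and treating any leading character as the delimiter is an assumption generalized from the ‘.datastore.root’ example.

```python
def lookup_subset(config, key):
    """Return the part of a nested dict named by a delimiter-prefixed key.

    The first character of `key` is taken as the delimiter, so in
    ".datastore.root" the "." separates the levels of the hierarchy.
    """
    delimiter, path = key[0], key[1:]
    node = config
    for part in path.split(delimiter):
        node = node[part]
    return node


# Hypothetical configuration contents, for illustration only.
config = {"datastore": {"root": "<butlerRoot>", "checksum": False}}

print(lookup_subset(config, ".datastore.root"))
print(lookup_subset(config, ".datastore"))
```

A key naming an intermediate level returns the whole sub-hierarchy, which matches the idea of reporting a "subset" of the configuration.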
config-validate
Validate the configuration files for a Gen3 Butler repository.
REPO is the URI or path to an existing data repository root or configuration file.
butler config-validate [OPTIONS] REPO
Options
  -q, --quiet
      Do not report individual failures.
  -d, --dataset-type <dataset_type>
      Specific DatasetType(s) to validate.
  -i, --ignore <TEXT ...>
      DatasetType(s) to ignore for validation.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
See ‘butler --help’ for more options.
convert
Convert one or more Butler gen 2 repositories into a gen 3 repository.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
This is a highly simplified interface that should only be used to convert suites of gen 2 repositories that contain at most one calibration repo and have no chained reruns. Custom scripts that call ConvertRepoTask should be used on more complex suites of repositories.
butler convert [OPTIONS] REPO
Options
  --gen2root <gen2root>
      Required. Root path of the gen 2 repo to be converted.
  --skymap-name <skymap_name>
      Name of the new gen3 skymap (e.g. ‘discrete/ci_hsc’).
  --skymap-config <skymap_config>
      Path to skymap config file defining the new gen3 skymap.
  --calibs <calibs>
      Path to the gen 2 calibration repo. It can be absolute or relative to gen2root.
  --reruns <TEXT ...>
      List of rerun paths to convert. Output collection names will be guessed, which can fail if the Gen2 repository paths do not follow a recognized convention. In this case, the command-line interface cannot be used.
  -t, --transfer <transfer>
      Mode to use to transfer files into the new repository.
      Options: auto | link | symlink | hardlink | copy | move | relsymlink | direct
  -j, --processes <processes>
      Number of processes to use.
  -C, --config-file <config_file>
      Path to a ConvertRepoConfig override to be included after the Instrument config overrides are applied.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
See ‘butler --help’ for more options.
create
Create an empty Gen3 Butler repository.
REPO is the URI or path to the new repository. Will be created if it does not exist.
butler create [OPTIONS] REPO
Options
  --seed-config <seed_config>
      Path to an existing YAML config file to apply (on top of defaults).
  --dimension-config <dimension_config>
      Path to an existing YAML config file with dimension configuration.
  --standalone
      Include all defaults in the config file in the repo, insulating the repo from changes in package defaults.
  --override
      Allow values in the supplied config to override all repo settings.
  -f, --outfile <outfile>
      Name of output file to receive repository configuration. Default is to write butler.yaml into the specified repo.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
See ‘butler --help’ for more options.
define-visits
Define visits from exposures in the butler registry.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
INSTRUMENT is the name or fully-qualified class name of an instrument.
butler define-visits [OPTIONS] REPO INSTRUMENT
Options
  -C, --config-file <config_file>
      Path to a pex_config override to be included after the Instrument config overrides are applied.
  --collections <TEXT ...>
      The collections to be searched (in order) when reading datasets.
  -j, --processes <processes>
      Number of processes to use.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  INSTRUMENT
      Required argument
See ‘butler --help’ for more options.
import
Import data into a butler repository.
REPO is the URI or path to the new repository. Will be created if it does not exist.
DIRECTORY is the folder containing dataset files.
butler import [OPTIONS] REPO DIRECTORY
Options
  -t, --transfer <transfer>
      The external data transfer mode.
      Options: auto | link | symlink | hardlink | copy | move | relsymlink | direct
  --export-file <export_file>
      Name for the file that contains database information associated with the exported datasets. If this is not an absolute path, does not exist in the current working directory, and --dir is provided, it is assumed to be in that directory. Defaults to “export.yaml”.
  -s, --skip-dimensions <TEXT ...>
      Dimensions that should be skipped during import.
  --reuse-ids
      Force re-use of imported dataset IDs for integer IDs.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  DIRECTORY
      Required argument
See ‘butler --help’ for more options.
ingest-raws
Ingest raw frames from a directory into the butler registry.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
LOCATIONS specifies files to ingest and/or locations to search for files.
butler ingest-raws [OPTIONS] REPO LOCATIONS ...
Options
  --regex <regex>
      Regex string used to find files in directories listed in LOCATIONS. Searches for fits files by default.
  -c, --config <TEXT=TEXT>
      Config override, as a key-value pair.
  -C, --config-file <config_file>
      Path to a pex_config override to be included after the Instrument config overrides are applied.
  --output-run <output_run>
      The name of the run datasets should be output to.
  -t, --transfer <transfer>
      The external data transfer mode.
      Options: auto | link | symlink | hardlink | copy | move | relsymlink | direct
  -j, --processes <processes>
      Number of processes to use.
  --ingest-task <ingest_task>
      The fully qualified class name of the ingest task to use.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  LOCATIONS
      Required argument(s)
See ‘butler --help’ for more options.
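The interaction between LOCATIONS and --regex can be pictured with the following sketch: entries that are files are kept as given, while directories are searched for names matching the regex. This is only an approximation of the described behavior. The function name find_files is hypothetical and the pattern used for "FITS files" is an assumption; the command's actual default may differ.

```python
import os
import re
import tempfile

# Assumed pattern for "FITS files"; the command's actual default may differ.
FITS_REGEX = r"\.fit[s]?$"


def find_files(locations, regex=FITS_REGEX):
    """Expand LOCATIONS: keep files as given, search directories with a regex."""
    pattern = re.compile(regex)
    found = []
    for location in locations:
        if os.path.isdir(location):
            for dirpath, _dirnames, filenames in os.walk(location):
                found += [os.path.join(dirpath, name)
                          for name in sorted(filenames) if pattern.search(name)]
        else:
            found.append(location)
    return found


# Demonstrate on a temporary directory plus one explicitly named file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.fits", "b.fit", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matched = [os.path.basename(p) for p in find_files([tmp, "/data/extra.fits"])]
    print(matched)
```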
make-discrete-skymap
Define a discrete skymap from calibrated exposures in the butler registry.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
INSTRUMENT is the fully-qualified name of an Instrument subclass.
butler make-discrete-skymap [OPTIONS] REPO INSTRUMENT
Options
  -C, --config-file <config_file>
      Path to a pex_config override to be included after the Instrument config overrides are applied.
  --collections <collections>
      Required. The collections to be searched (in order) when reading datasets. This includes the seed skymap if --append is specified.
  --skymap-id <skymap_id>
      The identifier of the skymap to write.
      Default: discrete
  --old-skymap-id <old_skymap_id>
      The identifier of the previous skymap to append to, if config.doAppend is True.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  INSTRUMENT
      Required argument
See ‘butler --help’ for more options.
prune-collection
Remove a collection and possibly prune datasets within it.
REPO is the URI or path to an existing data repository root or configuration file.
COLLECTION is the name of the collection to remove. If this is a tagged or chained collection, datasets within the collection are not modified unless --unstore is passed. If this is a run collection, --purge and --unstore must be passed, and all datasets in it are fully removed from the data repository.
butler prune-collection [OPTIONS] REPO COLLECTION
Options
  --purge
      Permit RUN collections to be removed, fully removing datasets within them. Requires --unstore as an added precaution against accidental deletion. Must not be passed if the collection is not a RUN.
  --unstore
      Remove all datasets in the collection from all datastores in which they appear.
  --unlink <unlink>
      Before removing the given collection, unlink it from this parent collection.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  COLLECTION
      Required argument
See ‘butler --help’ for more options.
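The rules relating --purge, --unstore, and the collection type can be summarized as a small check. The sketch below only encodes the rules as stated in this section; the function name check_prune_collection and its messages are hypothetical, and it does not reproduce the command's actual error handling.

```python
def check_prune_collection(collection_type, purge=False, unstore=False):
    """Return a list of problems with the flag combination (empty if none)."""
    problems = []
    if collection_type == "RUN":
        # RUN collections are only removed when both flags are given.
        if not (purge and unstore):
            problems.append("RUN collections require both --purge and --unstore")
    elif purge:
        problems.append("--purge must not be passed for a non-RUN collection")
    return problems


print(check_prune_collection("RUN", purge=True, unstore=True))
print(check_prune_collection("RUN", purge=True))
print(check_prune_collection("TAGGED", unstore=True))
print(check_prune_collection("CHAINED", purge=True, unstore=True))
```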
prune-datasets
Query for and remove one or more datasets from a collection and/or storage.
REPO is the URI or path to an existing data repository root or configuration file.
COLLECTIONS is one or more expressions that identify the collections to search for datasets. Glob-style expressions may be used but only if the --find-all flag is also passed.
butler prune-datasets [OPTIONS] REPO [COLLECTIONS] ...
Options
  --datasets <datasets>
      One or more glob-style expressions that identify the dataset types to be pruned.
  --find-all
      Purge the dataset results from all of the collections in which a dataset of that dataset type + data id combination appears. (By default only the first found dataset type + data id is purged, according to the order of COLLECTIONS passed in.)
  --where <where>
      A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or a dimension name as a shortcut for the primary key column of a dimension table.
  --disassociate <TAG>
      Disassociate pruned datasets from the given tagged collections. May not be used with --purge.
  --purge <RUN>
      Completely remove the dataset from the given RUN in the Registry. May not be used with --disassociate. Note, this may remove provenance information from datasets other than those provided, and should be used with extreme care.
  --unstore
      Remove these datasets from all datastores configured with this data repository. If --disassociate and --purge are not used then --unstore will be used by default. Note that --unstore will make it impossible to retrieve these datasets even via other collections. Datasets that are already not stored are ignored by this option.
  --dry-run
      Display the datasets that would be removed but do not remove them.
      Note that a dataset can be in collections other than its RUN-type collection, and removing it will remove it from all of them, even though the only one this will show is its RUN collection.
  --confirm, --no-confirm
      Print expected action and a confirmation prompt before executing. Default is --confirm.
  --quiet
      Makes output quiet. Implies --no-confirm. Requires that --dry-run not be passed.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  COLLECTIONS
      Optional argument(s)
See ‘butler --help’ for more options.
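Several of the options above constrain one another. As a reading aid, the sketch below applies those rules to a set of flags and returns the effective settings. The function name resolve_prune_options and the returned dictionary are hypothetical; only the rules themselves come from the option descriptions.

```python
def resolve_prune_options(disassociate=(), purge=None, unstore=False,
                          dry_run=False, confirm=True, quiet=False):
    """Apply the documented flag rules and return the effective settings."""
    if disassociate and purge:
        raise ValueError("--disassociate may not be used with --purge")
    if quiet and dry_run:
        raise ValueError("--quiet requires that --dry-run not be passed")
    if not disassociate and not purge:
        unstore = True   # --unstore is the default action
    if quiet:
        confirm = False  # --quiet implies --no-confirm
    return {"disassociate": tuple(disassociate), "purge": purge,
            "unstore": unstore, "dry_run": dry_run, "confirm": confirm}


# With no action flags, --unstore applies and confirmation stays on.
print(resolve_prune_options())
# "some/run" is a placeholder RUN collection name.
print(resolve_prune_options(purge="some/run", quiet=True))
```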
query-collections
Get the collections whose names match an expression.
REPO is the URI or path to an existing data repository root or configuration file.
GLOB is one or more glob-style expressions that fully or partially identify the collections to return.
butler query-collections [OPTIONS] REPO [GLOB] ...
Options
  --collection-type <collection_type>
      If provided, only list collections of this type.
      Options: RUN | TAGGED | CHAINED | CALIBRATION
  --chains <chains>
      Affects how results are presented. TABLE lists each collection in a row with chained collections’ children listed in a Definition column. TREE lists children below their parent in tree form. FLATTEN lists all collections, including child collections, in one list. Defaults to TABLE.
      Options: TABLE | TREE | FLATTEN
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  GLOB
      Optional argument(s)
See ‘butler --help’ for more options.
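The difference between the TREE and FLATTEN presentations can be sketched with a toy hierarchy. The collection names below are hypothetical, and the rendering (indentation, ordering, whether parents appear in the flattened list) is an assumption for illustration, not the command's actual output format.

```python
# Hypothetical collections: each CHAINED collection maps to its ordered children.
chains = {
    "defaults": ["runs/latest", "calib"],
    "runs/latest": ["runs/a", "runs/b"],
}


def flatten(name):
    """List a collection followed by all of its descendants, depth first."""
    result = [name]
    for child in chains.get(name, []):
        result += flatten(child)
    return result


def tree(name, depth=0):
    """Render children indented below their parent."""
    lines = ["  " * depth + name]
    for child in chains.get(name, []):
        lines += tree(child, depth + 1)
    return lines


print(flatten("defaults"))
print("\n".join(tree("defaults")))
```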
query-data-ids
List the data IDs in a repository.
REPO is the URI or path to an existing data repository root or configuration file.
DIMENSIONS are the keys of the data IDs to yield, such as exposure, instrument, or tract. Will be expanded to include any dependencies.
butler query-data-ids [OPTIONS] REPO [DIMENSIONS] ...
Options
  --collections <collections>
      One or more expressions that fully or partially identify the collections to search for datasets. If not provided, all datasets are returned.
  --datasets <datasets>
      An expression that fully or partially identifies dataset types that should constrain the yielded data IDs. For example, including “raw” here would constrain the yielded “instrument”, “exposure”, “detector”, and “physical_filter” values to only those for which at least one “raw” dataset exists in “collections”.
  --where <where>
      A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or a dimension name as a shortcut for the primary key column of a dimension table.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  DIMENSIONS
      Optional argument(s)
See ‘butler --help’ for more options.
query-dataset-types
Get the dataset types in a repository.
REPO is the URI or path to an existing data repository root or configuration file.
GLOB is one or more glob-style expressions that fully or partially identify the dataset types to return.
butler query-dataset-types [OPTIONS] REPO [GLOB] ...
Options
  -v, --verbose
      Include dataset type name, dimensions, and storage class in output.
  --components, --no-components
      For --components, apply all expression patterns to component dataset type names as well. For --no-components, never apply patterns to components. Default (where neither is specified) is to apply patterns to components only if their parent datasets were not matched by the expression. Fully-specified component datasets (str or DatasetType instances) are always included.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  GLOB
      Optional argument(s)
See ‘butler --help’ for more options.
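For readers unfamiliar with glob-style expressions, the sketch below filters a list of names with Python's fnmatch module. It is only an analogy: the dataset type names are hypothetical, the matching performed by butler itself may differ in detail, and returning every name when no GLOB is given is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical dataset type names registered in a repository.
dataset_types = ["raw", "calexp", "calexp_camera", "deepCoadd", "deepCoadd_calexp"]


def match_globs(names, globs):
    """Return names matching any glob; with no globs, return everything."""
    if not globs:
        return list(names)
    return [n for n in names if any(fnmatchcase(n, g) for g in globs)]


print(match_globs(dataset_types, ["calexp*"]))
print(match_globs(dataset_types, ["*Coadd*", "raw"]))
```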
query-datasets
List the datasets in a repository.
REPO is the URI or path to an existing data repository root or configuration file.
GLOB is one or more glob-style expressions that fully or partially identify the dataset type names to be queried.
butler query-datasets [OPTIONS] REPO [GLOB] ...
Options
  --collections <collections>
      One or more expressions that fully or partially identify the collections to search for datasets. If not provided, all datasets are returned.
  --where <where>
      A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or a dimension name as a shortcut for the primary key column of a dimension table.
  --find-first
      For each result data ID, only yield one DatasetRef of each DatasetType, from the first collection in which a dataset of that dataset type appears (according to the order of ‘collections’ passed in). If used, ‘collections’ must specify at least one expression and must not contain wildcards.
  --show-uri
      Show the dataset URI in results.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  GLOB
      Optional argument(s)
See ‘butler --help’ for more options.
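The --find-first behavior depends on the order in which collections are given. The toy model below illustrates that rule with in-memory data; it is not how the registry performs the search. Collection names and contents are hypothetical, and data IDs are simplified to integers.

```python
# Hypothetical contents: collection name -> set of (dataset type, data ID).
contents = {
    "runs/b": {("calexp", 1), ("calexp", 2)},
    "runs/a": {("calexp", 2), ("calexp", 3)},
}


def find_first(collections):
    """Yield (collection, dataset type, data ID), keeping only the first
    collection in which each (dataset type, data ID) pair appears."""
    seen = set()
    for collection in collections:  # search order matters
        for key in sorted(contents[collection]):
            if key not in seen:
                seen.add(key)
                yield (collection, *key)


print(list(find_first(["runs/b", "runs/a"])))
print(list(find_first(["runs/a", "runs/b"])))
```

Reversing the collection order changes which collection supplies data ID 2, which is why wildcards (whose expansion order is not fixed by the user) are not accepted together with this option.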
query-dimension-records
Query for dimension information.
REPO is the URI or path to an existing data repository root or configuration file.
ELEMENT is the dimension element to obtain.
butler query-dimension-records [OPTIONS] REPO ELEMENT
Options
  --datasets <datasets>
      An expression that fully or partially identifies dataset types that should constrain the yielded records. Only affects results when used with --collections.
  --collections <collections>
      One or more expressions that fully or partially identify the collections to search for datasets. If not provided, all datasets are returned. Only affects results when used with --datasets.
  --where <where>
      A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or a dimension name as a shortcut for the primary key column of a dimension table.
  --no-check
      Don’t check the query before execution. By default the query is checked before it is executed; this may reject some valid queries that resemble common mistakes.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  ELEMENT
      Required argument
See ‘butler --help’ for more options.
register-dcr-subfilters
Construct a set of subfilters for chromatic modeling and add them to a registry.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
NUM_SUBFILTERS is the number of subfilters to be used for chromatic modeling.
BAND_NAMES are the names of the bands to define chromatic subfilters for in the registry. Each band will have the same number of subfilters defined, for example ‘g0’, ‘g1’, and ‘g2’ for three subfilters and band ‘g’.
butler register-dcr-subfilters [OPTIONS] REPO NUM_SUBFILTERS BAND_NAMES ...
Options
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  NUM_SUBFILTERS
      Required argument
  BAND_NAMES
      Required argument(s)
See ‘butler --help’ for more options.
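The naming pattern in the ‘g0’, ‘g1’, ‘g2’ example can be reproduced with a few lines of Python. This sketch only mirrors the labels used in that example; the function name subfilter_names is hypothetical, and how the registry actually stores subfilter records is not shown here.

```python
def subfilter_names(num_subfilters, band_names):
    """Label subfilters as the band name followed by a zero-based index."""
    return {band: [f"{band}{i}" for i in range(num_subfilters)]
            for band in band_names}


# Three subfilters for each of two bands.
names = subfilter_names(3, ["g", "r"])
print(names)
```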
register-instrument
Add an instrument to the data repository.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
INSTRUMENT is the fully-qualified name of an Instrument subclass.
butler register-instrument [OPTIONS] REPO INSTRUMENT ...
Arguments
  REPO
      Required argument
  INSTRUMENT
      Required argument(s)
See ‘butler --help’ for more options.
register-skymap
Make a SkyMap and add it to a repository.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
butler register-skymap [OPTIONS] REPO
Options
  -c, --config <TEXT=TEXT>
      Config override, as a key-value pair.
  -C, --config-file <config_file>
      Path to a config overrides file.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
See ‘butler --help’ for more options.
remove-dataset-type
Remove a dataset type definition from a repository.
REPO is the URI or path to an existing data repository root or configuration file.
butler remove-dataset-type [OPTIONS] REPO DATASET_TYPE_NAME
Arguments
  REPO
      Required argument
  DATASET_TYPE_NAME
      Required argument
See ‘butler --help’ for more options.
write-curated-calibrations
Add an instrument’s curated calibrations to the data repository.
REPO is the URI or path to the gen3 repository. Will be created if it does not already exist.
INSTRUMENT is the fully-qualified name of an Instrument subclass.
butler write-curated-calibrations [OPTIONS] REPO INSTRUMENT
Options
  --collection <collection>
      Name of the calibration collection that associates datasets with validity ranges.
  --label <labels>
      Extra strings to include (with automatic delimiters) in all RUN collection names, as well as the calibration collection name if it is not provided via --collection.
  -@, --options-file <options_file>
      URI to YAML file containing overrides of command line options. The YAML should be organized as a hierarchy with subcommand names at the top level and options for that subcommand below.
Arguments
  REPO
      Required argument
  INSTRUMENT
      Required argument
See ‘butler --help’ for more options.