pipetask

Implement pipetask command line.

pipetask [OPTIONS] COMMAND [ARGS]...

Options

--log-level <LEVEL|COMPONENT=LEVEL>

The logging level. Without an explicit logger name, this will only affect the default root loggers (lsst). To modify the root logger, use '.=LEVEL'. Supported levels are [CRITICAL|ERROR|WARNING|INFO|VERBOSE|DEBUG|TRACE].

--long-log

Make log messages appear in long format.

--log-file <log_file>

File(s) to write log messages to. If the path ends with '.json', JSON log records will be written; otherwise formatted text log records will be written. If the file already exists, records will be appended.

--log-tty, --no-log-tty

Log to the terminal (default). With --no-log-tty, logging to the terminal is disabled.

--log-label <log_label>

Keyword=value pairs to add to MDC of log records.
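
For example, a hypothetical invocation that raises the logging level for one component, writes JSON log records to a file, and disables terminal logging (the component name, file name, and subcommand arguments are illustrative):

pipetask --log-level lsst.ctrl.mpexec=DEBUG --log-file run_logs.json --no-log-tty build -p my_pipeline.yaml --show pipeline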

aggregate-reports

Aggregate pipetask report output on disjoint data-id groups into one Summary over common tasks and datasets. Intended for use when the same pipeline has been run over all groups (i.e., to aggregate all reports for a given step). This functionality is only compatible with reports from the QuantumProvenanceGraph, so the reports must be run over multiple groups or with the --force-v2 option.

Save the report as a file (--full-output-filename) or print it to stdout (default). If the terminal is overwhelmed with data_ids from failures, try the --brief option.

FILENAMES are the space-separated paths to JSON files created by pipetask report.

pipetask aggregate-reports [OPTIONS] [FILENAMES]...

Options

--full-output-filename <full_output_filename>

Output report as a file with this name (JSON).

--brief

Only show counts in report (a brief summary). Note that counts are also printed to the screen when using the --full-output-filename option.

Arguments

FILENAMES

Optional argument(s)
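
For example, to aggregate two per-group report files into a single JSON summary (file names are illustrative):

pipetask aggregate-reports --full-output-filename full_report.json group1_report.json group2_report.json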

See 'pipetask --help' for more options.

build

Build and optionally save pipeline definition.

This does not require input data to be specified.

pipetask build [OPTIONS]

Options

--show <ITEM|ITEM=VALUE>

Dump various info to standard output. Possible items are: config, config=[Task::]<PATTERN> or config=[Task::]<PATTERN>:NOIGNORECASE to dump configuration fields possibly matching the given pattern and/or task label; history=<FIELD> to dump configuration history for a field, where the field name is specified as [Task::]<PATTERN>; dump-config, dump-config=Task to dump the complete configuration for a task given its label, or for all tasks; pipeline to show pipeline composition; graph to show information about quanta; workflow to show information about quanta and their dependencies; tasks to show task composition; uri to show predicted dataset URIs of quanta; pipeline-graph for a text-based visualization of the pipeline (tasks and dataset types); task-graph for a text-based visualization of just the tasks. With -b, pipeline-graph and task-graph include additional information.

-p, --pipeline <pipeline>

Location of a pipeline definition file in YAML format.

-t, --task <TASK[:LABEL]>

Task name to add to the pipeline; must be a fully qualified task name. The task name can be followed by a colon and a label name; if no label is given, the task base name (class name) is used as the label.

--delete <LABEL>

Delete task with given label from pipeline.

-c, --config <LABEL:NAME=VALUE>

Config override, as a key-value pair.

-C, --config-file <LABEL:FILE>

Configuration override file(s), applies to a task with a given label.

--order-pipeline

Order tasks in the pipeline based on their data dependencies; ordering is performed as the last step before saving or executing the pipeline.

-s, --save-pipeline <save_pipeline>

Location for storing resulting pipeline definition in YAML format.

--pipeline-dot <pipeline_dot>

Location for storing GraphViz DOT representation of a pipeline.

--instrument <instrument>

Add an instrument which will be used to load config overrides when defining a pipeline. This must be the fully qualified class name.

-b, --butler-config <butler_config>

Location of the gen3 butler/registry config file.

-@, --options-file <options_file>

URI to a YAML file containing overrides of command line options. The YAML should be organized as a hierarchy, with subcommand names at the top level and options for that subcommand below.

Notes:

--task, --delete, --config, --config-file, and --instrument action options can appear multiple times; all values are used, in order left to right.

FILE reads command-line options from the specified file. Data may be distributed among multiple lines (e.g. one option per line). Data after # is treated as a comment and ignored. Blank lines and lines starting with # are ignored.
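
For example, a sketch of adding a config override and saving the expanded pipeline definition (the pipeline file, task label, and field name are illustrative):

pipetask build -p my_pipeline.yaml -c taskA:doWrite=False -s expanded_pipeline.yaml --show pipeline

The same options can be kept in a YAML file and passed with -@. A hypothetical sketch of such a file, assuming keys named after the corresponding command line options:

cat > options.yaml <<EOF
build:
  pipeline: my_pipeline.yaml
  save_pipeline: expanded_pipeline.yaml
EOF
pipetask build -@ options.yaml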

See 'pipetask --help' for more options.

cleanup

Remove non-members of CHAINED collections.

Removes collections that start with the same name as a CHAINED collection but are not members of that collection.

pipetask cleanup [OPTIONS] COLLECTION

Options

-b, --butler-config <butler_config>

Location of the gen3 butler/registry config file.

--confirm, --no-confirm

Print expected action and a confirmation prompt before executing. Default is --confirm.

Arguments

COLLECTION

Required argument
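
For example (repository path and collection name are illustrative):

pipetask cleanup -b /repo/main u/user/my-chain

By default the expected removals are printed with a confirmation prompt; pass --no-confirm to skip the prompt.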

See 'pipetask --help' for more options.

pre-exec-init-qbb

Execute pre-exec-init on Quantum-Backed Butler.

REPO is a URI to a butler configuration (the location of the butler/registry config file) that is used to configure the datastore of the quantum-backed butler.

QGRAPH is the path to a serialized Quantum Graph file.

pipetask pre-exec-init-qbb [OPTIONS] REPO QGRAPH

Options

--config-search-path <PATH>

Additional search paths for butler configuration.

--qgraph-id <qgraph_id>

Quantum graph identifier; if specified, it must match the identifier of the graph loaded from a file. Ignored if the graph is not loaded from a file.

--coverage

Enable coverage output (requires coverage package).

--cov-report, --no-cov-report

If coverage is enabled, controls whether to produce an HTML coverage report.

--cov-packages <cov_packages>

Python packages to restrict coverage to. If none are provided, runs coverage on all packages.

Arguments

REPO

Required argument

QGRAPH

Required argument
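
For example (file names are illustrative):

pipetask pre-exec-init-qbb butler.yaml my_graph.qgraph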

See 'pipetask --help' for more options.

purge

Remove a CHAINED collection and its contained collections.

COLLECTION is the name of the CHAINED collection to purge. It must not be a child of any other CHAINED collection.

Child collections must be members of exactly one collection.

The collections that will be removed will be printed, and there will be an option to continue or abort (unless --no-confirm is used).

pipetask purge [OPTIONS] COLLECTION

Options

-b, --butler-config <butler_config>

Location of the gen3 butler/registry config file.

--confirm, --no-confirm

Print expected action and a confirmation prompt before executing. Default is --confirm.

--recursive

If the parent CHAINED collection has child CHAINED collections, recursively search the children and remove nested chains that start with the parent's name.

Arguments

COLLECTION

Required argument
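
For example, to purge a chained collection along with nested chains that start with its name (repository path and collection name are illustrative):

pipetask purge -b /repo/main --recursive u/user/my-chain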

See 'pipetask --help' for more options.

qgraph

Build and optionally save quantum graph.

pipetask qgraph [OPTIONS]

Options

--show <ITEM|ITEM=VALUE>

Dump various info to standard output. Possible items are: config, config=[Task::]<PATTERN> or config=[Task::]<PATTERN>:NOIGNORECASE to dump configuration fields possibly matching the given pattern and/or task label; history=<FIELD> to dump configuration history for a field, where the field name is specified as [Task::]<PATTERN>; dump-config, dump-config=Task to dump the complete configuration for a task given its label, or for all tasks; pipeline to show pipeline composition; graph to show information about quanta; workflow to show information about quanta and their dependencies; tasks to show task composition; uri to show predicted dataset URIs of quanta; pipeline-graph for a text-based visualization of the pipeline (tasks and dataset types); task-graph for a text-based visualization of just the tasks. With -b, pipeline-graph and task-graph include additional information.

-p, --pipeline <pipeline>

Location of a pipeline definition file in YAML format.

-t, --task <TASK[:LABEL]>

Task name to add to the pipeline; must be a fully qualified task name. The task name can be followed by a colon and a label name; if no label is given, the task base name (class name) is used as the label.

--delete <LABEL>

Delete task with given label from pipeline.

-c, --config <LABEL:NAME=VALUE>

Config override, as a key-value pair.

-C, --config-file <LABEL:FILE>

Configuration override file(s), applies to a task with a given label.

--order-pipeline

Order tasks in the pipeline based on their data dependencies; ordering is performed as the last step before saving or executing the pipeline.

-s, --save-pipeline <save_pipeline>

Location for storing resulting pipeline definition in YAML format.

--pipeline-dot <pipeline_dot>

Location for storing GraphViz DOT representation of a pipeline.

--instrument <instrument>

Add an instrument which will be used to load config overrides when defining a pipeline. This must be the fully qualified class name.

-b, --butler-config <butler_config>

Location of the gen3 butler/registry config file.

-g, --qgraph <qgraph>

Location for a serialized quantum graph definition (pickle file). If this option is given, input data options and pipeline-building options cannot be used. Can be a URI.

--qgraph-id <qgraph_id>

Quantum graph identifier; if specified, it must match the identifier of the graph loaded from a file. Ignored if the graph is not loaded from a file.

--qgraph-node-id <qgraph_node_id>

Only load a specified set of nodes when the graph is loaded from a file; nodes are identified by UUID values, and one or more comma-separated UUIDs are accepted. By default all nodes are loaded. Ignored if the graph is not loaded from a file.

--qgraph-datastore-records

Include datastore records in the generated quantum graph; these records are used by a quantum-backed butler.

--skip-existing-in <COLLECTION>

If all Quantum outputs already exist in the specified list of collections then that Quantum will be excluded from the QuantumGraph.

--skip-existing

This option is equivalent to --skip-existing-in with the name of the output RUN collection. If both --skip-existing-in and --skip-existing are given, the output RUN collection is appended to the list of collections.

--clobber-outputs

Remove outputs of failed quanta from the output run when they would block the execution of new quanta with the same data ID (or assume that this will be done, if just building a QuantumGraph). Does nothing if --extend-run is not passed.

-q, --save-qgraph <save_qgraph>

URI location for storing a serialized quantum graph definition (pickle file).

--save-single-quanta <save_single_quanta>

Format string of locations for storing individual quantum graph definitions (pickle files). The curly-brace placeholder {} in the input string will be replaced by a quantum number. Can be a URI.

--qgraph-dot <qgraph_dot>

Location for storing GraphViz DOT representation of a quantum graph.

--summary <summary>

Location for storing job summary (JSON file). Note that the structure of this file may not be stable.

--save-execution-butler <save_execution_butler>

Export location for an execution-specific butler after making the QuantumGraph.

--clobber-execution-butler

When creating the execution butler, overwrite any existing products.

--target-datastore-root <target_datastore_root>

Root directory for datastore of execution butler. Default is to use the original datastore.

--transfer <transfer>

Data transfer mode for the execution butler datastore. Defaults to 'copy' if --target-datastore-root is provided.

Options:

auto | link | symlink | hardlink | copy | move | relsymlink | direct

--dataset-query-constraint <dataset_query_constraint>

When constructing a quantum graph, constrain by the pre-existence of specified dataset types. Valid values are 'all' to constrain by all input dataset types in the pipeline, 'off' to not consider dataset type existence as a constraint, or a single dataset type name or comma-separated list of dataset type names.

--show-qgraph-header

Print the headerData for the Quantum Graph to the console.

--mock

Mock pipeline execution.

--mock-failure <LABEL:EXCEPTION:WHERE>

Specifications for tasks that should be configured to fail when mocking execution. This is a colon-separated 3-tuple or 4-tuple, where the first entry is the task label, the second is the fully-qualified exception type (empty for ValueError), and the third is a string (which typically needs to be quoted to be passed as one argument value by the shell) of the form passed to --where, indicating which data IDs should fail. The final optional term is the memory 'required' by the task (with units recognized by astropy), which causes the error to occur only if the 'available' memory (according to ExecutionResources.max_mem) is less than this value. Note that actual memory usage is irrelevant here; this is all mock behavior.

--unmocked-dataset-types <COLLECTION>

Names of input dataset types that should not be mocked.

--coverage

Enable coverage output (requires coverage package).

--cov-report, --no-cov-report

If coverage is enabled, controls whether to produce an HTML coverage report.

--cov-packages <cov_packages>

Python packages to restrict coverage to. If none are provided, runs coverage on all packages.

-b, --butler-config <butler_config>

Required. Location of the gen3 butler/registry config file.

-i, --input <COLLECTION>

Comma-separated names of the input collection(s).

-o, --output <COLL>

Name of the output CHAINED collection. This may either be an existing CHAINED collection to use as both input and output (incompatible with --input), or a new CHAINED collection created to include all inputs (requires --input). In both cases, the collection's children will start with an output RUN collection that directly holds all new datasets (see --output-run).

--output-run <COLL>

Name of the new output RUN collection. If not provided then --output must be provided and a new RUN collection will be created by appending a timestamp to the value passed with --output. If this collection already exists then --extend-run must be passed.

--extend-run

Instead of creating a new RUN collection, insert datasets into either the one given by --output-run (if provided) or the first child collection of --output (which must be of type RUN). This also enables the --skip-existing option when building a graph. When executing a graph, this option skips quanta with all existing outputs.

--replace-run

Before creating a new RUN collection in an existing CHAINED collection, remove the first child collection (which must be of type RUN). This can be used to repeatedly write to the same (parent) collection during development, but it does not delete the datasets associated with the replaced run unless --prune-replaced is also passed. Requires --output, and incompatible with --extend-run.

--prune-replaced <prune_replaced>

Delete the datasets in the collection replaced by --replace-run, either just from the datastore ('unstore') or by removing them and the RUN completely ('purge'). Requires --replace-run.

Options:

unstore | purge

-d, --data-query <QUERY>

User data selection expression.

--rebase

Reset output collection chain if it is inconsistent with --inputs.

-@, --options-file <options_file>

URI to a YAML file containing overrides of command line options. The YAML should be organized as a hierarchy, with subcommand names at the top level and options for that subcommand below.

Notes:

--task, --delete, --config, --config-file, and --instrument action options can appear multiple times; all values are used, in order left to right.

FILE reads command-line options from the specified file. Data may be distributed among multiple lines (e.g. one option per line). Data after # is treated as a comment and ignored. Blank lines and lines starting with # are ignored.
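
For example, a sketch of building and saving a quantum graph over a restricted data query (the repository path, pipeline file, collection names, and query values are illustrative):

pipetask qgraph -b /repo/main -p my_pipeline.yaml -i HSC/defaults -o u/user/test -d "instrument = 'HSC' AND visit = 12345" -q my_graph.qgraph

A mocked build that configures one task to raise ValueError on a particular data ID can use the LABEL:EXCEPTION:WHERE form described under --mock-failure (task label and where clause are illustrative):

pipetask qgraph -b /repo/main -p my_pipeline.yaml -i HSC/defaults -o u/user/test --mock --mock-failure "taskA::visit = 12345" -q mock_graph.qgraph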

See 'pipetask --help' for more options.

report

Summarize the state of executed quantum graph(s), with counts of failed, successful and expected quanta, as well as counts of output datasets and their query (visible/shadowed) states. Analyze one or more attempts at the same processing on the same dataquery-identified “group” and resolve recoveries and persistent failures. Identify mismatch errors between attempts.

REPO is a URI to a butler configuration that is used to configure the datastore of the quantum-backed butler.

Save the report as a file (--full-output-filename) or print it to stdout (default). If the terminal is overwhelmed with data_ids from failures, try the --brief option.

The --collections and --where options are for use in lsst.daf.butler.queryDatasets if paring down the collections would be useful. Pass collections in order of most to least recent. By default, the collections and query will be taken from the graphs.

QGRAPHS is a sequence of paths to serialized Quantum Graphs that have been executed and are to be analyzed. Pass the graphs in order of first to last executed.

pipetask report [OPTIONS] REPO [QGRAPHS]...

Options

--collections <collections>

One or more expressions that fully or partially identify the collections to search for datasets. If not provided, all datasets are returned.

--where <where>

A string expression similar to a SQL WHERE clause.

--full-output-filename <full_output_filename>

Output report as a file with this name. For pipetask report on one graph, this should be a YAML file. For multiple graphs, or when using the --force-v2 option, this should be a JSON file. We will be deprecating the single-graph-only (QuantumGraphExecutionReport) option soon.

--logs, --no-logs

Get butler log datasets for extra information.

--brief

Only show counts in report (a brief summary). Note that counts are also printed to the screen when using the --full-output-filename option.

--curse-failed-logs

If log datasets are missing in v2 (QuantumProvenanceGraph), mark them as cursed.

--force-v2

Use the QuantumProvenanceGraph instead of the QuantumGraphExecutionReport, even when there is only one qgraph. Otherwise, the QuantumGraphExecutionReport will run on one graph by default.

Arguments

REPO

Required argument

QGRAPHS

Optional argument(s)
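
For example, to summarize two attempts at the same processing and save the full report (repository path and file names are illustrative):

pipetask report --full-output-filename report.json /repo/main attempt1.qgraph attempt2.qgraph

If the output is overwhelming, --brief prints only the counts:

pipetask report --brief /repo/main attempt1.qgraph attempt2.qgraph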

See 'pipetask --help' for more options.

run

Build and execute pipeline and quantum graph.

pipetask run [OPTIONS]

Options

--debug

Enable debugging output using lsstDebug facility (imports debug.py).

--show <ITEM|ITEM=VALUE>

Dump various info to standard output. Possible items are: config, config=[Task::]<PATTERN> or config=[Task::]<PATTERN>:NOIGNORECASE to dump configuration fields possibly matching the given pattern and/or task label; history=<FIELD> to dump configuration history for a field, where the field name is specified as [Task::]<PATTERN>; dump-config, dump-config=Task to dump the complete configuration for a task given its label, or for all tasks; pipeline to show pipeline composition; graph to show information about quanta; workflow to show information about quanta and their dependencies; tasks to show task composition; uri to show predicted dataset URIs of quanta; pipeline-graph for a text-based visualization of the pipeline (tasks and dataset types); task-graph for a text-based visualization of just the tasks. With -b, pipeline-graph and task-graph include additional information.

-p, --pipeline <pipeline>

Location of a pipeline definition file in YAML format.

-t, --task <TASK[:LABEL]>

Task name to add to the pipeline; must be a fully qualified task name. The task name can be followed by a colon and a label name; if no label is given, the task base name (class name) is used as the label.

--delete <LABEL>

Delete task with given label from pipeline.

-c, --config <LABEL:NAME=VALUE>

Config override, as a key-value pair.

-C, --config-file <LABEL:FILE>

Configuration override file(s), applies to a task with a given label.

--order-pipeline

Order tasks in the pipeline based on their data dependencies; ordering is performed as the last step before saving or executing the pipeline.

-s, --save-pipeline <save_pipeline>

Location for storing resulting pipeline definition in YAML format.

--pipeline-dot <pipeline_dot>

Location for storing GraphViz DOT representation of a pipeline.

--instrument <instrument>

Add an instrument which will be used to load config overrides when defining a pipeline. This must be the fully qualified class name.

-b, --butler-config <butler_config>

Location of the gen3 butler/registry config file.

-g, --qgraph <qgraph>

Location for a serialized quantum graph definition (pickle file). If this option is given, input data options and pipeline-building options cannot be used. Can be a URI.

--qgraph-id <qgraph_id>

Quantum graph identifier; if specified, it must match the identifier of the graph loaded from a file. Ignored if the graph is not loaded from a file.

--qgraph-node-id <qgraph_node_id>

Only load a specified set of nodes when the graph is loaded from a file; nodes are identified by UUID values, and one or more comma-separated UUIDs are accepted. By default all nodes are loaded. Ignored if the graph is not loaded from a file.

--qgraph-datastore-records

Include datastore records in the generated quantum graph; these records are used by a quantum-backed butler.

--skip-existing-in <COLLECTION>

If all Quantum outputs already exist in the specified list of collections then that Quantum will be excluded from the QuantumGraph.

--skip-existing

This option is equivalent to --skip-existing-in with the name of the output RUN collection. If both --skip-existing-in and --skip-existing are given, the output RUN collection is appended to the list of collections.

--clobber-outputs

Remove outputs of failed quanta from the output run when they would block the execution of new quanta with the same data ID (or assume that this will be done, if just building a QuantumGraph). Does nothing if --extend-run is not passed.

-q, --save-qgraph <save_qgraph>

URI location for storing a serialized quantum graph definition (pickle file).

--save-single-quanta <save_single_quanta>

Format string of locations for storing individual quantum graph definitions (pickle files). The curly-brace placeholder {} in the input string will be replaced by a quantum number. Can be a URI.

--qgraph-dot <qgraph_dot>

Location for storing GraphViz DOT representation of a quantum graph.

--summary <summary>

Location for storing job summary (JSON file). Note that the structure of this file may not be stable.

--save-execution-butler <save_execution_butler>

Export location for an execution-specific butler after making the QuantumGraph.

--clobber-execution-butler

When creating the execution butler, overwrite any existing products.

--target-datastore-root <target_datastore_root>

Root directory for datastore of execution butler. Default is to use the original datastore.

--transfer <transfer>

Data transfer mode for the execution butler datastore. Defaults to 'copy' if --target-datastore-root is provided.

Options:

auto | link | symlink | hardlink | copy | move | relsymlink | direct

--dataset-query-constraint <dataset_query_constraint>

When constructing a quantum graph, constrain by the pre-existence of specified dataset types. Valid values are 'all' to constrain by all input dataset types in the pipeline, 'off' to not consider dataset type existence as a constraint, or a single dataset type name or comma-separated list of dataset type names.

--show-qgraph-header

Print the headerData for the Quantum Graph to the console.

--mock

Mock pipeline execution.

--mock-failure <LABEL:EXCEPTION:WHERE>

Specifications for tasks that should be configured to fail when mocking execution. This is a colon-separated 3-tuple or 4-tuple, where the first entry is the task label, the second is the fully-qualified exception type (empty for ValueError), and the third is a string (which typically needs to be quoted to be passed as one argument value by the shell) of the form passed to --where, indicating which data IDs should fail. The final optional term is the memory 'required' by the task (with units recognized by astropy), which causes the error to occur only if the 'available' memory (according to ExecutionResources.max_mem) is less than this value. Note that actual memory usage is irrelevant here; this is all mock behavior.

--unmocked-dataset-types <COLLECTION>

Names of input dataset types that should not be mocked.

--coverage

Enable coverage output (requires coverage package).

--cov-report, --no-cov-report

If coverage is enabled, controls whether to produce an HTML coverage report.

--cov-packages <cov_packages>

Python packages to restrict coverage to. If none are provided, runs coverage on all packages.

-b, --butler-config <butler_config>

Required. Location of the gen3 butler/registry config file.

-i, --input <COLLECTION>

Comma-separated names of the input collection(s).

-o, --output <COLL>

Name of the output CHAINED collection. This may either be an existing CHAINED collection to use as both input and output (incompatible with --input), or a new CHAINED collection created to include all inputs (requires --input). In both cases, the collection's children will start with an output RUN collection that directly holds all new datasets (see --output-run).

--output-run <COLL>

Name of the new output RUN collection. If not provided then --output must be provided and a new RUN collection will be created by appending a timestamp to the value passed with --output. If this collection already exists then --extend-run must be passed.

--extend-run

Instead of creating a new RUN collection, insert datasets into either the one given by --output-run (if provided) or the first child collection of --output (which must be of type RUN). This also enables the --skip-existing option when building a graph. When executing a graph, this option skips quanta with all existing outputs.

--replace-run

Before creating a new RUN collection in an existing CHAINED collection, remove the first child collection (which must be of type RUN). This can be used to repeatedly write to the same (parent) collection during development, but it does not delete the datasets associated with the replaced run unless --prune-replaced is also passed. Requires --output, and incompatible with --extend-run.

--prune-replaced <prune_replaced>

Delete the datasets in the collection replaced by --replace-run, either just from the datastore ('unstore') or by removing them and the RUN completely ('purge'). Requires --replace-run.

Options:

unstore | purge

-d, --data-query <QUERY>

User data selection expression.

--rebase

Reset output collection chain if it is inconsistent with --inputs.

--clobber-outputs

Remove outputs of failed quanta from the output run when they would block the execution of new quanta with the same data ID (or assume that this will be done, if just building a QuantumGraph). Does nothing if --extend-run is not passed.

--pdb <pdb>

Post-mortem debugger to launch for exceptions (defaults to pdb if unspecified; requires a tty).

--profile <profile>

Dump cProfile statistics to the named file.

-j, --processes <processes>

Number of processes to use.

--start-method <start_method>

Multiprocessing start method; the default is platform-specific. The fork method is no longer supported; spawn is used instead if fork is selected.

Options:

spawn | fork | forkserver

--timeout <timeout>

Timeout for multiprocessing; maximum wall time (sec).

--fail-fast

Stop processing at the first error; the default is to process as many tasks as possible.

--raise-on-partial-outputs, --no-raise-on-partial-outputs

Consider partial outputs from a task an error instead of a qualified success.

--graph-fixup <graph_fixup>

Name of the class or factory method which makes an instance used for execution graph fixup.

--summary <summary>

Location for storing job summary (JSON file). Note that the structure of this file may not be stable.

--enable-implicit-threading

Do not disable implicit threading use by third-party libraries (e.g. OpenBLAS). Implicit threading is always disabled during execution with multiprocessing.

-n, --cores-per-quantum <cores_per_quantum>

Number of cores available to each quantum when executing. If ‘-j’ is used each subprocess will be allowed to use this number of cores.

--memory-per-quantum <memory_per_quantum>

Memory allocated for each quantum to use when executing. This memory allocation is not enforced by the execution system and is purely advisory. If '-j' is used, each subprocess will be allowed to use this amount of memory. Units are allowed, and the default units for a plain integer are MB. For example: '3GB', '3000MB' and '3000' would all result in the same memory limit. The default is no limit.

--skip-init-writes

Do not write collection-wide 'init output' datasets (e.g. schemas).

--init-only

Do not actually run; just register dataset types and/or save init outputs.

--register-dataset-types

Register DatasetTypes that do not already exist in the Registry.

--no-versions

Do not save or check package versions.

--coverage

Enable coverage output (requires coverage package).

--cov-report, --no-cov-report

If coverage is enabled, controls whether to produce an HTML coverage report.

--cov-packages <cov_packages>

Python packages to restrict coverage to. If none are provided, runs coverage on all packages.

-@, --options-file <options_file>

URI to a YAML file containing overrides of command line options. The YAML should be organized as a hierarchy, with subcommand names at the top level and options for that subcommand below.

Notes:

--task, --delete, --config, --config-file, and --instrument action options can appear multiple times; all values are used, in order left to right.

FILE reads command-line options from the specified file. Data may be distributed among multiple lines (e.g. one option per line). Data after # is treated as a comment and ignored. Blank lines and lines starting with # are ignored.
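
For example, a sketch of an end-to-end run with four worker processes (the repository path, pipeline file, collection names, and query values are illustrative):

pipetask run -b /repo/main -p my_pipeline.yaml -i HSC/defaults -o u/user/run1 -d "instrument = 'HSC' AND visit = 12345" -j 4 --register-dataset-types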

See 'pipetask --help' for more options.

run-qbb

Execute pipeline using Quantum-Backed Butler.

REPO is a URI to a butler configuration (the location of the butler/registry config file) that is used to configure the datastore of the quantum-backed butler.

QGRAPH is the path to a serialized Quantum Graph file.

pipetask run-qbb [OPTIONS] REPO QGRAPH

Options

--config-search-path <PATH>

Additional search paths for butler configuration.

--qgraph-id <qgraph_id>

Quantum graph identifier; if specified, it must match the identifier of the graph loaded from a file. Ignored if the graph is not loaded from a file.

--qgraph-node-id <qgraph_node_id>

Only load a specified set of nodes when the graph is loaded from a file; nodes are identified by UUID values, and one or more comma-separated UUIDs are accepted. By default all nodes are loaded. Ignored if the graph is not loaded from a file.

-j, --processes <processes>

Number of processes to use.

--pdb <pdb>

Post-mortem debugger to launch for exceptions (defaults to pdb if unspecified; requires a tty).

--profile <profile>

Dump cProfile statistics to the named file.

--coverage

Enable coverage output (requires coverage package).

--cov-report, --no-cov-report

If coverage is enabled, controls whether to produce an HTML coverage report.

--cov-packages <cov_packages>

Python packages to restrict coverage to. If none are provided, runs coverage on all packages.

--debug

Enable debugging output using lsstDebug facility (imports debug.py).

--start-method <start_method>

Multiprocessing start method; the default is platform-specific. The fork method is no longer supported; spawn is used instead if fork is selected.

Options:

spawn | fork | forkserver

--timeout <timeout>

Timeout for multiprocessing; maximum wall time (sec).

--fail-fast

Stop processing at the first error; the default is to process as many tasks as possible.

--raise-on-partial-outputs, --no-raise-on-partial-outputs

Consider partial outputs from a task an error instead of a qualified success.

--summary <summary>

Location for storing job summary (JSON file). Note that the structure of this file may not be stable.

--enable-implicit-threading

Do not disable implicit threading use by third-party libraries (e.g. OpenBLAS). Implicit threading is always disabled during execution with multiprocessing.

-n, --cores-per-quantum <cores_per_quantum>

Number of cores available to each quantum when executing. If ‘-j’ is used each subprocess will be allowed to use this number of cores.

--memory-per-quantum <memory_per_quantum>

Memory allocated for each quantum to use when executing. This memory allocation is not enforced by the execution system and is purely advisory. If '-j' is used, each subprocess will be allowed to use this amount of memory. Units are allowed, and the default units for a plain integer are MB. For example: '3GB', '3000MB' and '3000' would all result in the same memory limit. The default is no limit.

Arguments

REPO

Required argument

QGRAPH

Required argument
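
For example, to execute a previously saved graph with eight processes and write a job summary (file names are illustrative):

pipetask run-qbb butler.yaml my_graph.qgraph -j 8 --summary summary.json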

See 'pipetask --help' for more options.

update-graph-run

Update an existing quantum graph with a new output run name and regenerate output dataset IDs.

QGRAPH is the URL to a serialized Quantum Graph file.

RUN is the new RUN collection name for output graph.

OUTPUT_QGRAPH is the URL to store the updated Quantum Graph.

pipetask update-graph-run [OPTIONS] QGRAPH RUN OUTPUT_QGRAPH

Options

--metadata-run-key <metadata_run_key>

Quantum graph metadata key for the name of the output run. Empty string disables update of the metadata. Default value: output_run.

--update-graph-id

Update graph ID with new unique value.

Arguments

QGRAPH

Required argument

RUN

Required argument

OUTPUT_QGRAPH

Required argument
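
For example (file and collection names are illustrative):

pipetask update-graph-run --update-graph-id old_graph.qgraph u/user/new_run new_graph.qgraph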

See 'pipetask --help' for more options.