QuantumGraphSkeleton

class lsst.pipe.base.quantum_graph_skeleton.QuantumGraphSkeleton(task_labels: Iterable[str])

Bases: object

An under-construction quantum graph.

QuantumGraphSkeleton is intended for use inside QuantumGraphBuilder and its subclasses.

Parameters:
task_labels : Iterable [ str ]

The labels of all tasks whose quanta may be included in the graph, in topological order.

Notes

QuantumGraphSkeleton models a bipartite version of the quantum graph, in which both quanta and datasets are represented as nodes and each type of node only has edges to the other type.

Square-bracket (getitem) indexing returns a mutable mapping of a node’s flexible attributes.

The details of the QuantumGraphSkeleton API (e.g. which operations operate on multiple nodes vs. a single node) are set by what’s actually needed by current quantum graph generation algorithms. New variants can be added as needed, but adding all operations that might be useful for some future algorithm seems premature.
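To make the bipartite model and the square-bracket indexing described above more concrete, here is a minimal standard-library sketch. Every name in it (`ToySkeleton`, `ToyQuantumKey`, `ToyDatasetKey`, `add_node`, `add_edge`) is hypothetical and chosen for illustration; it is not how QuantumGraphSkeleton is implemented, only one way the same ideas could be modeled.

```python
from typing import Any, NamedTuple


class ToyQuantumKey(NamedTuple):
    """Hypothetical stand-in for a quantum node identifier."""

    task_label: str
    data_id: tuple


class ToyDatasetKey(NamedTuple):
    """Hypothetical stand-in for a dataset node identifier."""

    dataset_type_name: str
    data_id: tuple


class ToySkeleton:
    """Toy bipartite graph: every edge joins one dataset and one quantum."""

    def __init__(self) -> None:
        self._attrs: dict[Any, dict[str, Any]] = {}  # node -> flexible attributes
        self._edges: set[tuple[Any, Any]] = set()  # (source, destination) pairs

    def add_node(self, key: Any, **attrs: Any) -> None:
        self._attrs.setdefault(key, {}).update(attrs)

    def add_edge(self, source: Any, destination: Any) -> None:
        # Enforce bipartiteness: exactly one endpoint must be a dataset node.
        if isinstance(source, ToyDatasetKey) == isinstance(destination, ToyDatasetKey):
            raise ValueError("edges must connect a dataset node to a quantum node")
        self._edges.add((source, destination))

    def __getitem__(self, key: Any) -> dict[str, Any]:
        # Return the live attribute mapping so callers can mutate it in place.
        return self._attrs[key]

    @property
    def n_nodes(self) -> int:
        return len(self._attrs)

    @property
    def n_edges(self) -> int:
        return len(self._edges)


skeleton = ToySkeleton()
raw = ToyDatasetKey("raw", (("visit", 1),))
isr = ToyQuantumKey("isr", (("visit", 1),))
skeleton.add_node(raw)
skeleton.add_node(isr)
skeleton.add_edge(raw, isr)  # dataset -> quantum that consumes it
skeleton[isr]["note"] = "example"  # mutate attributes through indexing
```

In this toy, attempting `skeleton.add_edge(raw, raw)` raises `ValueError`, which mirrors the rule that each type of node only has edges to the other type.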

Attributes Summary

global_init_outputs

The set of dataset nodes that are not associated with any task.

n_edges

The total number of edges.

n_nodes

The total number of nodes of all types.

Methods Summary

add_dataset_node(parent_dataset_type_name, ...)

Add a new node representing a dataset.

add_input_edge(task_key, dataset_key[, ...])

Add an edge connecting a dataset to a quantum that consumes it.

add_input_edges(task_key, dataset_keys)

Add edges connecting datasets to a quantum that consumes them.

add_output_edge(task_key, dataset_key)

Add an edge connecting a dataset to the quantum that produces it.

add_prerequisite_node(ref, **attrs)

Add a new node representing a prerequisite input dataset.

add_quantum_node(task_label, data_id, **attrs)

Add a new node representing a quantum.

discard_output_in_the_way(key)

Drop any DatasetRef associated with this node in the output RUN collection.

extract_overall_inputs()

Find overall input datasets.

get_dataset_ref(key)

Return the DatasetRef associated with the given node.

get_output_for_skip(key)

Return the DatasetRef associated with the given node in a collection where it could lead to a quantum being skipped.

get_output_in_the_way(key)

Return the DatasetRef associated with the given node in the output RUN collection.

get_quanta(task_label)

Return the quanta for the given task label.

get_task_init_node(task_label)

Return the graph node that represents a task's initialization.

has_task(task_label)

Test whether the given task is in this skeleton.

iter_all_quanta()

Iterate over all quanta from any task, in topological (but otherwise unspecified) order.

iter_inputs_of(quantum_key)

Iterate over the datasets consumed by the given quantum.

iter_outputs_of(quantum_key)

Iterate over the datasets produced by the given quantum.

remove_dataset_nodes(keys)

Remove nodes representing datasets.

remove_input_edges(task_key, dataset_keys)

Remove edges connecting datasets to a quantum that consumes them.

remove_orphan_datasets()

Remove any dataset nodes that do not have any edges.

remove_quantum_node(key, remove_outputs)

Remove a node representing a quantum.

remove_task(task_label)

Fully remove a task from the skeleton.

set_data_id(key, data_id)

Set the data ID associated with a node.

set_dataset_ref(ref[, key])

Associate a dataset node with a DatasetRef instance.

set_output_for_skip(ref)

Associate a dataset node with a DatasetRef that represents an existing output in a collection where such outputs can cause a quantum to be skipped.

set_output_in_the_way(ref)

Associate a dataset node with a DatasetRef that represents an existing output in the output RUN collection.

update(other)

Copy all nodes from other to self.

Attributes Documentation

global_init_outputs

The set of dataset nodes that are not associated with any task.

n_edges

The total number of edges.

n_nodes

The total number of nodes of all types.

Methods Documentation

add_dataset_node(parent_dataset_type_name: str, data_id: DataCoordinate, is_global_init_output: bool = False, **attrs: Any) → DatasetKey

Add a new node representing a dataset.

Parameters:
parent_dataset_type_name : str

Name of the parent dataset type.

data_id : DataCoordinate

The dataset data ID.

is_global_init_output : bool, optional

Whether this dataset is a global init output.

**attrs : Any

Additional attributes for the node.

add_input_edge(task_key: QuantumKey | TaskInitKey, dataset_key: DatasetKey | PrerequisiteDatasetKey, ignore_unrecognized_quanta: bool = False) → bool

Add an edge connecting a dataset to a quantum that consumes it.

Parameters:
task_key : QuantumKey or TaskInitKey

Identifier for the quantum node.

dataset_key : DatasetKey or PrerequisiteDatasetKey

Identifier for the dataset node.

ignore_unrecognized_quanta : bool, optional

If True, do nothing if the quantum node is not already present. If False, the quantum node is assumed to be present.

Returns:
added : bool

True if an edge was actually added, False if the quantum was not recognized and the edge was not added as a result.

Notes

Dataset nodes that are not already present will be created.
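The sketch below illustrates the return-value contract of add_input_edge on a toy graph. It assumes, as the parameter name and the description of the return value suggest, that passing `ignore_unrecognized_quanta=True` is what makes the call skip an unknown quantum. The function `toy_add_input_edge` and the string node keys are hypothetical and do not reflect the library's internals.

```python
def toy_add_input_edge(
    nodes: set[str],
    edges: set[tuple[str, str]],
    task_key: str,
    dataset_key: str,
    ignore_unrecognized_quanta: bool = False,
) -> bool:
    """Toy version: record a dataset -> quantum edge, optionally skipping unknown quanta."""
    if ignore_unrecognized_quanta and task_key not in nodes:
        return False  # quantum not recognized, so nothing is added
    nodes.add(dataset_key)  # dataset nodes are created on demand
    edges.add((dataset_key, task_key))
    return True


nodes = {"isr@1"}  # only one quantum node exists so far
edges: set[tuple[str, str]] = set()

added_known = toy_add_input_edge(
    nodes, edges, "isr@1", "raw@1", ignore_unrecognized_quanta=True
)
added_unknown = toy_add_input_edge(
    nodes, edges, "isr@2", "raw@2", ignore_unrecognized_quanta=True
)
```

Here `added_known` is `True` and `added_unknown` is `False`; the second call leaves both `nodes` and `edges` untouched, so no dataset node is created for an edge that was never added.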

add_input_edges(task_key: QuantumKey | TaskInitKey, dataset_keys: Iterable[DatasetKey | PrerequisiteDatasetKey]) → None

Add edges connecting datasets to a quantum that consumes them.

Parameters:
task_key : QuantumKey or TaskInitKey

Quantum to connect.

dataset_keys : Iterable of DatasetKey or PrerequisiteDatasetKey

Datasets to join to the quantum.

Notes

This must only be called if the task node has already been added. Use add_input_edge if this cannot be assumed.

Dataset nodes that are not already present will be created.

add_output_edge(task_key: QuantumKey | TaskInitKey, dataset_key: DatasetKey) → None

Add an edge connecting a dataset to the quantum that produces it.

Parameters:
task_key : QuantumKey or TaskInitKey

Identifier for the quantum node. Must identify a node already present in the graph.

dataset_key : DatasetKey

Identifier for the dataset node. Must identify a node already present in the graph.

add_prerequisite_node(ref: DatasetRef, **attrs: Any) → PrerequisiteDatasetKey

Add a new node representing a prerequisite input dataset.

Parameters:
ref : DatasetRef

The dataset ref of the prerequisite.

**attrs : Any

Additional attributes for the node.

Notes

This automatically associates ref with the new node (see set_dataset_ref), since prerequisites are always overall inputs.

add_quantum_node(task_label: str, data_id: DataCoordinate, **attrs: Any) → QuantumKey

Add a new node representing a quantum.

Parameters:
task_label : str

Name of task.

data_id : DataCoordinate

The data ID of the quantum.

**attrs : Any

Additional attributes.

discard_output_in_the_way(key: DatasetKey) → None

Drop any DatasetRef associated with this node in the output RUN collection.

Does nothing if there is no such DatasetRef.

Parameters:
key : DatasetKey

Identifier for the graph node.

extract_overall_inputs() → dict[DatasetKey | PrerequisiteDatasetKey, DatasetRef]

Find overall input datasets.

Returns:
datasets : dict [ DatasetKey or PrerequisiteDatasetKey, DatasetRef ]

Overall-input datasets, including prerequisites and init-inputs.
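As a rough illustration of what "overall input" means in a bipartite graph like this one, the sketch below treats a dataset as an overall input when no quantum produces it. The edge representation and the helper name `toy_overall_inputs` are hypothetical and use only the standard library. This is not the library's implementation, and it returns bare keys rather than the key-to-DatasetRef mapping described above.

```python
def toy_overall_inputs(
    dataset_nodes: set[str], edges: set[tuple[str, str]]
) -> set[str]:
    """Return the dataset nodes that no edge points at, i.e. that have no producer."""
    produced = {
        destination for _, destination in edges if destination in dataset_nodes
    }
    return dataset_nodes - produced


datasets = {"raw", "bias", "postISR", "calexp"}
# Edges run dataset -> consuming quantum and quantum -> produced dataset.
edges = {
    ("raw", "isr@1"),
    ("bias", "isr@1"),
    ("isr@1", "postISR"),
    ("postISR", "calibrate@1"),
    ("calibrate@1", "calexp"),
}
overall_inputs = toy_overall_inputs(datasets, edges)
```

With these toy edges, `overall_inputs` contains only `"raw"` and `"bias"`, because `"postISR"` and `"calexp"` each have a producing quantum.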

get_dataset_ref(key: DatasetKey | PrerequisiteDatasetKey) → DatasetRef | None

Return the DatasetRef associated with the given node.

This does not return “output for skip” and “output in the way” datasets.

Parameters:
key : DatasetKey or PrerequisiteDatasetKey

Identifier for the graph node.

Returns:
ref : DatasetRef or None

Dataset reference associated with the node.

get_output_for_skip(key: DatasetKey) → DatasetRef | None

Return the DatasetRef associated with the given node in a collection where it could lead to a quantum being skipped.

Parameters:
key : DatasetKey

Identifier for the graph node.

Returns:
ref : DatasetRef or None

Dataset reference associated with the node.

get_output_in_the_way(key: DatasetKey) → DatasetRef | None

Return the DatasetRef associated with the given node in the output RUN collection.

Parameters:
key : DatasetKey

Identifier for the graph node.

Returns:
ref : DatasetRef or None

Dataset reference associated with the node.
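The three getters above (get_dataset_ref, get_output_for_skip, get_output_in_the_way) each return a different DatasetRef for the same node, or None. One way to picture this is as three independent slots in the node's attribute mapping, as in the sketch below. The slot names, the string stand-ins for DatasetRef, and the `toy_get` and `toy_discard` helpers are all assumptions made for illustration; the real attribute names and storage may differ.

```python
from typing import Any, Optional

# Hypothetical slot names; the real attribute names may differ.
REF = "ref"
FOR_SKIP = "output_for_skip"
IN_THE_WAY = "output_in_the_way"

# One dataset node with an initially empty attribute mapping.
node_attrs: dict[str, dict[str, Any]] = {"calexp@1": {}}


def toy_get(key: str, slot: str) -> Optional[str]:
    """Return whatever is stored in the slot, or None when nothing was set."""
    return node_attrs[key].get(slot)


def toy_discard(key: str, slot: str) -> None:
    """Drop the slot's value; do nothing if the slot is empty."""
    node_attrs[key].pop(slot, None)


# Record an existing output in the (toy) output RUN collection only.
node_attrs["calexp@1"][IN_THE_WAY] = "calexp@1 in run/output"

ref_before = toy_get("calexp@1", REF)
in_the_way_before = toy_get("calexp@1", IN_THE_WAY)

toy_discard("calexp@1", IN_THE_WAY)
toy_discard("calexp@1", IN_THE_WAY)  # second call is a no-op
in_the_way_after = toy_get("calexp@1", IN_THE_WAY)
```

Setting the "in the way" slot leaves the main slot empty, so `ref_before` is `None`, and discarding twice is harmless, matching the "does nothing if there is no such DatasetRef" behavior documented for discard_output_in_the_way.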

get_quanta(task_label: str) → Set[QuantumKey]

Return the quanta for the given task label.

Parameters:
task_label : str

Label for the task.

Returns:
quanta : Set [ QuantumKey ]

A set-like object with the identifiers of all quanta for the given task. The skeleton object’s set of quanta must not be modified while iterating over this container; make a copy if mutation during iteration is necessary.
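The copy-before-mutating advice can be illustrated with plain Python sets. In this sketch, `toy_get_quanta` and `toy_remove_quantum` are hypothetical stand-ins that share one underlying set, which is the situation the warning above describes; the tuple keys are placeholders for QuantumKey.

```python
# Toy storage: task label -> live set of quantum identifiers.
quanta_by_task: dict[str, set[tuple[str, int]]] = {
    "isr": {("isr", 1), ("isr", 2), ("isr", 3)},
}


def toy_get_quanta(task_label: str) -> set[tuple[str, int]]:
    """Return the live set (not a copy), like a set-like view would."""
    return quanta_by_task[task_label]


def toy_remove_quantum(key: tuple[str, int]) -> None:
    """Remove a quantum from the shared storage."""
    quanta_by_task[key[0]].discard(key)


# list(...) snapshots the identifiers, so removals inside the loop are safe.
for key in list(toy_get_quanta("isr")):
    if key[1] % 2 == 1:
        toy_remove_quantum(key)

remaining = toy_get_quanta("isr")
```

Iterating over `toy_get_quanta("isr")` directly while removing elements would raise `RuntimeError` for a built-in set, which is why the loop takes a snapshot first.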

get_task_init_node(task_label: str) → TaskInitKey

Return the graph node that represents a task’s initialization.

Parameters:
task_label : str

The task label to use.

Returns:
node : TaskInitKey

The graph node representing this task’s initialization.

has_task(task_label: str) → bool

Test whether the given task is in this skeleton.

Tasks are only added to the skeleton at initialization, but may be removed by remove_task if they end up having no quanta.

Parameters:
task_label : str

Task to check for.

Returns:
has : bool

True if the task is in this skeleton.

iter_all_quanta() → Iterator[QuantumKey]

Iterate over all quanta from any task, in topological (but otherwise unspecified) order.
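One way an ordering that is "topological but otherwise unspecified" can arise is by walking tasks in their (already topological) label order and yielding each task's quanta in arbitrary set order. The sketch below shows that idea with hypothetical task labels and tuple keys; it is an assumption about how such an iterator could work, not a description of the library's code.

```python
from collections.abc import Iterator

# Task labels, assumed to be supplied in topological order.
task_labels = ["isr", "calibrate", "makeWarp"]

# Toy storage: each task's quanta live in an unordered set.
quanta_by_task: dict[str, set[tuple[str, int]]] = {
    "isr": {("isr", 1), ("isr", 2)},
    "calibrate": {("calibrate", 1), ("calibrate", 2)},
    "makeWarp": {("makeWarp", 10)},
}


def toy_iter_all_quanta() -> Iterator[tuple[str, int]]:
    """Yield every quantum, task by task, in task-label order."""
    for label in task_labels:
        # Order within one task is whatever the set happens to give.
        yield from quanta_by_task[label]


visited_labels = [label for label, _ in toy_iter_all_quanta()]
```

Only the task-level order is guaranteed here: all `"isr"` quanta come before any `"calibrate"` quantum, but the two `"isr"` quanta may appear in either order.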

iter_inputs_of(quantum_key: QuantumKey | TaskInitKey) → Iterator[DatasetKey | PrerequisiteDatasetKey]

Iterate over the datasets consumed by the given quantum.

Parameters:
quantum_key : QuantumKey or TaskInitKey

Quantum to iterate over.

Returns:
datasets : Iterator of DatasetKey or PrerequisiteDatasetKey

Datasets consumed by the given quantum.

iter_outputs_of(quantum_key: QuantumKey | TaskInitKey) → Iterator[DatasetKey]

Iterate over the datasets produced by the given quantum.

Parameters:
quantum_key : QuantumKey or TaskInitKey

Quantum to iterate over.

Returns:
datasets : Iterator of DatasetKey

Datasets produced by the given quantum.

remove_dataset_nodes(keys: Iterable[DatasetKey | PrerequisiteDatasetKey]) → None

Remove nodes representing datasets.

Parameters:
keys : Iterable of DatasetKey or PrerequisiteDatasetKey

Nodes to remove.

remove_input_edges(task_key: QuantumKey | TaskInitKey, dataset_keys: Iterable[DatasetKey | PrerequisiteDatasetKey]) → None

Remove edges connecting datasets to a quantum that consumes them.

Parameters:
task_key : QuantumKey or TaskInitKey

Quantum to disconnect.

dataset_keys : Iterable of DatasetKey or PrerequisiteDatasetKey

Datasets to remove from the quantum.

remove_orphan_datasets() → None

Remove any dataset nodes that do not have any edges.
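A dataset node is an orphan when no edge touches it at all, neither from a producer nor to a consumer. The sketch below shows that test on a toy edge set; `toy_remove_orphan_datasets` and the string node keys are hypothetical, and the toy returns the removed keys only to make the effect easy to inspect (the documented method returns None).

```python
def toy_remove_orphan_datasets(
    dataset_nodes: set[str], edges: set[tuple[str, str]]
) -> set[str]:
    """Remove dataset nodes that appear in no edge; return what was removed."""
    connected = {node for edge in edges for node in edge}
    orphans = dataset_nodes - connected
    dataset_nodes.difference_update(orphans)  # mutate the caller's set in place
    return orphans


datasets = {"raw@1", "postISR@1", "stale@9"}
edges = {("raw@1", "isr@1"), ("isr@1", "postISR@1")}
removed = toy_remove_orphan_datasets(datasets, edges)
```

Here `"stale@9"` is removed because it has neither a producer nor a consumer, while `"raw@1"` survives as an overall input and `"postISR@1"` survives as an output.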

remove_quantum_node(key: QuantumKey, remove_outputs: bool) → None

Remove a node representing a quantum.

Parameters:
key : QuantumKey

Identifier for the node.

remove_outputs : bool

If True, also remove all dataset nodes produced by this quantum. If False, any such dataset nodes will become overall inputs.
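The effect of remove_outputs can be seen on a small toy graph. In the sketch below, `toy_remove_quantum_node`, `build_toy_graph`, and the string node keys are hypothetical; the point is only to show why an output that is kept after its producer is removed ends up with no producer, i.e. as an overall input.

```python
def toy_remove_quantum_node(
    nodes: set[str], edges: set[tuple[str, str]], key: str, remove_outputs: bool
) -> None:
    """Remove a quantum node and, optionally, the dataset nodes it produces."""
    outputs = {dst for src, dst in edges if src == key}
    doomed = {key} | outputs if remove_outputs else {key}
    nodes.difference_update(doomed)
    # Drop every edge that touches a removed node.
    edges.difference_update({(s, d) for s, d in edges if s in doomed or d in doomed})


def build_toy_graph() -> tuple[set[str], set[tuple[str, str]]]:
    """Build raw@1 -> isr@1 -> postISR@1 -> calibrate@1."""
    nodes = {"raw@1", "isr@1", "postISR@1", "calibrate@1"}
    edges = {
        ("raw@1", "isr@1"),
        ("isr@1", "postISR@1"),
        ("postISR@1", "calibrate@1"),
    }
    return nodes, edges


kept_nodes, kept_edges = build_toy_graph()
toy_remove_quantum_node(kept_nodes, kept_edges, "isr@1", remove_outputs=False)

dropped_nodes, dropped_edges = build_toy_graph()
toy_remove_quantum_node(dropped_nodes, dropped_edges, "isr@1", remove_outputs=True)
```

With `remove_outputs=False`, `"postISR@1"` stays in the graph with only its consuming edge, so it now looks like an overall input of `"calibrate@1"`. With `remove_outputs=True`, the dataset and its edge to `"calibrate@1"` are gone as well.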

remove_task(task_label: str) → None

Fully remove a task from the skeleton.

All init-output datasets and quanta for the task must already have been removed.

Parameters:
task_label : str

Name of task to remove.

set_data_id(key: QuantumKey | TaskInitKey | DatasetKey | PrerequisiteDatasetKey, data_id: DataCoordinate) → None

Set the data ID associated with a node.

This also updates the data ID in any DatasetRef objects associated with the node via set_dataset_ref, set_output_for_skip, or set_output_in_the_way. The new data ID is assumed to be an expanded version of the original data ID.

Parameters:
key : QuantumKey, TaskInitKey, DatasetKey, or PrerequisiteDatasetKey

Identifier for the graph node.

data_id : DataCoordinate

Data ID for the node.

set_dataset_ref(ref: DatasetRef, key: DatasetKey | PrerequisiteDatasetKey | None = None) → None

Associate a dataset node with a DatasetRef instance.

Parameters:
ref : DatasetRef

DatasetRef to associate with the node.

key : DatasetKey or PrerequisiteDatasetKey, optional

Identifier for the graph node. If not provided, a DatasetKey is constructed from the dataset type name and data ID of ref.
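The default-key behavior described above can be sketched with two small named tuples. `ToyRef`, `ToyDatasetKey`, `refs_by_key`, and `toy_set_dataset_ref` are hypothetical stand-ins for DatasetRef, DatasetKey, and the skeleton's storage; the field names are assumptions chosen to mirror the wording of the parameter description.

```python
from typing import NamedTuple, Optional


class ToyRef(NamedTuple):
    """Hypothetical stand-in for a DatasetRef."""

    dataset_type_name: str
    data_id: tuple[tuple[str, int], ...]
    run: str


class ToyDatasetKey(NamedTuple):
    """Hypothetical stand-in for a DatasetKey."""

    parent_dataset_type_name: str
    data_id: tuple[tuple[str, int], ...]


refs_by_key: dict[ToyDatasetKey, ToyRef] = {}


def toy_set_dataset_ref(ref: ToyRef, key: Optional[ToyDatasetKey] = None) -> None:
    """Associate ref with a node, deriving the key from ref when none is given."""
    if key is None:
        key = ToyDatasetKey(ref.dataset_type_name, ref.data_id)
    refs_by_key[key] = ref


ref = ToyRef("calexp", (("visit", 1), ("detector", 4)), "run/a")
toy_set_dataset_ref(ref)  # key derived from the ref itself
```

After this call, the ref is stored under a key built from `"calexp"` and the ref's data ID. Passing an explicit key would be needed in this toy whenever the node's key cannot be derived that way, for example a prerequisite key.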

set_output_for_skip(ref: DatasetRef) → None

Associate a dataset node with a DatasetRef that represents an existing output in a collection where such outputs can cause a quantum to be skipped.

Parameters:
ref : DatasetRef

DatasetRef to associate with the node.

set_output_in_the_way(ref: DatasetRef) → None

Associate a dataset node with a DatasetRef that represents an existing output in the output RUN collection.

Parameters:
ref : DatasetRef

DatasetRef to associate with the node.

update(other: QuantumGraphSkeleton) → None

Copy all nodes from other to self.

Parameters:
other : QuantumGraphSkeleton

Source of nodes. The tasks in other must be a subset of the tasks in self (this method is expected to be used to populate a skeleton for a full graph from independent-subgraph skeletons).
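The intended usage pattern, building skeletons for independent subgraphs and then merging them into one, can be sketched as follows. `ToySkeleton` here is a hypothetical, much-reduced model that tracks only task labels and node attributes, and raising `ValueError` when the subset requirement is violated is an assumption of this toy, not documented behavior.

```python
from collections.abc import Iterable
from typing import Any


class ToySkeleton:
    """Toy skeleton holding task labels and a node -> attributes mapping."""

    def __init__(self, task_labels: Iterable[str]) -> None:
        self.task_labels = list(task_labels)
        self.nodes: dict[str, dict[str, Any]] = {}

    def update(self, other: "ToySkeleton") -> None:
        """Copy all nodes (and their attributes) from other into self."""
        if not set(other.task_labels) <= set(self.task_labels):
            # Assumption of this toy: reject tasks that self does not know.
            raise ValueError("tasks in other must be a subset of tasks in self")
        for key, attrs in other.nodes.items():
            self.nodes.setdefault(key, {}).update(attrs)


# Two independent subgraph skeletons, one per task.
part_a = ToySkeleton(["isr"])
part_a.nodes["isr@1"] = {"visit": 1}
part_b = ToySkeleton(["calibrate"])
part_b.nodes["calibrate@1"] = {"visit": 1}

# The full skeleton knows about every task and absorbs both parts.
full = ToySkeleton(["isr", "calibrate"])
for part in (part_a, part_b):
    full.update(part)
```

After the loop, `full.nodes` holds both `"isr@1"` and `"calibrate@1"`, while the part skeletons are left unchanged.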