ConsolidateSourceTableTask

ConsolidateSourceTableTask concatenates per-detector Source Tables (dataset sourceTable) into one per-visit Source Table (dataset sourceTable_visit). It only does I/O, and therefore has no run method. The inputs have already been transformed to the DPDD-specified columns. The task assumes these tables are narrow enough that all tables for a given visit fit in memory at once.

It is the third of three postprocessing tasks to convert a src table to a per-visit Source Table that conforms to the standard data model. The first is WriteSourceTableTask. The second is TransformSourceTableTask.

ConsolidateSourceTableTask is available as a command-line task, consolidateSourceTable.py.

Processing summary

ConsolidateSourceTableTask reads in all detector-level Source Tables (dataset sourceTable) for a given visit, concatenates them, and writes the result out as a visit-level Source Table (dataset sourceTable_visit).
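
The concatenation step can be sketched with pandas. This is a minimal illustration under stated assumptions, not the task's actual implementation: the function name consolidate, the column names, and the in-memory stand-in tables are all hypothetical, and the Butler reads and writes are omitted.

```python
import pandas as pd

# Hypothetical stand-ins for the per-detector sourceTable datasets of one visit.
# Column names are illustrative only and do not reproduce the DPDD schema.
detector_tables = [
    pd.DataFrame(
        {"detector": [0, 0], "psfFlux": [10.5, 11.0]},
        index=pd.Index([1, 2], name="sourceId"),
    ),
    pd.DataFrame(
        {"detector": [1], "psfFlux": [9.8]},
        index=pd.Index([3], name="sourceId"),
    ),
]


def consolidate(tables):
    """Stack per-detector tables row-wise into one per-visit table."""
    # Every input table is held in memory at the same time, which is why
    # the inputs need to be narrow.
    return pd.concat(tables)


visit_table = consolidate(detector_tables)
```

The result keeps one row per source from every detector, with the columns shared by the inputs.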

consolidateSourceTable.py command-line interface

consolidateSourceTable.py REPOPATH [@file [@file2 ...]] [--output OUTPUTREPO | --rerun RERUN] [--id] [other options]

Key arguments:

REPOPATH

The input Butler repository’s URI or file path.

Key options:

--id

The data IDs to process.

See also

See Command-line task argument reference for details and additional options.

Python API summary

from lsst.pipe.tasks.postprocess import ConsolidateSourceTableTask
class ConsolidateSourceTableTask(*, config: Optional[PipelineTaskConfig] = None, log: Optional[Union[logging.Logger, LsstLogAdapter]] = None, initInputs: Optional[Dict[str, Any]] = None, **kwargs)

Concatenate `sourceTable` list into a per-visit `sourceTable_visit`...

attribute config

Access configuration fields and retargetable subtasks.

method run(**kwargs) -> Struct

Run task algorithm on in-memory data...

method runDataRef(dataRefList)

Undocumented...

See also

See the ConsolidateSourceTableTask API reference for complete details.

Butler datasets

When run as the consolidateSourceTable.py command-line task, or directly through the runDataRef method, ConsolidateSourceTableTask obtains datasets from the input Butler data repository and persists outputs to the output Butler data repository. Note that configurations for ConsolidateSourceTableTask, and its subtasks, affect what datasets are persisted and what their content is.

Input datasets

sourceTable
Per-detector, Parquet-formatted Source Table that has been transformed to the DPDD specification.

Output datasets

sourceTable_visit
Per-visit, Parquet-formatted Source Table that has been transformed to the DPDD specification.

Retargetable subtasks

No subtasks.

Configuration fields

connections

Data type
lsst.pipe.base.config.Connections
Field type
ConfigField
Configurations describing the connections of the PipelineTask to datatypes

saveLogOutput

Default
True
Field type
bool Field
Flag to enable/disable saving of log output for a task, enabled by default.

saveMetadata

Default
True
Field type
bool Field
Flag to enable/disable metadata saving for a task, enabled by default.
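
As a hedged illustration of the fields above, a configuration override file might look like the following fragment. It assumes the usual LSST override-file convention, in which the framework binds the task's root configuration object to the name config before executing the file. The values are illustrative, not recommendations.

```python
# Config override fragment, not a standalone script: `config` is assumed to be
# supplied by the framework that loads this file.
config.saveMetadata = False   # do not save task metadata
config.saveLogOutput = False  # do not save log output
```

Both fields default to True, so an override is only needed to turn the corresponding output off.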

Examples

The following command shows an example of how to run the task on an example HSC repository.

consolidateSourceTable.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB --rerun <rerun name> --id visit=30504