WriteSourceTableTask

WriteSourceTableTask converts a table of sources measured on a calexp (the src dataset) to a parquet file. All data is copied without transformation and column names are unchanged, except for the "id" column, which becomes the DataFrame index.

It is the first of three postprocessing tasks that convert a src table to a per-visit Source Table conforming to the standard data model. The second is TransformSourceTableTask, and the third is ConsolidateSourceTableTask.

WriteSourceTableTask is available as a command-line task, writeSourceTable.py.

Processing summary

WriteSourceTableTask reads in the src table, calls its asAstropy method and converts the result to a DataFrame, and writes that DataFrame out in parquet format.
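The conversion described above can be sketched with plain pandas. This is a minimal illustration, not the task's implementation: the helper names `to_source_frame` and `write_parquet` and the sample records are hypothetical, and the list of dictionaries stands in for a real src catalog.

```python
import pandas as pd


def to_source_frame(records):
    """Build a DataFrame from src-like records and move "id" into the index.

    Hypothetical helper: all other columns are kept as-is, without renaming.
    """
    df = pd.DataFrame.from_records(records)
    return df.set_index("id", drop=True)


def write_parquet(df, path):
    """Write the DataFrame in parquet format.

    Needs a parquet engine such as pyarrow to be installed.
    """
    df.to_parquet(path)


# Made-up records standing in for rows of a src catalog.
records = [
    {"id": 101, "coord_ra": 0.5, "coord_dec": -0.1},
    {"id": 102, "coord_ra": 0.6, "coord_dec": -0.2},
]
df = to_source_frame(records)
```

After this step `df.index.name` is `"id"` and the remaining columns keep their original names, which mirrors the behavior described for the task.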

writeSourceTable.py command-line interface

writeSourceTable.py REPOPATH [@file [@file2 ...]] [--output OUTPUTREPO | --rerun RERUN] [--id] [other options]

Key arguments:

REPOPATH
The input Butler repository’s URI or file path.

Key options:

--id
The data IDs to process.

See also

See Command-line task argument reference for details and additional options.

Python API summary

from lsst.pipe.tasks.postprocess import WriteSourceTableTask
class WriteSourceTableTask(*, config: Optional[PipelineTaskConfig] = None, log: Optional[Union[logging.Logger, LsstLogAdapter]] = None, initInputs: Optional[Dict[str, Any]] = None, **kwargs)

Write source table to parquet...

attribute config

Access configuration fields and retargetable subtasks.

method run(catalog, ccdVisitId=None)

Convert `src` catalog to parquet...

method runDataRef(dataRef)

Undocumented...

See also

See the WriteSourceTableTask API reference for complete details.

Butler datasets

When run as the writeSourceTable.py command-line task, or directly through the runDataRef method, WriteSourceTableTask obtains datasets from the input Butler data repository and persists outputs to the output Butler data repository. Note that configurations for WriteSourceTableTask, and its subtasks, affect what datasets are persisted and what their content is.

Input datasets

src
Full depth source catalog (lsst.afw.table) produced by ProcessCcdTask

Output datasets

source
Full depth source catalog (parquet)

Retargetable subtasks

No subtasks.

Configuration fields

connections

Data type
lsst.pipe.base.config.Connections
Field type
ConfigField
Configurations describing the connections of the PipelineTask to datatypes

doApplyExternalPhotoCalib

Default
False
Field type
bool Field
Add local photoCalib columns from the calexp.photoCalib? Should only be set to True when generating Source Tables from older src tables that do not already have local calib columns.

doApplyExternalSkyWcs

Default
False
Field type
bool Field
Add local WCS columns from the calexp.wcs? Should only be set to True when generating Source Tables from older src tables that do not already have local calib columns.

saveLogOutput

Default
True
Field type
bool Field
Flag to enable/disable saving of log output for a task, enabled by default.

saveMetadata

Default
True
Field type
bool Field
Flag to enable/disable metadata saving for a task, enabled by default.
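As a hedged illustration of how these fields might be overridden, the sketch below mimics the contents of a config override file. A `SimpleNamespace` stands in for the task's real config object so the snippet runs without the LSST stack; its defaults are copied from the field list above. In an actual override file only the two assignment lines at the end would appear, and passing that file to the task with `--configfile` assumes the standard command-line task options.

```python
from types import SimpleNamespace

# Stand-in for the task config (hypothetical); defaults taken from the
# field list on this page.
config = SimpleNamespace(
    doApplyExternalPhotoCalib=False,
    doApplyExternalSkyWcs=False,
    saveLogOutput=True,
    saveMetadata=True,
)

# Lines an override file would contain when processing older src tables
# that lack local calibration columns.
config.doApplyExternalPhotoCalib = True
config.doApplyExternalSkyWcs = True
```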

Examples

writeSourceTable.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB --rerun <rerun name> --id visit=30504 ccd=0..8^10..103
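The `ccd=0..8^10..103` value in the example uses the data ID shorthand in which `..` denotes an inclusive range and `^` separates alternatives. The sketch below expands such a value into an explicit list. It is an illustration only, not the parser used by the command-line task: the function name `expand_id_values` is hypothetical, it handles integer values only, and the optional `:stride` suffix is included as an assumption about the syntax.

```python
def expand_id_values(spec):
    """Expand a data ID value such as "0..8^10..103" into a list of ints.

    Hypothetical helper: "^" separates alternatives, "a..b" is an inclusive
    range, and "a..b:s" (assumed) is an inclusive range with stride s.
    """
    values = []
    for part in spec.split("^"):
        if ".." in part:
            bounds, _, stride = part.partition(":")
            start, stop = bounds.split("..")
            step = int(stride) if stride else 1
            # range() excludes its end point, so add 1 to include `stop`.
            values.extend(range(int(start), int(stop) + 1, step))
        else:
            values.append(int(part))
    return values


ccds = expand_id_values("0..8^10..103")
```

Under this reading the example command selects CCDs 0 through 8 and 10 through 103 of visit 30504, skipping CCD 9.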