.. lsst-task-topic:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

########################
TransformSourceTableTask
########################

``TransformSourceTableTask`` transforms the full-width source table (a ``source`` dataset) to a narrower Source Table (a ``sourceTable`` dataset) as specified by the Data Products Definition Document (DPDD).
It extracts, transforms, and renames columns per a YAML specification, by default the ``Source.yaml`` in obs_package/policy.
Inputs and outputs are both per-detector.
The input is typically a wide table, and the output is a narrow table suitable for concatenation into a per-visit table by ``ConsolidateSourceTableTask``.

It is the second of three postprocessing tasks that convert a ``src`` table to a per-visit Source Table that conforms to the standard data model.
The first is :doc:`lsst.pipe.tasks.postprocess.WriteSourceTableTask`, and the third is :doc:`lsst.pipe.tasks.postprocess.ConsolidateSourceTableTask`.

``TransformSourceTableTask`` is available as a command-line task, :command:`transformSourceTableTask.py`.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-summary:

Processing summary
==================

``TransformSourceTableTask`` performs the following steps:

#. Read in the ``source`` dataset.

#. Generate functors (by instantiating a `lsst.pipe.tasks.functors.CompositeFunctor`) from the YAML specification.
   Apply the functors to the columns.

#. Store the output DataFrame in a parquet-formatted ``sourceTable`` dataset.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-cli:

transformSourceTableTask.py command-line interface
==================================================

.. code-block:: text

   transformSourceTableTask.py REPOPATH [@file [@file2 ...]] [--output OUTPUTREPO | --rerun RERUN] [--id] [other options]

Key arguments:

:option:`REPOPATH`
   The input Butler repository's URI or file path.

Key options:

:option:`--id`
   The data IDs to process.

.. seealso::

   See :ref:`command-line-task-argument-reference` for details and additional options.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-api:

Python API summary
==================

.. lsst-task-api-summary:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler:

Butler datasets
===============

When run as the ``transformSourceTableTask.py`` command-line task, or directly through the `~lsst.pipe.tasks.postprocess.TransformSourceTableTask.runDataRef` method, ``TransformSourceTableTask`` obtains datasets from the input Butler data repository and persists outputs to the output Butler data repository.
Note that the configurations of ``TransformSourceTableTask`` and its subtasks affect which datasets are persisted and what they contain.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-inputs:

Input datasets
--------------

``source``
   Full-width parquet version of the ``src`` catalog.
   It is generated by ``WriteSourceTableTask``.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-outputs:

Output datasets
---------------

``sourceTable``
   Source Table in parquet format (per-detector).

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-subtasks:

Retargetable subtasks
=====================

.. lsst-task-config-subtasks:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-configs:

Configuration fields
====================

.. lsst-task-config-fields:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-examples:

Examples
========

The following command runs the task on an example HSC repository:

.. code-block:: bash

   transformSourceTable.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB --rerun RERUN --id visit=30504 ccd=0..8^10..103

Using the Python API:

.. code-block:: python

   import os
   from lsst.utils import getPackageDir
   from lsst.daf.persistence import Butler
   from lsst.pipe.tasks.postprocess import TransformSourceTableTask

   # Get the input catalog for one detector
   butler = Butler('/path/to/repo')
   dataId = {'visit': 30504, 'ccd': 51}
   source = butler.get('source', dataId=dataId)

   # Set up the task using the obs_subaru Source.yaml specification
   config = TransformSourceTableTask.ConfigClass()
   config.functorFile = os.path.join(getPackageDir("obs_subaru"), 'policy', 'Source.yaml')
   task = TransformSourceTableTask(config=config)
   defaultFunctors = task.getFunctors()

   # Run the task to get a DataFrame
   df = task.run(source, funcs=defaultFunctors, dataId=dataId)

You may also specify your own functors to apply:

.. code-block:: python

   import yaml
   from lsst.pipe.tasks.functors import CompositeFunctor

   # YAML specification mapping output column names to functors and their arguments
   functorYaml = """
   funcs:
       ApFlux:
           functor: LocalNanojansky
           args:
               - slot_CalibFlux_instFlux
               - slot_CalibFlux_instFluxErr
               - base_LocalPhotoCalib
               - base_LocalPhotoCalibErr
       ApFluxErr:
           functor: LocalNanojanskyErr
           args:
               - slot_CalibFlux_instFlux
               - slot_CalibFlux_instFluxErr
               - base_LocalPhotoCalib
               - base_LocalPhotoCalibErr
   """

   # safe_load parses plain YAML without requiring an explicit Loader argument
   exampleFunctors = CompositeFunctor.from_yaml(yaml.safe_load(functorYaml))
   df = task.run(source, funcs=exampleFunctors, dataId=dataId)

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-debug: