.. lsst-task-topic:: lsst.pipe.tasks.postprocess.TransformSourceTableTask
########################
TransformSourceTableTask
########################
``TransformSourceTableTask`` transforms the full-width source table
(a ``source`` dataset) to a narrower Source Table (a ``sourceTable`` dataset)
as specified by the Data Products Definition Document (`DPDD <https://lse-163.lsst.io>`_).
It extracts, transforms, and renames columns according to a YAML specification, by default the
``Source.yaml`` in obs_package/policy. Inputs and outputs are both per-detector.
The input is typically a wide table, and the output is a narrow table suitable for
concatenation into a per-visit table by ``ConsolidateSourceTableTask``.
It is the second of three postprocessing tasks that convert a ``src`` table to a
per-visit Source Table that conforms to the standard data model. The first is
:doc:`lsst.pipe.tasks.postprocess.WriteSourceTableTask`, and the third is :doc:`lsst.pipe.tasks.postprocess.ConsolidateSourceTableTask`.
``TransformSourceTableTask`` is available as a
:ref:`command-line task <pipe-tasks-command-line-tasks>`,
:command:`transformSourceTableTask.py`.
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-summary:
Processing summary
==================
``TransformSourceTableTask`` performs the following steps:

#. Read in the ``source`` dataset.
#. Generate functors (by instantiating a `lsst.pipe.tasks.functors.CompositeFunctor`)
   from the YAML specification, and apply the functors to the columns.
#. Store the output DataFrame in the parquet-formatted ``sourceTable`` dataset.
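
The steps above can be sketched without the LSST stack. This is a toy illustration under stated assumptions, not the task's implementation: the table is a plain dict of column lists rather than a DataFrame, the column values are made up, and ``copy_column``, ``scale_column``, and ``apply_functors`` are hypothetical helpers that stand in for real functors and for ``CompositeFunctor``.

```python
# Toy stand-in for a full-width per-detector table: column name -> values.
# The values are invented for illustration only.
wide_table = {
    "id": [1, 2, 3],
    "base_SdssCentroid_x": [10.5, 20.25, 30.0],
    "slot_CalibFlux_instFlux": [100.0, 200.0, 300.0],
    "unused_debug_column": [0, 0, 0],
}


def copy_column(name):
    """Hypothetical functor: pass one input column through unchanged."""
    return lambda table: list(table[name])


def scale_column(name, factor):
    """Hypothetical functor: multiply one input column by a constant."""
    return lambda table: [value * factor for value in table[name]]


# The specification maps each output column name to the functor that
# computes it. In the real task this mapping comes from a YAML file.
functors = {
    "sourceId": copy_column("id"),
    "x": copy_column("base_SdssCentroid_x"),
    "apFlux": scale_column("slot_CalibFlux_instFlux", 0.5),
}


def apply_functors(table, functors):
    """Build the narrow table by evaluating every functor on the wide table."""
    return {out_name: func(table) for out_name, func in functors.items()}


narrow_table = apply_functors(wide_table, functors)
print(sorted(narrow_table))
```

Only the columns named in the specification survive, which is what makes the output narrower than the input.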
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-cli:
transformSourceTableTask.py command-line interface
==================================================
.. code-block:: text
transformSourceTableTask.py REPOPATH [@file [@file2 ...]] [--output OUTPUTREPO | --rerun RERUN] [--id] [other options]
Key arguments:
:option:`REPOPATH`
The input Butler repository's URI or file path.
Key options:
:option:`--id`
The data IDs to process.
.. seealso::
See :ref:`command-line-task-argument-reference` for details and additional options.
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-api:
Python API summary
==================
.. lsst-task-api-summary:: lsst.pipe.tasks.postprocess.TransformSourceTableTask
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler:
Butler datasets
===============
When run as the ``transformSourceTableTask.py`` command-line task, or directly through the `~lsst.pipe.tasks.postprocess.TransformSourceTableTask.runDataRef` method, ``TransformSourceTableTask`` obtains datasets from the input Butler data repository and persists outputs to the output Butler data repository.
Note that configurations for ``TransformSourceTableTask``, and its subtasks, affect what datasets are persisted and what their content is.
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-inputs:
Input datasets
--------------
``source``
Full-width parquet version of the ``src`` catalog.
It is generated by ``WriteSourceTableTask``.
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-outputs:
Output datasets
---------------
``sourceTable``
Source Table in parquet format (per-detector).
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-subtasks:
Retargetable subtasks
=====================
.. lsst-task-config-subtasks:: lsst.pipe.tasks.postprocess.TransformSourceTableTask
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-configs:
Configuration fields
====================
.. lsst-task-config-fields:: lsst.pipe.tasks.postprocess.TransformSourceTableTask
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-examples:
Examples
========
The following command runs the task on an example HSC repository:
.. code-block:: bash
transformSourceTable.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB --rerun <rerun name> --id visit=30504 ccd=0..8^10..103
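
The ``--id`` value in the command above packs many detectors into one expression. The sketch below shows how ``ccd=0..8^10..103`` expands, assuming that ``..`` denotes an inclusive integer range and ``^`` separates alternatives. ``expand_id_values`` is a hypothetical helper written for this illustration, not part of the LSST argument parser, and it ignores any other syntax the parser may accept.

```python
def expand_id_values(spec):
    """Expand a value expression such as '0..8^10..103' into a list of ints.

    Assumes '..' is an inclusive range and '^' joins alternatives.
    """
    values = []
    for part in spec.split("^"):
        if ".." in part:
            first, last = part.split("..")
            values.extend(range(int(first), int(last) + 1))
        else:
            values.append(int(part))
    return values


ccds = expand_id_values("0..8^10..103")
print(len(ccds), ccds[:3], ccds[-3:])
```

Under these assumptions the expression selects every ``ccd`` from 0 to 103 except 9.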
Using the Python API:
.. code-block:: python

    import os
    from lsst.utils import getPackageDir
    from lsst.daf.persistence import Butler
    from lsst.pipe.tasks.postprocess import TransformSourceTableTask

    # Get the input catalog for one detector.
    butler = Butler('/path/to/repo')
    dataId = {'visit': 30504, 'ccd': 51}
    source = butler.get('source', dataId=dataId)

    # Set up the task using the obs_subaru Source.yaml specification.
    config = TransformSourceTableTask.ConfigClass()
    config.functorFile = os.path.join(getPackageDir("obs_subaru"), 'policy', 'Source.yaml')
    task = TransformSourceTableTask(config=config)
    defaultFunctors = task.getFunctors()

    # Run the task to get a DataFrame.
    df = task.run(source, funcs=defaultFunctors, dataId=dataId)
You may also specify your own functors to apply:
.. code-block:: python

    import yaml
    from lsst.pipe.tasks.functors import CompositeFunctor

    # YAML specification; the nesting under ``funcs`` is significant.
    functorSpec = """
    funcs:
        ApFlux:
            functor: LocalNanojansky
            args:
                - slot_CalibFlux_instFlux
                - slot_CalibFlux_instFluxErr
                - base_LocalPhotoCalib
                - base_LocalPhotoCalibErr
        ApFluxErr:
            functor: LocalNanojanskyErr
            args:
                - slot_CalibFlux_instFlux
                - slot_CalibFlux_instFluxErr
                - base_LocalPhotoCalib
                - base_LocalPhotoCalibErr
    """

    # safe_load avoids constructing arbitrary Python objects from the YAML.
    exampleFunctors = CompositeFunctor.from_yaml(yaml.safe_load(functorSpec))

    # Reuses ``task``, ``source``, and ``dataId`` from the previous example.
    df = task.run(source, funcs=exampleFunctors, dataId=dataId)
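
To make the two functors in this specification more concrete, here is a standalone sketch of the arithmetic they are understood to perform. It assumes that ``LocalNanojansky`` multiplies the instrumental flux by the local photometric calibration and that ``LocalNanojanskyErr`` combines the two relative uncertainties in quadrature; check the ``lsst.pipe.tasks.functors`` source before relying on this. The function names and input values below are hypothetical.

```python
import math


def local_nanojansky(inst_flux, local_calib):
    """Assumed calibration: flux in nJy = instrumental flux * local calibration."""
    return inst_flux * local_calib


def local_nanojansky_err(inst_flux, inst_flux_err, local_calib, local_calib_err):
    """Assumed error propagation: quadrature sum of the two error terms."""
    return math.hypot(inst_flux_err * local_calib, inst_flux * local_calib_err)


# Made-up inputs, in the same order as the ``args`` list in the YAML above.
inst_flux, inst_flux_err = 512.0, 6.0
local_calib, local_calib_err = 0.5, 0.0078125

ap_flux = local_nanojansky(inst_flux, local_calib)
ap_flux_err = local_nanojansky_err(inst_flux, inst_flux_err, local_calib, local_calib_err)
print(ap_flux, ap_flux_err)
```

Both output columns take the same four input columns because the error depends on the flux and calibration values as well as on their uncertainties.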
.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-debug: