.. lsst-task-topic:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

########################
TransformSourceTableTask
########################


``TransformSourceTableTask`` transforms the full-width source table
(a ``source`` dataset) to a narrower Source Table (a ``sourceTable`` dataset)
as specified by the Data Products Definition Document (`DPDD <https://lse-163.lsst.io>`_).
It extracts, transforms, and renames columns per a YAML specification, by default the ``schemas/Source.yaml`` in this package.
Inputs and outputs are both per-detector.
The input is typically a wide table, and the output is a narrow table suitable for
concatenation into a per-visit table by ``ConsolidateSourceTableTask``.

It is the second of three postprocessing tasks that convert a ``src`` table to a
per-visit Source Table that conforms to the standard data model. The first is
:doc:`lsst.pipe.tasks.postprocess.WriteSourceTableTask`, and the third is :doc:`lsst.pipe.tasks.postprocess.ConsolidateSourceTableTask`.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-summary:

Processing summary
==================

``TransformSourceTableTask`` performs the following steps:

#. Read in the ``source`` dataset.

#. Generate functors (by instantiating a `lsst.pipe.tasks.functors.CompositeFunctor`)
   from the YAML specification, and apply them to the columns.

#. Store the output DataFrame in the Parquet-formatted ``sourceTable`` dataset.
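
The idea behind these steps can be pictured with a minimal sketch that does not require the LSST stack.
It uses plain Python dictionaries in place of DataFrames, and the helpers ``column``, ``scaled_flux``, and ``transform`` as well as the output column names ``sourceId`` and ``calibFlux`` are hypothetical stand-ins for illustration; they are not part of the ``lsst.pipe.tasks.functors`` API.

.. code-block:: python

    # Hypothetical stand-ins for functors: each one maps a table
    # (a dict of column name -> list of values) to a single output column.
    # These are NOT the lsst.pipe.tasks.functors classes.
    def column(name):
        """Pass an input column through unchanged."""
        return lambda table: list(table[name])

    def scaled_flux(flux_col, calib_col):
        """Multiply an instrumental flux by a per-row calibration factor."""
        return lambda table: [
            flux * calib
            for flux, calib in zip(table[flux_col], table[calib_col])
        ]

    def transform(table, funcs):
        """Apply every functor and store each result under its output name."""
        return {out_name: func(table) for out_name, func in funcs.items()}

    # A wide input table holding more columns than the output needs.
    wide = {
        "id": [1, 2, 3],
        "slot_CalibFlux_instFlux": [10.0, 20.0, 30.0],
        "base_LocalPhotoCalib": [2.0, 2.0, 0.5],
        "unused_column": [0, 0, 0],
    }

    # Output column name -> functor; this plays the role of the YAML specification.
    funcs = {
        "sourceId": column("id"),
        "calibFlux": scaled_flux("slot_CalibFlux_instFlux", "base_LocalPhotoCalib"),
    }

    narrow = transform(wide, funcs)
    print(narrow)

Only the columns named in ``funcs`` appear in ``narrow``, which mirrors how the specification controls both the content and the names of the output columns.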

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-api:

Python API summary
==================

.. lsst-task-api-summary:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler:

Butler datasets
===============

When run through the `~lsst.pipe.tasks.postprocess.TransformSourceTableTask.runQuantum` method, ``TransformSourceTableTask`` obtains datasets from the input Butler data repository and persists outputs to the output Butler data repository.
Note that the configuration of ``TransformSourceTableTask`` and its subtasks affects which datasets are persisted and what they contain.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-inputs:

Input datasets
--------------

``source``
    Full-width Parquet version of the ``src`` catalog.
    It is generated by ``WriteSourceTableTask``.

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-butler-outputs:

Output datasets
---------------

``sourceTable``
    Source Table in Parquet format (per-detector).


.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-subtasks:

Retargetable subtasks
=====================

.. lsst-task-config-subtasks:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-configs:

Configuration fields
====================

.. lsst-task-config-fields:: lsst.pipe.tasks.postprocess.TransformSourceTableTask

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-examples:

Examples
========

The following example shows how to run the task on an example HSC repository using the Python API:

.. code-block:: python

    import os
    from lsst.utils import getPackageDir
    from lsst.daf.butler import Butler
    from lsst.pipe.tasks.postprocess import TransformSourceTableTask

    # get input catalogs
    butler = Butler('/path/to/repo')
    dataId = {'visit': 30504, 'ccd': 51}
    source = butler.get('source', dataId=dataId)

    # set up the task using the obs_subaru Source.yaml specification
    config = TransformSourceTableTask.ConfigClass()
    config.functorFile = os.path.join(getPackageDir("obs_subaru"), 'policy', 'Source.yaml')
    task = TransformSourceTableTask(config=config)
    defaultFunctors = task.getFunctors()

    # run the task to get a DataFrame
    df = task.run(source, funcs=defaultFunctors, dataId=dataId)

You may also specify your own functors to apply:

.. code-block:: python

    import yaml
    from lsst.pipe.tasks.functors import CompositeFunctor

    # YAML specification mapping output column names to functors and their arguments
    functorSpec = """
    funcs:
        ApFlux:
            functor: LocalNanojansky
            args:
                - slot_CalibFlux_instFlux
                - slot_CalibFlux_instFluxErr
                - base_LocalPhotoCalib
                - base_LocalPhotoCalibErr
        ApFluxErr:
            functor: LocalNanojanskyErr
            args:
                - slot_CalibFlux_instFlux
                - slot_CalibFlux_instFluxErr
                - base_LocalPhotoCalib
                - base_LocalPhotoCalibErr
    """
    # safe_load parses the YAML into a plain dict; yaml.load requires an explicit Loader
    exampleFunctors = CompositeFunctor.from_yaml(yaml.safe_load(functorSpec))
    df = task.run(source, funcs=exampleFunctors, dataId=dataId)

.. _lsst.pipe.tasks.postprocess.TransformSourceTableTask-debug: