Inject Synthetic Sources
Injecting Synthetic Sources Into Visit-Level or Coadd-Level Dataset Types
Synthetic sources can be injected into any imaging data product output by the LSST Science Pipelines. This is useful for testing algorithmic performance on simulated data, where the truth is known, and for various subsequent quality assurance tasks.
The sections below describe how to inject synthetic sources into visit-level exposure-type or visit-type datasets (i.e., datasets with the dimension exposure or visit), or into a coadd-level coadded dataset.
Options for injection on the command line and in Python are presented.
The instructions on this page assume that, prior to injection, the user already has in place a fully qualified source injection pipeline definition YAML (see Make an Injection Pipeline) and a suitable synthetic source injection catalog describing the sources to be injected (see Generate an Injection Catalog), which has been ingested into the data butler (see Ingest an Injection Catalog).
Injection on the Command Line
Source injection on the command line is performed using the pipetask run command.
The process for injection into visit-level imaging (i.e., exposure or visit type data) or injection into coadd-level imaging (e.g., a deepCoadd) is largely the same, save for the use of a different data query and a different injection task or pipeline subset.
The following command line example injects synthetic sources into the HSC exposure 1228, detector 51, postISRCCD dataset.
For the purposes of this example, we will run the entirety of the HSC DRP RC2 step 1 subset.
This subset contains all the tasks necessary to process raw science data through to initial visit-level calibrated outputs.
The step 1 subset will have had the inject_exposure task (ExposureInjectTask) merged into it following a successful run of make_injection_pipeline.
Tip
Injection into a coadd-level data product such as a deepCoadd can be achieved by replacing step1 with step3 in the command below and modifying the -d data query.
For the injection catalog generated in these notes, this coadd-level data query would work well:
-d "instrument='HSC' AND skymap='hsc_rings_v1' AND tract=9813 AND patch=42 AND band='i'"
pipetask --long-log --log-file $LOGFILE \
run --register-dataset-types \
-b $REPO \
-i $INPUT_DATA_COLL,$INJECTION_CATALOG_COLL \
-o $OUTPUT_COLL \
-p DRP-RC2+injection.yaml#step1 \
-d "instrument='HSC' AND exposure=1228 AND detector=51"
where
$LOGFILE
The full path to a user-defined output log file.
$REPO
The path to the butler repository.
$INPUT_DATA_COLL
The name of the input data collection.
$INJECTION_CATALOG_COLL
The name of the input injection catalog collection.
$OUTPUT_COLL
The name of the injected output collection.
Caution
Standard processing should not normally have to make use of the --register-dataset-types flag.
This flag is only required to register a new output dataset type with the butler for the very first time.
If injection outputs have already been generated within your butler repository, you should omit this flag from your run command to prevent any accidental registration of unwanted dataset types.
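For scripted runs, the command above can also be assembled in Python. The sketch below is one hypothetical way to do so: build_data_query and build_pipetask_command are illustrative helpers, not part of the LSST Science Pipelines, and the repository path and collection names are placeholders. Only flags that appear in the example command are used, and the register_dataset_types argument defaults to False so that the flag is omitted unless explicitly requested.

```python
import shlex


def build_data_query(**dimensions):
    """Join dimension constraints into a data query string.

    Hypothetical helper: string values are single-quoted, other values
    (e.g. integers) are written bare.
    """
    terms = []
    for name, value in dimensions.items():
        if isinstance(value, str):
            terms.append(f"{name}='{value}'")
        else:
            terms.append(f"{name}={value}")
    return " AND ".join(terms)


def build_pipetask_command(repo, inputs, output, pipeline, query,
                           log_file, register_dataset_types=False):
    """Assemble the pipetask argument list used in the example above."""
    cmd = ["pipetask", "--long-log", "--log-file", log_file, "run"]
    if register_dataset_types:
        # Only needed the first time a new output dataset type is created.
        cmd.append("--register-dataset-types")
    cmd += ["-b", repo, "-i", ",".join(inputs), "-o", output,
            "-p", pipeline, "-d", query]
    return cmd


query = build_data_query(instrument="HSC", exposure=1228, detector=51)
command = build_pipetask_command(
    repo="/path/to/repo",  # placeholder, not a real path
    inputs=["input_data_coll", "injection_catalog_coll"],  # placeholders
    output="output_coll",  # placeholder
    pipeline="DRP-RC2+injection.yaml#step1",
    query=query,
    log_file="injection.log",  # placeholder
)
print(shlex.join(command))
```

In an environment where the Science Pipelines are set up, the resulting list could be passed to subprocess.run(command, check=True).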
Note
Alongside the stepN subsets there are corresponding injected_stepN subsets.
These run only the injection task and the tasks that follow it.
Using an injected_stepN subset can save memory and runtime if the tasks prior to injection have already been run.
Assuming processing completes successfully, the injected_postISRCCD and associated injected_postISRCCD_catalog will be written to the butler repository.
Various downstream step1 data products should also exist, including the injected_calexp dataset type (see example images below).
Standard log messages that get printed as part of a successful run may include lines similar to:
Retrieved 25 injection sources from 1 HTM trixel.
Identified 19 injection sources with centroids outside the padded image bounding box.
Catalog cleaning removed 19 of 25 sources; 6 remaining for catalog checking.
Catalog checking flagged 0 of 6 sources; 6 remaining for source generation.
Adding INJECTED and INJECTED_CORE mask planes to the exposure.
Generating 6 injection sources consisting of 1 unique type: Sersic(6).
Injected 6 of 6 potential sources. 0 sources flagged and skipped.
An example injected output produced by the pipetask run command above is shown below.
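When many injection runs are logged, it can be convenient to check the summary line automatically. The sketch below parses the final "Injected N of M potential sources" message from the example log output listed earlier. The parse_injection_summary helper is hypothetical, and the pattern assumes the log wording matches that example exactly; the wording may differ between versions of the software.

```python
import re

# Pattern for the summary line shown in the example log output.
SUMMARY_PATTERN = re.compile(
    r"Injected (\d+) of (\d+) potential sources\. "
    r"(\d+) sources flagged and skipped\."
)


def parse_injection_summary(log_text):
    """Return (injected, potential, skipped) from the last summary line.

    Returns None if no summary line is found.
    """
    matches = SUMMARY_PATTERN.findall(log_text)
    if not matches:
        return None
    injected, potential, skipped = (int(value) for value in matches[-1])
    return injected, potential, skipped


example_log = """\
Generating 6 injection sources consisting of 1 unique type: Sersic(6).
Injected 6 of 6 potential sources. 0 sources flagged and skipped.
"""
summary = parse_injection_summary(example_log)
print(summary)
```

Because the pattern is searched for anywhere in the text, it should also match lines that carry additional logger prefixes.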
Injection in Python
Source injection in Python is achieved by using the source injection task classes directly. As on the command line, the process for injection into visit-level imaging or coadd-level imaging is largely the same, save for the use of a different task class, a different data query, and use of different calibration data products (see the notes in the Python snippet below).
The following Python example injects synthetic sources into the HSC i-band tract 9813, patch 42, deepCoadd dataset.
For the purposes of this example, we will just run the source injection task alone.
from lsst.daf.butler import Butler
from lsst.source.injection import CoaddInjectConfig, CoaddInjectTask

# NOTE: For injections into other dataset types, use the following instead:
# from lsst.source.injection import ExposureInjectConfig, ExposureInjectTask
# from lsst.source.injection import VisitInjectConfig, VisitInjectTask

# Instantiate a butler.
butler = Butler(REPO)

# Load an input deepCoadd dataset.
dataId = dict(
    instrument="HSC",
    skymap="hsc_rings_v1",
    tract=9813,
    patch=42,
    band="i",
)
input_exposure = butler.get(
    "deepCoadd",
    dataId=dataId,
    collections=INPUT_DATA_COLL,
)

# NOTE: Visit-level injections also require a visit summary table.
# visit_summary = butler.get(
#     "finalVisitSummary",
#     dataId=dataId,
#     collections=INPUT_DATA_COLL,
# )

# Get calibration data products.
psf = input_exposure.getPsf()
photo_calib = input_exposure.getPhotoCalib()
wcs = input_exposure.getWcs()

# NOTE: Visit-level injections should instead use the visit summary table.
# detector_summary = visit_summary.find(dataId["detector"])
# psf = detector_summary.getPsf()
# photo_calib = detector_summary.getPhotoCalib()
# wcs = detector_summary.getWcs()

# Load input injection catalogs, here just for i-band catalogs.
injection_refs = butler.registry.queryDatasets(
    "injection_catalog",
    band="i",
    collections=INJECTION_CATALOG_COLL,
)
injection_catalogs = [
    butler.get(injection_ref) for injection_ref in injection_refs
]

# Instantiate the injection classes.
inject_config = CoaddInjectConfig()
inject_task = CoaddInjectTask(config=inject_config)

# Run the source injection task on a copy of the input exposure.
injected_output = inject_task.run(
    injection_catalogs=injection_catalogs,
    input_exposure=input_exposure.clone(),
    psf=psf,
    photo_calib=photo_calib,
    wcs=wcs,
)
injected_exposure = injected_output.output_exposure
injected_catalog = injected_output.output_catalog
where
REPO
The path to the butler repository.
INPUT_DATA_COLL
The name of the input data collection.
INJECTION_CATALOG_COLL
The name of the input injection catalog collection.
An example injected output produced by the above snippet is shown below.
Injecting Postage Stamps
The commands above have focussed on injecting synthetic parametric models produced by GalSim. It’s also possible to inject FITS postage stamps directly into the data. These may be real astronomical images, or they may be simulated images produced by other software.
By way of example, let's inject multiple copies of the 2dFGRS galaxy TGN420Z151, a \(z\sim0.17\) galaxy of brightness \(m_{i}\sim17.2\) mag located in HSC tract 9813, patch 42. First, let's construct a small postage stamp using existing HSC data products:
from lsst.daf.butler import Butler
from lsst.geom import Box2I, Extent2I, Point2I
# Instantiate a butler.
butler = Butler(REPO)
# Get the deepCoadd for HSC i-band tract 9813, patch 42.
dataId = dict(
    instrument="HSC",
    skymap="hsc_rings_v1",
    tract=9813,
    patch=42,
    band="i",
)
t9813p42i = butler.get(
    "deepCoadd",
    dataId=dataId,
    collections=INPUT_DATA_COLL,
)
# Find the x/y coordinates for the 2dFGRS TGN420Z151 galaxy.
wcs = t9813p42i.wcs
x0, y0 = wcs.skyToPixelArray(149.8599524, 2.1487149, degrees=True)
# Create a 181x181 pixel postage stamp centered on the galaxy.
bbox = Box2I(Point2I(x0, y0), Extent2I(1,1))
bbox.grow(90)
tgn420z151 = t9813p42i[bbox]
# Save the postage stamp image to a FITS file.
tgn420z151.image.writeFits(POSTAGE_STAMP_FILE)
where
REPO
The path to the butler repository.
INPUT_DATA_COLL
The name of the input data collection.
POSTAGE_STAMP_FILE
The file name for the postage stamp FITS file.
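The 181x181 pixel size quoted in the snippet follows from starting with a 1x1 pixel box and growing it by 90 pixels on every side: 1 + 2 × 90 = 181. The plain-Python sketch below reproduces that arithmetic along one axis, assuming integer box bounds that are inclusive at both ends. The grown_box_bounds helper and the pixel coordinate used are purely illustrative and do not correspond to the galaxy's actual position.

```python
def grown_box_bounds(center, buffer):
    """Return (minimum, maximum, size) along one axis.

    Models a 1-pixel-wide box at ``center`` grown by ``buffer`` pixels
    on each side, with both bounds inclusive.
    """
    minimum = center - buffer
    maximum = center + buffer  # inclusive upper bound
    size = maximum - minimum + 1
    return minimum, maximum, size


# Illustrative pixel coordinate only; not the galaxy's actual position.
x_min, x_max, width = grown_box_bounds(center=12000, buffer=90)
print(x_min, x_max, width)
```

The same reasoning gives the buffer needed for any odd stamp size: a stamp of side 2n + 1 pixels requires growing the box by n.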
This postage stamp looks like this:
Next, let's construct a simple injection catalog and ingest it into the butler.
Injection of FITS-file postage stamps only requires the ra, dec, source_type, mag and stamp columns to be specified in the injection catalog.
Note that below we switch from Python to the command line interface:
generate_injection_catalog \
-a 149.7 150.1 \
-d 2.0 2.4 \
-n 50 \
-p source_type Stamp \
-p mag 17.2 \
-p stamp $POSTAGE_STAMP_FILE \
-b $REPO \
-w deepCoadd_calexp \
-c $INPUT_DATA_COLL \
--where "instrument='HSC' AND skymap='hsc_rings_v1' AND tract=9813 AND patch=42 AND band='i'" \
-i i \
-o $INJECTION_CATALOG_COLL
where
$POSTAGE_STAMP_FILE
The file name for the postage stamp FITS file.
$REPO
The path to the butler repository.
$INPUT_DATA_COLL
The name of the input data collection.
$INJECTION_CATALOG_COLL
The name of the input injection catalog collection.
The first several rows from the injection catalog produced by the above snippet look like this:
injection_id ra dec source_type mag stamp
------------ ------------------ ------------------ ----------- ---- ---------------
0 150.0403162981621 2.076877152109224 Stamp 17.2 tgn420z151.fits
1 149.94655709194345 2.0422859082646854 Stamp 17.2 tgn420z151.fits
2 150.02155685175438 2.116390565528664 Stamp 17.2 tgn420z151.fits
3 149.92773562242124 2.358408570029682 Stamp 17.2 tgn420z151.fits
4 149.82770694427973 2.338624350977013 Stamp 17.2 tgn420z151.fits
...
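Before ingesting a hand-built catalog, it may be worth confirming that the columns required for postage stamp injection are all present. The sketch below is a minimal check based on the column list given above; missing_stamp_columns is a hypothetical helper, not part of the source injection package.

```python
# Columns required for FITS-file postage stamp injection, as listed above.
REQUIRED_STAMP_COLUMNS = ("ra", "dec", "source_type", "mag", "stamp")


def missing_stamp_columns(column_names):
    """Return the required stamp-injection columns absent from a catalog."""
    present = set(column_names)
    return [name for name in REQUIRED_STAMP_COLUMNS if name not in present]


# Column names from the example catalog shown above.
example_columns = ["injection_id", "ra", "dec", "source_type", "mag", "stamp"]
print(missing_stamp_columns(example_columns))

# A catalog lacking the stamp-specific columns.
print(missing_stamp_columns(["ra", "dec", "mag"]))
```

If the catalog is held in an astropy Table, the column names could be supplied via its colnames attribute.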
Finally, let's inject our postage stamp multiple times into the HSC i-band tract 9813, patch 42 image:
pipetask --long-log --log-file $LOGFILE \
run --register-dataset-types \
-b $REPO \
-i $INPUT_DATA_COLL,$INJECTION_CATALOG_COLL \
-o $OUTPUT_COLL \
-p $SOURCE_INJECTION_DIR/pipelines/inject_coadd.yaml \
-d "instrument='HSC' AND skymap='hsc_rings_v1' AND tract=9813 AND patch=42 AND band='i'"
where
$LOGFILE
The full path to a user-defined output log file.
$REPO
The path to the butler repository.
$INPUT_DATA_COLL
The name of the input data collection.
$INJECTION_CATALOG_COLL
The name of the input injection catalog collection.
$OUTPUT_COLL
The name of the injected output collection.
$SOURCE_INJECTION_DIR
The path to the source injection package directory.
Tip
If the injection FITS file is not in the working directory where the pipetask run command is executed, the stamp_prefix configuration option can be used.
This prepends a string to the FITS file name taken from the catalog, allowing your FITS files to be stored in a directory other than the current working directory.
Running the above snippet produces the following:
Coadd-level (deepCoadd and injected_deepCoadd) data for HSC tract 9813, patch 42 in the i-band, showcasing the injection of multiple copies of 2dFGRS galaxy TGN420Z151.
Images are log scaled across the central 99% flux range and smoothed with a Gaussian kernel of FWHM 5 pixels.
See also
For a “Rubin themed” example postage stamp injection, see the top of the FAQs page.
Wrap Up
This page has described how to inject synthetic sources into a visit-level exposure-type or visit-type dataset, or into a coadd-level coadded dataset. Options for injection on the command line and in Python have been presented. The special case of injecting FITS-file postage stamp images has also been covered.
Move on to another quick reference guide, consult the FAQs, or head back to the main page.