MemoryMetricTask¶

MemoryMetricTask creates a resident set size Measurement based on data collected by @timeMethod. It reads the raw timing data from the top-level CmdLineTask’s metadata, which is identified by the task configuration.

In general, it’s only useful to measure this metric for the top-level task being run. @timeMethod measures the peak memory usage from process start, so the results for any subtask will be contaminated by previous subtasks run on the same data ID.
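The contamination effect can be illustrated with a minimal sketch. It assumes only what the paragraph above states, that the recorded value is a peak since process start; the function name and the numbers are hypothetical and are not part of `lsst.verify`.

```python
def peak_after_each(subtask_peaks):
    """Return the process-wide peak visible at the end of each subtask.

    ``subtask_peaks`` lists the true peak of each subtask in execution
    order (hypothetical values, any consistent unit).
    """
    observed = []
    running_peak = 0
    for peak in subtask_peaks:
        # A peak-since-start counter can only grow, never shrink.
        running_peak = max(running_peak, peak)
        observed.append(running_peak)
    return observed


# The third subtask truly peaks at 300, but the value recorded for it
# is inherited from the second subtask.
print(peak_after_each([200, 800, 300]))
```

Only the last value in the sequence, which corresponds to the top-level task, is unambiguous.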

Because @timeMethod gives platform-dependent results, this task may give incorrect results (e.g., wrong units) when run on a distributed system with heterogeneous nodes.
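As an illustration of the platform dependence, the sketch below assumes the underlying measurement is the `ru_maxrss` field of `getrusage`, which is reported in kilobytes on Linux and in bytes on macOS. The helper `max_rss_bytes` is hypothetical, and treating every non-macOS platform as kilobytes is a simplifying assumption.

```python
import resource
import sys


def max_rss_bytes(raw_value, platform=sys.platform):
    """Convert a raw ru_maxrss value to bytes for the given platform."""
    if platform == "darwin":
        # macOS already reports bytes.
        return raw_value
    # Linux (and, by assumption here, other platforms) report kilobytes.
    return raw_value * 1024


raw = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(max_rss_bytes(raw))
```

A value recorded on one kind of node and interpreted with another node's convention would be wrong by a factor of 1024, which is the failure mode described above.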

Processing summary¶

MemoryMetricTask searches the metadata for @timeMethod-generated keys corresponding to the method of interest. If it finds matching keys, it stores the maximum memory usage as a Measurement.
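The search can be sketched as follows. This is not the task's actual implementation: it models the metadata as a flat `dict`, and the key suffixes `StartMaxResidentSetSize` and `EndMaxResidentSetSize` are assumptions about what @timeMethod writes.

```python
# Assumed suffixes appended to the method name by @timeMethod.
SUFFIXES = ("StartMaxResidentSetSize", "EndMaxResidentSetSize")


def max_resident_set_size(metadata, target):
    """Return the largest memory value recorded for ``target``, or None.

    ``metadata`` is a flat mapping of key to value (a stand-in for real
    task metadata); ``target`` is the method name, optionally prefixed
    by task names.
    """
    values = [
        value
        for key, value in metadata.items()
        # Suffix matching lets an unprefixed target match prefixed keys.
        if any(key.endswith(target + suffix) for suffix in SUFFIXES)
    ]
    return max(values) if values else None


# Hypothetical metadata contents.
metadata = {
    "apPipe:ccdProcessor.runDataRefStartMaxResidentSetSize": 1024,
    "apPipe:ccdProcessor.runDataRefEndMaxResidentSetSize": 2048,
    "apPipe:ccdProcessor.runDataRefEndCpuTime": 3.5,
}
print(max_resident_set_size(metadata, "apPipe:ccdProcessor.runDataRef"))
```

Returning `None` when no key matches mirrors the "if it finds matching keys" condition: without matching keys there is nothing to measure.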

Python API summary¶

from lsst.verify.tasks.commonMetrics import MemoryMetricTask

class MemoryMetricTask(**kwargs): A Task that computes the maximum resident set size using metadata produced by the `lsst.utils.timer.timeMethod` decorator...

attribute config: Access configuration fields and retargetable subtasks.

method run(metadata): Compute a measurement from science task metadata...

Butler datasets¶

Input datasets¶

metadata: The metadata of the top-level command-line task (e.g., ProcessCcdTask, ApPipeTask) being instrumented. Because the metadata produced by each top-level task is a different Butler dataset type, this dataset must be explicitly configured when running MemoryMetricTask or a MetricsControllerTask that contains it.

Output datasets¶

measurement: The value of the metric. The dataset type should not be configured directly, but should be set by changing the package and metric template variables to the metric’s namespace (package, by convention) and in-package name, respectively. Subclasses that only support one metric should set these variables automatically.
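The role of the two template variables can be sketched with plain string formatting. The pattern `metricvalue_{package}_{metric}` is an assumption about the naming convention, not something stated on this page, and the helper function is hypothetical.

```python
# Assumed naming pattern for the output dataset type.
MEASUREMENT_TEMPLATE = "metricvalue_{package}_{metric}"


def measurement_dataset_type(package, metric):
    """Fill the template with a metric's namespace and in-package name."""
    return MEASUREMENT_TEMPLATE.format(package=package, metric=metric)


print(measurement_dataset_type("pipe_tasks", "ProcessCcdMemory"))
```

Setting the variables rather than the full name keeps the dataset type consistent with the metric it stores.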

Retargetable subtasks¶

No subtasks.

Configuration fields¶

connections¶

Data type: lsst.pipe.base.config.Connections
Field type: ConfigField

Configurations describing the connections of the PipelineTask to datatypes

metadataDimensions¶

Default: ['detector', 'instrument', 'visit']
Field type: str ListField

Override for the dimensions of the ‘metadata’ input, when instrumenting Tasks that don’t produce one metadata object per visit.

metric¶

Default: None
Field type: str Field (optional)

The fully qualified name of the metric to store the profiling information. Deprecated: This field has been replaced by connections.package and connections.metric. It will be removed along with daf_persistence.

saveLogOutput¶

Default: True
Field type: bool Field

Flag to enable/disable saving of log output for a task, enabled by default.

saveMetadata¶

Default: True
Field type: bool Field

Flag to enable/disable metadata saving for a task, enabled by default.

target¶

Default: None
Field type: str Field

The method to profile, optionally prefixed by one or more tasks in the format of lsst.pipe.base.Task.getFullMetadata().

Examples¶

from lsst.verify.tasks import MemoryMetricTask

config = MemoryMetricTask.ConfigClass()
config.connections.metadata = "apPipe_metadata"
config.connections.package = "pipe_tasks"
config.connections.metric = "ProcessCcdMemory"
config.target = "apPipe:ccdProcessor.runDataRef"
task = MemoryMetricTask(config=config)

# config.connections is provided for the benefit of MetricsControllerTask/Pipeline,
# but since we've defined it we might as well use it.
# `butler` is assumed to be an existing Butler for the repository being measured.
metadata = butler.get(config.connections.metadata)
processCcdMemory = task.run(metadata).measurement
