roastcoffea.aggregation.fine_metrics

Parse Dask Spans fine-grained performance metrics.

Dask Spans provide a detailed breakdown of task activity via cumulative_worker_metrics. This module parses those metrics into a standardized format.

Functions

parse_fine_metrics(cumulative_worker_metrics, processor_name=None)

Parse Dask Spans cumulative_worker_metrics into fine metrics.

roastcoffea.aggregation.fine_metrics.parse_fine_metrics(cumulative_worker_metrics, processor_name=None)

Parse Dask Spans cumulative_worker_metrics into fine metrics.

Parameters:
  • cumulative_worker_metrics (dict) – Raw metrics from span.cumulative_worker_metrics, with tuple keys like ('execute', task_prefix, activity, unit) -> value. Activities include: thread-cpu, thread-noncpu, disk-read, disk-write, compress, decompress, serialize, deserialize.

  • processor_name (str, optional) – Name of processor class to filter metrics for. If provided, only metrics from this processor are included in processor_* fields, and other metrics go into overhead_* fields.
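As a rough illustration of the input shape and of the processor/overhead split described above, the following sketch builds a hand-made dictionary with the documented tuple keys and sums thread-cpu seconds into two buckets. The sample values, the helper name split_cpu_seconds, and in particular the rule that a task belongs to the processor when processor_name occurs in task_prefix are assumptions made for this example; the library's actual matching logic may differ.

```python
# Hypothetical sample shaped like span.cumulative_worker_metrics.
# All prefixes and values below are made up for illustration.
sample_metrics = {
    ("execute", "MyProcessor-process", "thread-cpu", "seconds"): 12.0,
    ("execute", "MyProcessor-process", "thread-noncpu", "seconds"): 4.0,
    ("execute", "MyProcessor-process", "disk-read", "bytes"): 2048,
    ("execute", "finalize", "thread-cpu", "seconds"): 1.0,
}


def split_cpu_seconds(metrics, processor_name=None):
    """Return (processor_cpu, overhead_cpu) in seconds.

    Illustrative helper, not part of roastcoffea.
    """
    processor_cpu = 0.0
    overhead_cpu = 0.0
    for key, value in metrics.items():
        # Skip anything that is not a 4-tuple of the documented form.
        if not (isinstance(key, tuple) and len(key) == 4):
            continue
        context, task_prefix, activity, unit = key
        if context != "execute" or activity != "thread-cpu" or unit != "seconds":
            continue
        # ASSUMPTION: a task counts as "processor" work when the processor
        # name appears in its task prefix. With no name, nothing is overhead.
        if processor_name is None or processor_name in str(task_prefix):
            processor_cpu += value
        else:
            overhead_cpu += value
    return processor_cpu, overhead_cpu


with_filter = split_cpu_seconds(sample_metrics, "MyProcessor")
without_filter = split_cpu_seconds(sample_metrics)
```

With the sample above, the filtered call attributes the "finalize" task to overhead, while the unfiltered call counts all thread-cpu time as processor time.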

Returns:

Parsed fine metrics with keys:

  • processor_cpu_time_seconds: CPU time in processor
  • processor_io_wait_time_seconds: I/O and waiting time in processor (I/O, GIL, blocking)
  • processor_cpu_percent: CPU / (CPU + I/O wait) × 100 for processor
  • processor_io_wait_percent: I/O wait / (CPU + I/O wait) × 100 for processor
  • overhead_cpu_time_seconds: CPU time in Dask overhead (if processor_name given)
  • overhead_io_wait_time_seconds: I/O and waiting time in Dask overhead
  • disk_read_bytes: Bytes read from disk
  • disk_write_bytes: Bytes written to disk
  • decompression_time_seconds: Time spent decompressing
  • compression_time_seconds: Time spent compressing
  • deserialization_time_seconds: Time spent deserializing
  • serialization_time_seconds: Time spent serializing
  • total_serialization_overhead_seconds: Sum of serialize + deserialize
  • total_compression_overhead_seconds: Sum of compress + decompress

Return type:

dict
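The derived fields in the returned dict follow simple arithmetic. The sketch below reproduces the percentage formulas and the two overhead sums as documented above. The helper names and the choice to return 0.0 for both percentages when no time was recorded are assumptions for this example, not confirmed behavior of parse_fine_metrics.

```python
def derive_percentages(cpu_seconds, io_wait_seconds):
    """Return (cpu_percent, io_wait_percent) per the documented formulas."""
    total = cpu_seconds + io_wait_seconds
    if total == 0:
        # ASSUMPTION: report 0.0 for both when nothing was measured.
        return 0.0, 0.0
    return 100.0 * cpu_seconds / total, 100.0 * io_wait_seconds / total


def sum_overheads(fine):
    """Compute the two total_* fields from their documented components."""
    return {
        "total_serialization_overhead_seconds": (
            fine["serialization_time_seconds"]
            + fine["deserialization_time_seconds"]
        ),
        "total_compression_overhead_seconds": (
            fine["compression_time_seconds"]
            + fine["decompression_time_seconds"]
        ),
    }


# Made-up inputs: 12 s of CPU and 4 s of I/O wait give a 75/25 split.
cpu_pct, io_pct = derive_percentages(12.0, 4.0)

totals = sum_overheads(
    {
        "serialization_time_seconds": 0.5,
        "deserialization_time_seconds": 1.5,
        "compression_time_seconds": 0.25,
        "decompression_time_seconds": 0.75,
    }
)
```

Because both percentages share the same denominator, they sum to 100 whenever any processor time was recorded.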