Introduction¶
roastcoffea is a comprehensive performance monitoring package for Coffea-based High Energy Physics (HEP) analysis workflows. It provides detailed metrics collection, analysis, and visualization for distributed data processing on Dask clusters.
Why roastcoffea?¶
When running large-scale HEP analyses on distributed systems, understanding performance bottlenecks is critical. roastcoffea helps answer questions like:
How fast is my analysis? Track throughput (Gbps, kHz) and processing rates
Where is time spent? Separate CPU time from I/O wait, identify bottlenecks
Is my cluster efficient? Monitor worker utilization, memory usage, task distribution
How do chunks perform? Track per-chunk timing, memory, and metadata
What’s the overhead? Distinguish processor work from Dask scheduling overhead
Key Features¶
Three Levels of Collection¶
roastcoffea provides progressively detailed metrics:
Workflow-level - Overall throughput, timing, resource usage (no code changes needed)
Chunk-level - Per-chunk performance data (add
@track_metricsdecorator)Fine-grained - Section-by-section profiling (use
track_time()andtrack_memory())
Automatic Integration¶
Works seamlessly with existing Coffea workflows
Minimal code changes required
Context manager API for clean setup/teardown
Automatic detection of Dask Spans for detailed metrics
Comprehensive Metrics¶
Throughput: Data rates (Gbps, MB/s), event rates (kHz)
Resources: Worker counts, memory usage, CPU utilization
Timing: Wall time, CPU time, I/O time breakdown
Efficiency: Core utilization, speedup factors, parallelism metrics
Activity: CPU vs non-CPU time, disk I/O, compression overhead
I/O Analysis: File compression ratios, branch access patterns, data read percentages
Chunk Performance: Runtime distributions, event processing correlations
Visualization & Export¶
Rich terminal tables with formatted output
17 built-in Matplotlib plots including:
Resource timelines (workers, memory, CPU utilization, data rates)
Efficiency metrics (CPU efficiency, task distribution)
I/O analysis (compression ratios, data access patterns)
Chunk-level performance (runtime distributions, event correlations)
Save/load measurements for comparison
JSON export for custom analysis
How It Works¶
roastcoffea collects metrics from multiple sources:
Coffea Report - Built-in metrics from
coffea.processor.Runner(throughput, events, columns)Wall Clock Timing - Elapsed time measurement
Worker Tracking - Periodic scheduler sampling for resource data (CPU, memory, worker counts)
Chunk Decorator - Per-chunk timing, memory, and file-level metadata extraction
Dask Spans - Fine-grained activity breakdown (CPU, I/O, disk, memory)
Instrumentation - User-defined section tracking
These are aggregated into a unified metrics dictionary, providing both high-level summaries and detailed breakdowns.
Who Should Use This?¶
roastcoffea is designed for:
Analysts optimizing their Coffea workflows
Computing teams monitoring cluster performance
Developers profiling distributed HEP applications
Researchers studying performance characteristics of data processing
Next Steps¶
Jump to Quickstart to install and run your first collection.
Read Core Concepts to understand how metrics are collected and what they mean.
Step through Tutorial for detailed examples at each collection level.
Explore Architecture to learn about backends, aggregators, and exporters.