A Robust Snakemake Pipeline for Reducing Complex CII Datasets: The M51 Case Study

PO
Not scheduled
15m
Wichernhaus

Board: A215
Type: poster presentation
Track: Automation of data pipeline and workflows

Speaker

Juan Luis Verbena (Universität zu Köln)

Description

In this poster, we present a modular and scalable data reduction pipeline designed to process the challenging M51 [C II] dataset, observed with GREAT on board SOFIA. The raw data comprise 125 GB and over one million spectra, collected across multiple flights and observing cycles (2016–2018). A key challenge is contamination by a telluric ozone line whose frequency shifts over the course of the campaign, significantly complicating baseline characterization.

Our solution integrates CLASS, Python, and Snakemake to deliver a reproducible and portable reduction workflow. Principal component analysis (PCA) forms the backbone of our baseline correction strategy, enabling robust separation of instrumental and atmospheric features from the astronomical signal. The workflow is orchestrated by Snakemake and automated through Python wrappers, which generate tailored CLASS scripts for each flight, backend, and scan—ensuring optimal characterization of baseline components under varying conditions.
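To illustrate the orchestration pattern described above, the following Snakefile sketch shows how per-flight and per-backend CLASS scripts might be generated and executed through Python wrappers, with a PCA baseline-correction step chained after calibration. All file layouts, wildcard names, and wrapper scripts (make_class_script.py, run_class_scan.py, pca_baseline.py) are hypothetical placeholders, not the authors' actual implementation.

# Snakefile -- minimal sketch of the orchestration pattern (hypothetical
# paths, wildcards, and wrapper scripts; not the pipeline's actual layout).
FLIGHTS  = ["2016-05-18", "2017-02-09"]   # hypothetical flight IDs
BACKENDS = ["LFA_H", "LFA_V"]             # hypothetical GREAT backend names

rule all:
    input:
        expand("reduced/{flight}_{backend}.fits",
               flight=FLIGHTS, backend=BACKENDS)

# Generate a CLASS script tailored to one flight/backend combination.
rule make_class_script:
    input:
        raw="raw/{flight}_{backend}.apex"        # hypothetical raw data file
    output:
        script="class_scripts/{flight}_{backend}.class"
    shell:
        "python make_class_script.py --raw {input.raw} --out {output.script}"

# Run the generated CLASS script via a Python wrapper that drives CLASS
# and writes calibrated spectra (the wrapper hides the CLASS invocation).
rule run_class:
    input:
        script="class_scripts/{flight}_{backend}.class",
        raw="raw/{flight}_{backend}.apex"
    output:
        "calibrated/{flight}_{backend}.fits"
    shell:
        "python run_class_scan.py --script {input.script} --out {output}"

# Apply PCA-based baseline correction to the calibrated spectra.
rule pca_baseline:
    input:
        "calibrated/{flight}_{backend}.fits"
    output:
        "reduced/{flight}_{backend}.fits"
    shell:
        "python pca_baseline.py --in {input} --out {output}"

A minimal numpy sketch of the PCA baseline-removal idea follows; it is an assumption about how such a step could look, not the authors' code. The leading principal components, which capture correlated instrumental and atmospheric baseline structure, are fitted to each spectrum and the reconstruction is subtracted.

# pca_baseline.py -- minimal numpy sketch of PCA baseline removal (assumes the
# astronomical line channels are masked when the components are derived).
import numpy as np

def pca_baseline_correct(spectra: np.ndarray, n_components: int = 5) -> np.ndarray:
    """Subtract the leading principal components from each spectrum.

    spectra : array of shape (n_spectra, n_channels)
    """
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # SVD of the spectrum matrix: the leading right-singular vectors capture
    # correlated baseline structure (standing waves, telluric residuals).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]                 # (n_components, n_channels)
    coeffs = centered @ components.T               # projection of each spectrum
    baseline_model = coeffs @ components + mean    # per-spectrum baseline model
    return spectra - baseline_model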

This pipeline produces consistent, reliable data products while maximizing transparency and reproducibility, offering a powerful template for the reduction of similarly complex spectroscopic datasets.

Affiliation of the submitter: Universität zu Köln
Attendance: in-person

Primary author

Juan Luis Verbena (Universität zu Köln)

Co-author

Christof Buchbender (I. Physikalisches Institut, Universität zu Köln)
