Sustainable data life cycle management for data intensive instruments

P7
11 Nov 2025, 14:45
15m
Synagoge

Görlitz
oral presentation
Technical and social aspects of data lifecycle management
Plenary Session 7

Speaker

Hanno Holties (NWO-I ASTRON)

Description

LOFAR is a high-throughput data facility facing several non-trivial technical challenges in data processing and storage. Since the start of science operations, LOFAR has accumulated over 50 petabytes of data in its science data archive. Following a major upgrade of the instrument, the archive is expected to grow to well over 100 petabytes of science data over the next five years of operations. Other astronomical instruments that will start and ramp up over the coming years will produce similar (Rubin) or significantly larger (SKA) data volumes. There are viable technical solutions to scale storage to the required capacity, but sustainability considerations (storage cost, network and processing capacity, data access and re-use by a wide community) are increasingly constraining acceptable volumes of data delivery and long-term storage. In this contribution we will present a spectrum of measures that are applied to data systems and operations to address the sustainability challenges for LOFAR. The measures include the application of (lossy and lossless) data compression, retirement of archived data without legacy value, the application of data retirement policies for raw and intermediate-level data, the adoption of full-lifecycle data management plans for science projects, engaging the community to realize more efficient data processing and data storage, and scaling out to newly available infrastructure. We will discuss the relation to FAIR practices, the impact on scientific legacy value, and concerns in, and needs from, the science community.

Affiliation of the submitter NWO-I ASTRON
Attendance in-person

Primary authors

Hanno Holties (NWO-I ASTRON)
Roberto Pizzo (ASTRON)
