Addressing Big Data Challenges for ESA Space Science Missions

P9
12 Nov 2025, 09:00
30m
Synagoge

Görlitz
Invited talk
Science platforms in the big data era
Plenary Session 9

Speaker

Sandor Kruk (European Space Agency)

Description

The exponential growth in size and complexity of astronomical datasets from space missions presents significant computational and infrastructural challenges. ESA’s Euclid mission has already produced petabytes (PB) of processed data and is projected to produce 30 PB over its operational lifetime. Analysing and processing data on this scale requires specialised infrastructure and toolchains.

ESA has developed a science platform, ESA Datalabs, which provides essential infrastructure to access and analyse data from missions such as the Hubble Space Telescope, James Webb Space Telescope, Gaia, and Euclid. Leveraging software like JupyterLab, users can interact with mission data without downloading it. The platform fosters collaborative science by enabling direct connection to ESA archives and shared computational workspaces, facilitating creation and deployment of user-built applications and analysis pipelines, and ensuring accessibility to a broad research community.

In this presentation, we outline the need for science platforms in the Big Data era, the motivation behind ESA Datalabs, its key functionalities, and its role in addressing challenges such as scalable data processing, infrastructure development, and reproducible research. We demonstrate how integrating archives, visualisation tools and the science platform into a unified portal for Euclid, the Euclid Data Space, creates a single point of entry for the mission’s scientific community.

We showcase recent use cases, including ESA Datalabs’ role in the first public Euclid quick data release and the first large internal data release. We also highlight how the platform supports data mining and machine learning on data stored in ESA science archives, for applications such as large-scale classification of galaxies and identification of anomalies.

Our discussion highlights how science platforms can maximise the scientific potential of current and future space missions and shape the future of data-intensive space science.

Affiliation of the submitter: European Space Agency
Attendance: in-person

Primary author

Sandor Kruk (European Space Agency)

Co-authors

Jan Reerink (European Space Agency)
Pablo Gómez (European Space Agency)
Sebastian Maksym Vicente Navarro (ESA)
