XMMGPT: Integrating Agentic RAG and Autonomous Agents for Astronomical Data Processing

P14
13 Nov 2025, 11:15
15m
Synagoge

Görlitz
oral presentation
Collaborating with other software ecosystems and disciplines
Plenary Session 14

Speaker

Lorenz Ehrlich (Telespazio)

Description

XMMGPT is a dual-purpose project: it aims to give astronomers a single access point for their research with XMM-Newton data, and it explores the use of language models and AI systems within European Space Agency (ESA) workflows.

The system comprises four main parts: a heavily customized Agentic Retrieval-Augmented Generation (RAG) pipeline, a visibility checker tool, a long-term light curve generator, and an autonomous agent that executes data processing tasks from the Science Analysis System (SAS).

The RAG system is built from SAS technical documentation and combines contextual embeddings, naive knowledge graphs, a fine-tuned hybrid text and vector similarity search, and an agent that routes user queries to the appropriate documentation types.
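As an illustration of the hybrid search idea only, the following minimal sketch blends a keyword-overlap score with cosine similarity between embedding vectors. Everything in it is an assumption made for the example: the function names, the `alpha` weighting, the toy two-dimensional vectors, and the toy document texts are hypothetical and do not describe the actual XMMGPT implementation.

```python
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately naive stand-in)."""
    return text.lower().split()


def keyword_score(query, doc):
    """Fraction of query tokens that also occur in the document."""
    query_tokens = tokenize(query)
    doc_tokens = set(tokenize(doc))
    if not query_tokens:
        return 0.0
    return sum(1 for t in query_tokens if t in doc_tokens) / len(query_tokens)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs by a weighted sum of both scores.

    alpha is a hypothetical tuning knob: 1.0 means keyword only,
    0.0 means vector similarity only.
    """
    scored = []
    for text, vec in docs:
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy corpus with made-up 2-D "embeddings" (real embeddings come from a model).
docs = [
    ("epproc reduces EPIC pn data", [1.0, 0.0]),
    ("rgsproc reduces RGS data", [0.0, 1.0]),
]
ranked = hybrid_rank("epic pn reduction", [0.9, 0.1], docs)
print(ranked[0][1])
```

In this toy case the first document wins on both the keyword and the vector component, so it is ranked first regardless of `alpha`.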

The visibility checker tool transforms natural language queries into API calls to a visibility server that implements the IVOA ObjVisSAP protocol. The long-term light curve generator likewise transforms natural language queries into API calls that gather flux data, and then generates a representative plot.
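The last step of such a translation, turning already-extracted values into a request URL, might look like the sketch below. It assumes that a language model has already reduced the user's question to a sky position and a time range. The base URL is a placeholder, the function name is hypothetical, and the query parameter names (`s_ra`, `s_dec`, `t_min`, `t_max`) are an assumption based on the IVOA Object Visibility Simple Access Protocol and should be checked against the specification and the actual server.

```python
from urllib.parse import urlencode


def build_visibility_url(base_url, ra_deg, dec_deg, t_min_mjd, t_max_mjd):
    """Build a visibility query URL from validated inputs.

    Parameter names are assumed, not taken from the XMMGPT code base.
    """
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if t_min_mjd > t_max_mjd:
        raise ValueError("t_min must not be later than t_max")
    params = {
        "s_ra": ra_deg,      # assumed name: right ascension in degrees
        "s_dec": dec_deg,    # assumed name: declination in degrees
        "t_min": t_min_mjd,  # assumed name: start of interval (MJD)
        "t_max": t_max_mjd,  # assumed name: end of interval (MJD)
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; not a real visibility server.
url = build_visibility_url("https://example.org/visibility", 83.633, 22.014, 60000.0, 60030.0)
print(url)
```

Validating the extracted values before building the request keeps malformed model output from reaching the server.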

The autonomous SAS Agent takes natural language queries and works through a sequence of steps to send API calls to RISA (Remote Interface to SAS Analysis) or ULS (Upper Limit Server) and retrieve valid scientific products.
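The abstract does not list the individual steps, so the following is only a generic sketch of a sequential agent pipeline. The step names, the `AgentState` fields, and the keyword rule that picks between RISA and ULS are all hypothetical stand-ins for decisions a language model would make, and no real request is sent.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """State passed from step to step (hypothetical structure)."""
    query: str
    target: str = ""                      # "RISA" or "ULS"
    params: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def choose_service(state):
    # Toy keyword rule standing in for a language-model decision.
    state.target = "ULS" if "upper limit" in state.query.lower() else "RISA"
    state.log.append("choose_service")
    return state


def extract_params(state):
    # Placeholder: a real agent would extract coordinates, observation IDs, etc.
    state.params = {"raw_query": state.query}
    state.log.append("extract_params")
    return state


def validate(state):
    # Stop the pipeline early rather than send an incomplete request.
    if not state.params:
        raise ValueError("no parameters extracted")
    state.log.append("validate")
    return state


STEPS = [choose_service, extract_params, validate]


def run_pipeline(query):
    """Run every step in order; each step receives the previous step's state."""
    state = AgentState(query=query)
    for step in STEPS:
        state = step(state)
    return state


result = run_pipeline("What is the upper limit flux at this position?")
print(result.target, result.log)
```

Keeping each step as a separate function makes it possible to check intermediate results before anything is submitted to a remote service.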

Data privacy and cost concerns have led to the project being developed on relatively small local hardware, and a constant challenge has been finding state-of-the-art language models with a smaller footprint that still perform well.

Approaches to autonomous SAS code creation and execution are currently being explored, and the project is steadily being updated to take advantage of evolving industry standards.

Affiliation of the submitter: Telespazio
Attendance: in-person

Primary author

Lorenz Ehrlich (Telespazio)

Co-authors

Aitor Ibarra (Telespazio)
Richard Saxton (Telespazio)
