Speaker
Description
XMMGPT is a dual-purpose project: it aims to give astronomers a single access point for research with XMM-Newton data, and it explores the use of language models and AI systems within European Space Agency (ESA) workflows.
The system comprises four main parts: a heavily customized agentic Retrieval-Augmented Generation (RAG) pipeline, a visibility checker tool, a long-term light curve generator, and an autonomous agent that executes data processing tasks from the Science Analysis System (SAS).
The RAG system is built from SAS technical documentation using contextual embeddings, naive knowledge graphs, a fine-tuned hybrid text and vector similarity search, and an agent that routes user queries to the appropriate documentation types.
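As an illustration of what a hybrid text and vector similarity search can look like, the following minimal Python sketch ranks documents by a weighted sum of a keyword-overlap score and a cosine similarity. It is not the XMMGPT implementation: the function names, the `alpha` weight, the term-overlap scoring, and the toy two-dimensional vectors are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Fraction of distinct query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Return document ids, best first.

    alpha (hypothetical tuning knob) weights the text score;
    (1 - alpha) weights the vector score.
    """
    scored = []
    for doc in docs:
        score = (alpha * keyword_score(query, doc["text"])
                 + (1 - alpha) * cosine(query_vec, doc["vec"]))
        scored.append((score, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy corpus: the texts and embeddings are invented for illustration.
docs = [
    {"id": "evselect", "text": "evselect filters event lists", "vec": [1.0, 0.0]},
    {"id": "epproc", "text": "epproc reprocesses EPIC pn data", "vec": [0.0, 1.0]},
]
ranking = hybrid_rank("filter event lists", [0.9, 0.1], docs)
```

In a sketch like this, `alpha` is the natural parameter to tune against a set of known query and document pairs.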
The visibility checker tool transforms natural language queries into API calls to a visibility server that implements the IVOA ObjVisSAP protocol. The long-term light curve generator likewise transforms natural language queries into API calls that gather flux data and generate a representative plot.
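The last step of such a tool, turning parameters already extracted from a natural language query into an API call, might be sketched as follows. Everything specific here is an assumption: the base URL is a placeholder, and the parameter names `s_ra`, `s_dec`, `t_min`, and `t_max` are chosen to resemble IVOA simple-access conventions rather than taken from the actual server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real visibility server address is not given here.
BASE_URL = "https://example.invalid/visibility"


def build_visibility_url(ra_deg, dec_deg, t_min_mjd, t_max_mjd, base_url=BASE_URL):
    """Validate extracted parameters and build a visibility query URL.

    Coordinates are in degrees, times in Modified Julian Date (assumed units).
    """
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if t_min_mjd >= t_max_mjd:
        raise ValueError("t_min must be earlier than t_max")
    params = {
        "s_ra": ra_deg,      # assumed parameter name
        "s_dec": dec_deg,    # assumed parameter name
        "t_min": t_min_mjd,  # assumed parameter name
        "t_max": t_max_mjd,  # assumed parameter name
    }
    return f"{base_url}?{urlencode(params)}"


# Example: a 30-day window for a target near the Crab Nebula.
url = build_visibility_url(83.633, 22.0145, 60676.0, 60706.0)
```

Validating the extracted values before building the URL keeps a mistaken language-model extraction from reaching the server as a malformed request.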
The autonomous SAS Agent takes a natural language query and works through a sequence of steps to send API calls to RISA (Remote Interface to SAS Analysis) or ULS (Upper Limit Server) and retrieve valid scientific products.
Data privacy and cost concerns have led to the project being developed on relatively small local hardware, and a constant challenge has been finding state-of-the-art language models with a smaller footprint that still perform well.
Approaches to autonomous SAS code creation and execution are currently being explored, and the project is steadily being updated to take advantage of evolving industry standards.
| Affiliation of the submitter | Telespazio |
|---|---|
| Attendance | in-person |