Machine Learning-based Anomaly Detection for Astrometry

P6
11 Nov 2025, 12:00
15m
Synagoge

Görlitz
Oral presentation
Quality Assurance and Software Testing
Plenary Session 6

Speaker

Nelly Gaillard (ESA/ESAC)

Description

The data processing task of the Gaia mission is large and complex. One of its central elements is the Astrometric Global Iterative Solution (AGIS), which produces and delivers the core astrometry data products. A major challenge in the software producing Gaia’s astrometric solution is the creation of a calibration model accurate enough to capture subtle effects, which may have an impact on the quality of the solution at the micro-arcsecond level.
Among the AGIS-related data, a key product is the set of post-fit residuals: the differences between the observations and the predictions obtained using the AGIS source, attitude and calibration models.
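As a minimal illustration of this definition (not the AGIS implementation), a residual is the observed value minus the model prediction for the same observation. The function name and the numbers below are hypothetical and stand in for real observations and model predictions.

```python
def post_fit_residuals(observed, predicted):
    """Residual = observation minus model prediction, element by element."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


# Hypothetical values in arbitrary units, not real Gaia data.
observed = [10.02, 9.97, 10.10, 10.00]
predicted = [10.00, 10.00, 10.00, 10.00]
residuals = post_fit_residuals(observed, predicted)
```

A well-converged solution with an adequate calibration model would be expected to leave residuals that scatter around zero without structure.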
This work introduces a framework for the automated analysis of residuals and the detection of anomalies that may indicate either non-convergence of AGIS or problems in the calibration model. One method in the framework is statistical: it uses user-defined thresholds to identify deviations of the estimated distribution of anomalous observations from that of nominal points.
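The abstract does not specify which statistic the thresholds are applied to. One simple possibility, shown here purely as an assumption, is a robust z-score built from the median and the median absolute deviation (MAD), with a user-defined cut. The function name, the default threshold and the sample values are hypothetical.

```python
from statistics import median


def flag_outliers(residuals, threshold=5.0):
    """Return indices whose robust z-score exceeds a user-defined threshold."""
    centre = median(residuals)
    mad = median([abs(r - centre) for r in residuals])
    # 1.4826 scales the MAD so that it matches the standard deviation
    # for normally distributed data.
    scale = 1.4826 * mad
    if scale == 0:
        # Degenerate case: more than half of the points are identical,
        # so no robust scale can be estimated; flag nothing.
        return []
    return [i for i, r in enumerate(residuals)
            if abs(r - centre) / scale > threshold]


# Hypothetical residuals with one obvious outlier at index 6.
sample = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 4.0, -0.05]
flagged = flag_outliers(sample)
```

Using the median and MAD rather than the mean and standard deviation keeps the nominal distribution estimate from being pulled by the very anomalies it is meant to expose.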
Another method, based on machine learning (ML), interprets the residuals as time series and analyses the observations across key dimensions (such as magnitude, pixel value and star colour). Once anomalous segments have been identified, a clustering algorithm groups them into classes of similar anomalies.
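The two steps described above, segment detection followed by clustering, can be sketched as follows. The detection rule (fixed windows whose mean exceeds a threshold), the single segment feature (mean residual) and the use of k-means are all assumptions made for illustration; the abstract names neither the detector nor the clustering algorithm. The input is assumed to be already ordered along the chosen dimension.

```python
from statistics import mean


def anomalous_segments(values, window=5, threshold=1.0):
    """Return (start, end) ranges of windows whose mean exceeds the threshold
    in absolute value; adjacent flagged windows are merged into one segment."""
    segments = []
    for start in range(0, len(values) - window + 1, window):
        if abs(mean(values[start:start + window])) > threshold:
            if segments and segments[-1][1] == start:
                # Extend the previous segment instead of opening a new one.
                segments[-1] = (segments[-1][0], start + window)
            else:
                segments.append((start, start + window))
    return segments


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, initialised deterministically on the first k points."""
    centres = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = [mean(dim) for dim in zip(*members)]
    return labels


# Hypothetical residual series: two positive bumps and one negative dip.
series = ([0.0] * 10 + [2.0] * 5 + [0.0] * 10 + [-2.0] * 10
          + [0.0] * 5 + [2.1] * 5 + [0.0] * 5)
segments = anomalous_segments(series)
features = [(mean(series[s:e]),) for s, e in segments]
labels = kmeans(features, k=2)
```

In this toy series the two positive bumps end up in one cluster and the negative dip in the other; a realistic feature vector would describe each segment with more than its mean.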
Finally, a classifier is trained to distinguish between the identified anomaly classes. By analysing the classifier's feature importances with the SHAP library, we can reveal which features most influence the model's decisions for each anomaly class, offering insight into the underlying patterns.
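The SHAP library is built on Shapley values, which attribute a model output to individual features. For a model with only a few features these can be computed exactly by enumerating feature subsets, as in the sketch below. It does not use the SHAP library API, and the scoring function, which stands in for the per-class output of a trained classifier, is hypothetical, as are the three features it takes (mean residual, magnitude, colour). Features missing from a subset are replaced by a single baseline value, which is one of several possible conventions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley attributions of model(instance) relative to model(baseline)."""
    n = len(instance)

    def value(subset):
        # Features in `subset` take the instance value, the rest the baseline.
        mixed = [instance[j] if j in subset else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                phi[i] += weight * (with_i - without_i)
    return phi


def toy_score(x):
    """Hypothetical class score: one additive term and one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


# Hypothetical feature vector: mean residual, magnitude, colour.
instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, instance, baseline)
```

The attributions sum to the difference between the score of the instance and that of the baseline, and the interaction term is shared equally between the two features involved. Exact enumeration scales exponentially with the number of features, which is why approximations are needed in practice.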
Given the absence of ground truth and the unknown characteristics of each anomaly, the framework is evaluated by comparing the results of the two methods and by manually checking randomly selected anomaly examples from each detected class.

Affiliation of the submitter: ESA/ESAC
Attendance: in-person
