AI-Ready Multimodal Astronomical Datasets from LAMOST and Complementary Surveys

PO
Not scheduled
15m
Wichernhaus

Board: A230
poster presentation Automation of data pipeline and workflows Poster

Speaker

Xiaolan Hou

Description

The rapid expansion of astronomical surveys has created a pressing need for datasets in formats that machine learning applications can use directly. An AI-ready dataset has been constructed from the LAMOST Low Resolution Survey DR10, which covers nearly 11 years of observations (2011–2022) and contains more than 11 million spectra of stars, galaxies, and quasars. A uniform workflow has been applied: high-quality spectra are selected, only the highest signal-to-noise observation is retained for repeated targets, all spectra are resampled onto a common wavelength grid, and the result is packaged in standardized HDF5 format.
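The selection, de-duplication, and resampling steps can be illustrated with a minimal sketch. It is not the authors' pipeline: the record keys (`target_id`, `snr`, `wave`, `flux`), the `min_snr` quality cut, the wavelength grid, and the use of linear interpolation are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def select_best_per_target(records, min_snr=10.0):
    """Keep, for each target, the single spectrum with the highest S/N.

    `records` is a list of dicts with hypothetical keys
    'target_id', 'snr', 'wave', 'flux'. `min_snr` is an assumed quality cut.
    """
    best = {}
    for rec in records:
        if rec["snr"] < min_snr:
            continue  # quality cut: drop low-S/N spectra
        current = best.get(rec["target_id"])
        if current is None or rec["snr"] > current["snr"]:
            best[rec["target_id"]] = rec
    return best

def resample(wave, flux, grid):
    """Linearly interpolate a spectrum onto a common wavelength grid.

    Grid points outside the observed wavelength range are set to NaN.
    """
    return np.interp(grid, wave, flux, left=np.nan, right=np.nan)

# Assumed common grid in Angstrom; the real grid is not given in the abstract.
grid = np.linspace(4000.0, 8000.0, 5)

# Toy input: target "A" observed twice, target "B" below the quality cut.
wave = np.array([3900.0, 8100.0])
records = [
    {"target_id": "A", "snr": 25.0, "wave": wave, "flux": np.array([1.0, 3.0])},
    {"target_id": "A", "snr": 40.0, "wave": wave, "flux": np.array([2.0, 4.0])},
    {"target_id": "B", "snr": 5.0, "wave": wave, "flux": np.array([0.5, 0.5])},
]

best = select_best_per_target(records)
ids = sorted(best)
# One row per retained target, one column per grid point.
flux_matrix = np.vstack(
    [resample(best[i]["wave"], best[i]["flux"], grid) for i in ids]
)
```

A fixed-shape array like `flux_matrix` is what makes packaging straightforward: it could be written as a single HDF5 dataset with a library such as h5py, a step left out here to keep the sketch free of extra dependencies.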

To extend beyond spectroscopy, the dataset has been cross-matched with external surveys, adding ultraviolet and infrared photometry and optical imaging cutouts from the SDSS. The result is a multi-modal dataset that integrates spectra, photometry, and images, designed for direct use in machine learning workflows.
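A positional cross-match of this kind can be sketched as a nearest-neighbour search on the sky. The abstract does not state the matching method or radius, so the 3-arcsecond radius and the function names below are hypothetical, and the brute-force loop stands in for whatever spatial indexing a catalogue of this size would need in practice.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(primary, external, radius_arcsec=3.0):
    """Return, for each primary source, the index of the nearest external
    source within `radius_arcsec`, or None if there is no such source.

    Both catalogues are lists of (ra, dec) tuples in degrees.
    The default radius is an assumption, not a value from the abstract.
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for ra, dec in primary:
        best_idx, best_sep = None, radius_deg
        for j, (ra2, dec2) in enumerate(external):
            sep = angular_sep_deg(ra, dec, ra2, dec2)
            if sep <= best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx)
    return matches

# Toy catalogues: the first primary source has a counterpart about
# 1.3 arcsec away; the second has none within the radius.
primary = [(150.0, 2.0), (10.0, -5.0)]
external = [(150.0003, 2.0002), (200.0, 30.0)]
matches = cross_match(primary, external)
```

The brute-force search costs one separation calculation per pair of sources, which is fine for a demonstration but not for millions of spectra; a k-d tree or a HEALPix-based index would be the usual replacement.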

The dataset provides a benchmark for algorithm development as well as scientific applications. Demonstrated use cases include automated identification of M-type stars and the detection of binary star systems from spectroscopic observations.

This work is part of the broader initiative of the National Astronomical Data Center (NADC) to deliver standardized AI-ready data products from Chinese astronomical facilities. By offering both reusable methods and accessible data resources, it supports the astronomical community in advancing data-intensive and AI-driven research.

Affiliation of the submitter National Astronomical Observatories, Chinese Academy of Sciences; School of Physics and Astronomy, Beijing Normal University
Attendance in-person

Primary author

Co-authors

Chenzhou Cui (National Astronomical Observatories, Chinese Academy of Sciences) Yunfei Xu (National Astronomical Observatories, Chinese Academy of Sciences)