The GAIA Data Mining Platform provides interactive, JupyterHub-based access to the GAIA Data Release 3 dataset, which comprises 7 TB of data.
The GAIA Data Release 4 dataset is expected to exceed 600 TB.
We describe our progress in evolving the GAIA Data Mining Platform into a modern, Kubernetes-based, platform-independent deployment, named Astroflow, adding Dask functionality to...
The James Webb Space Telescope is producing a firehose of extragalactic imaging data through its diverse legacy programs. Community-organized initiatives, such as the Dawn JWST Archive, have emerged to bridge the gap between archive products and the uniformly reduced data that enable large-scale exploration and analysis. These programs are catalyzing further initiatives to generate value-added...
Modern astronomical surveys such as HST, JWST, Euclid, and LSST are generating petabyte-scale imaging archives across multiple wavelengths and epochs. Traditional image retrieval methods, which rely solely on metadata such as sky position, filter, or exposure time, are insufficient to identify objects with similar visual or physical characteristics. To enable efficient discovery in these...
With the new generation of large-scale surveys, we are faced with an avalanche of data that are no longer “images” but “cubes”, whose third dimension is either temporal or spectral. In this new era, the traditional hierarchical visualisation methods used by science platforms must evolve to exploit this third dimension.
Building on the Hierarchical Progressive Survey method – endorsed by the IVOA and...
Arrays of Cherenkov telescopes detect ultra-short (nanosecond) flashes of blue light produced when high-energy gamma rays hit Earth’s atmosphere, triggering particle cascades. The upcoming Cherenkov Telescope Array Observatory (CTAO) will generate hundreds of petabytes of data annually, requiring extensive atmospheric monitoring and rich metadata to reconstruct event lists, images, spectra,...
The volume of raw data produced by next-generation observatories, such as the Square Kilometre Array Observatory (SKAO), will be so large that it cannot be archived in its entirety and must instead be significantly reduced. This is well known in high-energy physics, particularly at the Large Hadron Collider (LHC), where the data streams captured by the detectors are reduced by several orders of magnitude...