---
datasets:
- MultimodalUniverse/legacysurvey
- MultimodalUniverse/hsc
- MultimodalUniverse/gaia
- MultimodalUniverse/sdss
- MultimodalUniverse/desi
license: mit
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
pipeline_tag: any-to-any
library_name: aion
---

# AION-1: Astronomical Omnimodal Network

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![GitHub Repo](https://img.shields.io/badge/GitHub-Repo-blue?logo=github)](https://github.com/PolymathicAI/AION)
[![Paper](https://img.shields.io/badge/Paper-2510.17960-b31b1b.svg)](https://huggingface.co/papers/2510.17960)
[![arXiv](https://img.shields.io/badge/arXiv-2510.17960-b31b1b.svg)](https://arxiv.org/abs/2510.17960)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb)

**AION-base** is a 300M-parameter omnimodal model designed for astronomical surveys, presented in the paper [AION-1: Omnimodal Foundation Model for Astronomical Sciences](https://huggingface.co/papers/2510.17960). It integrates 39 distinct astronomical data types and can be adapted to a wide range of astronomical tasks through multimodal masked modeling.

Project Homepage: https://polymathic-ai.org/

## Model Details

- **Architecture**: Encoder-Decoder Transformer (12 blocks each, 768 dim, 12 heads)
- **Parameters**: 300M
- **Training**: Multimodal Masked Modeling (4M) on astronomical survey data
- **Modalities**: 39 data types including imaging, spectra, catalogs, and photometry

## Installation

Assuming you have PyTorch installed, you can install AION with:

```bash
pip install polymathic-aion
```

For advanced installation options, including specific PyTorch versions or developer installations, refer to the [GitHub repository](https://github.com/PolymathicAI/AION).
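
After installing, it can be useful to confirm that the package is visible to your interpreter and to choose a device at runtime rather than hard-coding one. The sketch below relies only on the standard library plus an optional PyTorch import. `check_install` and `pick_device` are hypothetical helper names introduced here for illustration; they are not part of the AION API.

```python
from importlib import metadata, util


def check_install(dist_name: str = "polymathic-aion", module_name: str = "aion") -> dict:
    """Report whether the pip distribution and its import module are visible.

    Hypothetical helper: it only inspects the environment and never imports AION.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    return {
        "distribution": dist_name,
        "version": version,
        "importable": util.find_spec(module_name) is not None,
    }


def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    if util.find_spec("torch") is None:
        return "cpu"  # PyTorch missing, so fall back to the CPU label
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(check_install())
    print("device:", pick_device())
```

The string returned by `pick_device()` can be passed anywhere a device is expected, for example to `.to(...)` on the model, so the same script runs on machines with and without a GPU.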
## Usage

After installation, you can load the pretrained model and start analyzing astronomical data.

```python
import torch
from aion import AION
from aion.codecs import CodecManager
from aion.modalities import LegacySurveyImage, Z

# Load model and codec manager
model = AION.from_pretrained('polymathic-ai/aion-base').to('cuda')  # or 'aion-large', 'aion-xlarge'
codec_manager = CodecManager(device='cuda')

# Example: Prepare your astronomical data (e.g., a dummy Legacy Survey image)
# In a real scenario, 'your_image_tensor' would come from your dataset.
your_image_tensor = torch.randn(1, 4, 96, 96)  # Example: batch_size=1, 4 bands, 96x96 resolution
image = LegacySurveyImage(
    flux=your_image_tensor,
    bands=['DES-G', 'DES-R', 'DES-I', 'DES-Z']
)

# Encode data to tokens
tokens = codec_manager.encode(image)

# Option 1: Extract embeddings for downstream tasks
embeddings = model.encode(tokens, num_encoder_tokens=600)
print(f"Extracted embeddings shape: {embeddings.shape}")

# Option 2: Generate predictions (e.g., redshift)
# For this example, we predict redshift (Z) from the image.
# The target_modality argument tells the model which modality to generate.
preds = model(
    codec_manager.encode(image),
    target_modality=Z,
)
print(f"Predicted redshift logits shape: {preds['tok_z'].shape}")
```

### Supported Data Types

AION-base processes data from major astronomical surveys.
Here's an overview of the supported categories:

| **Category**            | **Description**                         | **Token Name(s)**        |
|:------------------------|:----------------------------------------|:-------------------------|
| **Imaging (2)**         | Legacy Survey, HSC Wide                 | `tok_image_ls`, `tok_image_hsc` |
| **Catalog (1)**         | Legacy Survey catalog entries           | `catalog` |
| **Spectra (2)**         | SDSS, DESI                              | `tok_spectrum_sdss`, `tok_spectrum_desi` |
| **Gaia (4)**            | BP/RP spectra, parallax, sky coords     | `tok_xp_bp`, `tok_xp_rp`, `tok_parallax`, `tok_ra`, `tok_dec` |
| **Gaia Photometry (3)** | G/BP/RP flux                            | `tok_flux_g_gaia`, `tok_flux_bp_gaia`, `tok_flux_rp_gaia` |
| **Legacy Survey (9)**   | g,r,i,z bands & WISE W1–W4 flux, E(B–V) | `tok_flux_g`,…,`tok_flux_w4`, `tok_ebv` |
| **Legacy Shape (3)**    | Ellipticity components & effective radius | `tok_shape_e1`, `tok_shape_e2`, `tok_shape_r` |
| **HSC Photometry (5)**  | g,r,i,z,y magnitudes                    | `tok_mag_g`,…,`tok_mag_y` |
| **HSC Extinction (5)**  | g,r,i,z,y extinctions                   | `tok_a_g`,…,`tok_a_y` |
| **HSC Shape (3)**       | Shape components 11,22,12               | `tok_shape11`, `tok_shape22`, `tok_shape12` |
| **Other (1)**           | Spectroscopic redshift                  | `tok_z` |

More details and interactive examples are available in the [Colab Tutorial](https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb).

## Resources

- GitHub Repository: https://github.com/PolymathicAI/AION
- Interactive Tutorial: https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb

## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/PolymathicAI/AION/blob/main/LICENSE) file in the GitHub repository for full details.

---

Built with ❤️ for the astronomical community by https://polymathic-ai.org/