---
datasets:
- MultimodalUniverse/legacysurvey
- MultimodalUniverse/hsc
- MultimodalUniverse/gaia
- MultimodalUniverse/sdss
- MultimodalUniverse/desi
license: mit
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
pipeline_tag: any-to-any
library_name: aion
---

# AION-1: Astronomical Omnimodal Network

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![GitHub Repo](https://img.shields.io/badge/GitHub-Repo-blue?logo=github)](https://github.com/PolymathicAI/AION)
[![Paper](https://img.shields.io/badge/Paper-2510.17960-b31b1b.svg)](https://huggingface.co/papers/2510.17960)
[![arXiv](https://img.shields.io/badge/arXiv-2510.17960-b31b1b.svg)](https://arxiv.org/abs/2510.17960)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb)

**AION-base** is a 300M-parameter omnimodal model designed for astronomical surveys, presented in the paper [AION-1: Omnimodal Foundation Model for Astronomical Sciences](https://huggingface.co/papers/2510.17960). It integrates 39 distinct astronomical data types and can be adapted to a wide range of astronomical tasks through multimodal masked modeling.

Project Homepage: https://polymathic-ai.org/

## Model Details

-   **Architecture**: Encoder-Decoder Transformer (12 blocks each, 768 dim, 12 heads)
-   **Parameters**: 300M
-   **Training**: Multimodal Masked Modeling (4M) on astronomical survey data
-   **Modalities**: 39 data types including imaging, spectra, catalogs, and photometry
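
As a rough illustration of the masked-modeling objective named above, the sketch below splits a set of tokens from several modalities into a visible subset (encoder input) and a masked subset (prediction targets). It is a minimal, framework-free sketch and not AION's training code: the function name `split_visible_and_masked`, the shared token budget, and the token ids are all assumptions made for illustration.

```python
import random


def split_visible_and_masked(tokens_by_modality, num_visible, seed=0):
    """Randomly pick which tokens stay visible; the rest become targets.

    Hypothetical helper for illustration only, not part of the aion package.
    """
    rng = random.Random(seed)
    # Flatten to (modality, position, token_id) triples so the visible
    # budget is shared across all modalities.
    flat = [
        (name, pos, tok)
        for name, toks in tokens_by_modality.items()
        for pos, tok in enumerate(toks)
    ]
    rng.shuffle(flat)
    return flat[:num_visible], flat[num_visible:]


# Made-up token ids keyed by token names from this model card.
example = {
    "tok_image_ls": [12, 7, 93, 41],
    "tok_z": [5],
}
visible, masked = split_visible_and_masked(example, num_visible=3)
print(len(visible), len(masked))  # 3 2
```

During training, a model of this kind would encode the visible triples and be asked to predict the token ids of the masked ones.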

## Installation

With PyTorch already installed, install AION with pip:
```bash
pip install polymathic-aion
```
For advanced installation options, including specific PyTorch versions or developer installations, refer to the [GitHub repository](https://github.com/PolymathicAI/AION).
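
To confirm that the package landed in the active environment, a quick check with the standard library can help. This is only a sketch: `installed_version` is a hypothetical helper, and it assumes the distribution name matches the `pip install` name above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("polymathic-aion")
print(version if version is not None else "polymathic-aion is not installed")
```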

## Usage

After installation, load the pretrained model, encode your data to tokens, and then either extract embeddings or generate predictions for a target modality:

```python
import torch
from aion import AION
from aion.codecs import CodecManager
from aion.modalities import LegacySurveyImage, Z

# Load model and codec manager
model = AION.from_pretrained('polymathic-ai/aion-base').to('cuda')  # or 'aion-large', 'aion-xlarge'
codec_manager = CodecManager(device='cuda')

# Example: Prepare your astronomical data (e.g., a dummy Legacy Survey image)
# In a real scenario, 'your_image_tensor' would come from your dataset.
your_image_tensor = torch.randn(1, 4, 96, 96) # Example: batch_size=1, 4 bands, 96x96 resolution
image = LegacySurveyImage(
    flux=your_image_tensor,
    bands=['DES-G', 'DES-R', 'DES-I', 'DES-Z']
)

# Encode data to tokens
tokens = codec_manager.encode(image)

# Option 1: Extract embeddings for downstream tasks
embeddings = model.encode(tokens, num_encoder_tokens=600)
print(f"Extracted embeddings shape: {embeddings.shape}")

# Option 2: Generate predictions (e.g., redshift)
# For this example, we predict redshift (Z) from the image.
# The target_modality argument tells the model which modality to generate.
preds = model(
    codec_manager.encode(image),
    target_modality=Z,
)
print(f"Predicted redshift logits shape: {preds['tok_z'].shape}")
```
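
For downstream tasks, a common next step is to collapse the per-token embeddings into one vector per object, for example by averaging over the token axis. The sketch below shows that pooling on plain nested lists so it runs without a GPU or the model weights. It assumes the embeddings are laid out as `(batch, tokens, dim)`, which this card does not state explicitly, and `mean_pool` is a hypothetical helper.

```python
def mean_pool(embeddings):
    """Average over the token axis: (batch, tokens, dim) -> (batch, dim).

    Assumes a (batch, tokens, dim) layout; hypothetical helper for illustration.
    """
    pooled = []
    for sample in embeddings:
        num_tokens = len(sample)
        dim = len(sample[0])
        pooled.append(
            [sum(token[d] for token in sample) / num_tokens for d in range(dim)]
        )
    return pooled


# Toy input: 1 object, 3 tokens, 2 embedding dimensions.
toy_embeddings = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
pooled = mean_pool(toy_embeddings)
print(pooled)  # [[3.0, 4.0]]
```

On a PyTorch tensor with that layout, `embeddings.mean(dim=1)` performs the same reduction; the pooled vectors can then feed a small regressor or classifier.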

### Supported Data Types
AION-base processes data from major astronomical surveys. The table below summarizes the supported categories:

| **Category**            | **Description**                         | **Token Name(s)**        |
|:------------------------|:----------------------------------------|:-------------------------|
| **Imaging (2)**         | Legacy Survey, HSC Wide                 | `tok_image_ls`, `tok_image_hsc` |
| **Catalog (1)**         | Legacy Survey catalog entries           | `catalog`                |
| **Spectra (2)**         | SDSS, DESI                              | `tok_spectrum_sdss`, `tok_spectrum_desi` |
| **Gaia (4)**            | BP/RP spectra, parallax, sky coords     | `tok_xp_bp`, `tok_xp_rp`, `tok_parallax`, `tok_ra`, `tok_dec` |
| **Gaia Photometry (3)** | G/BP/RP flux                            | `tok_flux_g_gaia`, `tok_flux_bp_gaia`, `tok_flux_rp_gaia` |
| **Legacy Survey (9)**   | g,r,i,z bands & WISE W1–W4 flux, E(B–V) | `tok_flux_g`,…,`tok_flux_w4`, `tok_ebv` |
| **Legacy Shape (3)**    | Ellipticity components & effective radius | `tok_shape_e1`, `tok_shape_e2`, `tok_shape_r` |
| **HSC Photometry (5)**  | g,r,i,z,y magnitudes                    | `tok_mag_g`,…,`tok_mag_y` |
| **HSC Extinction (5)**  | g,r,i,z,y extinctions                   | `tok_a_g`,…,`tok_a_y`    |
| **HSC Shape (3)**       | Shape components 11,22,12               | `tok_shape11`, `tok_shape22`, `tok_shape12` |
| **Other (1)**           | Spectroscopic redshift                  | `tok_z`                  |

More details and interactive examples are available in the [Colab Tutorial](https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb).

## Resources

-   GitHub Repository: https://github.com/PolymathicAI/AION
-   Interactive Tutorial: https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb

## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/PolymathicAI/AION/blob/main/LICENSE) file in the GitHub repository for full details.

---
Built with ❤️ for the astronomical community by https://polymathic-ai.org/