datasets:
  - MultimodalUniverse/legacysurvey
  - MultimodalUniverse/hsc
  - MultimodalUniverse/gaia
  - MultimodalUniverse/sdss
  - MultimodalUniverse/desi
license: mit
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
pipeline_tag: any-to-any
library_name: aion

AION-1: Astronomical Omnimodal Network

AION-base is a 300M-parameter omnimodal model designed for astronomical surveys, presented in the paper AION-1: Omnimodal Foundation Model for Astronomical Sciences. Trained with multimodal masked modeling, it integrates 39 distinct astronomical data types and can be adapted to a wide range of astronomical tasks.

Project Homepage: https://polymathic-ai.org/

Model Details

Architecture: Encoder-Decoder Transformer (12 blocks each, 768 dim, 12 heads)
Parameters: 300M
Training: Multimodal masked modeling, following the 4M (Massively Multimodal Masked Modeling) scheme, on astronomical survey data
Modalities: 39 data types including imaging, spectra, catalogs, and photometry
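To give a feel for the architecture figures listed here, the sketch below makes a back-of-the-envelope estimate of the weights in the transformer blocks alone. It rests on assumptions that this README does not state: a standard block layout, a 4x MLP expansion, and one cross-attention layer per decoder block. The function names are hypothetical. Embeddings, output heads, biases, and normalization layers are not counted, so the result is not expected to match the stated 300M.

```python
def attention_params(d_model):
    # Q, K, V and output projections: four d_model x d_model matrices.
    # The head count does not change this total.
    return 4 * d_model * d_model


def mlp_params(d_model, expansion=4):
    # Two linear layers: d_model -> expansion * d_model -> d_model.
    # The 4x expansion is an assumption, not a documented value.
    return 2 * expansion * d_model * d_model


def estimate_block_params(d_model=768, n_blocks=12, expansion=4):
    """Rough weight count for encoder and decoder blocks (biases ignored)."""
    encoder_block = attention_params(d_model) + mlp_params(d_model, expansion)
    # Assumed decoder block: self-attention + cross-attention + MLP.
    decoder_block = 2 * attention_params(d_model) + mlp_params(d_model, expansion)
    return {
        "encoder": n_blocks * encoder_block,
        "decoder": n_blocks * decoder_block,
    }


est = estimate_block_params()
total = est["encoder"] + est["decoder"]
print(f"encoder blocks: {est['encoder'] / 1e6:.1f}M")  # 84.9M
print(f"decoder blocks: {est['decoder'] / 1e6:.1f}M")  # 113.2M
print(f"blocks total:   {total / 1e6:.1f}M")           # 198.2M
```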

Installation

With PyTorch already installed, you can install AION with:

pip install polymathic-aion

For advanced installation options, including specific PyTorch versions or developer installations, refer to the GitHub repository.
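To confirm the install succeeded before downloading any weights, a small check like the one below may help. It is a sketch, not part of the AION package: `check_aion_install` is a hypothetical helper. It assumes the distribution name `polymathic-aion` from the pip command above and the import name `aion` used in this README's usage example.

```python
import importlib.util
from importlib import metadata


def check_aion_install(dist_name="polymathic-aion", module_name="aion"):
    """Report whether the distribution and its import module are visible.

    Hypothetical helper: it uses only the standard library and never
    imports the package itself, so it is safe in a fresh environment.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    importable = importlib.util.find_spec(module_name) is not None
    return {"version": version, "importable": importable}


status = check_aion_install()
if status["importable"]:
    print(f"aion is importable (distribution version: {status['version']})")
else:
    print("aion not found; try `pip install polymathic-aion`")
```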

Usage

After installation, you can load the pretrained model and start analyzing astronomical data.

import torch
from aion import AION
from aion.codecs import CodecManager
from aion.modalities import LegacySurveyImage, Z

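# Load model and codec manager (fall back to CPU when no GPU is available)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = AION.from_pretrained('polymathic-ai/aion-base').to(device)  # or 'aion-large', 'aion-xlarge'
codec_manager = CodecManager(device=device)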

# Example: Prepare your astronomical data (e.g., a dummy Legacy Survey image)
# In a real scenario, 'your_image_tensor' would come from your dataset.
your_image_tensor = torch.randn(1, 4, 96, 96) # Example: batch_size=1, 4 bands, 96x96 resolution
image = LegacySurveyImage(
    flux=your_image_tensor,
    bands=['DES-G', 'DES-R', 'DES-I', 'DES-Z']
)

# Encode data to tokens
tokens = codec_manager.encode(image)

# Option 1: Extract embeddings for downstream tasks
embeddings = model.encode(tokens, num_encoder_tokens=600)
print(f"Extracted embeddings shape: {embeddings.shape}")

# Option 2: Generate predictions (e.g., redshift)
# Here we predict redshift (Z) from the image.
# The target_modality argument tells the model which modality to generate.
preds = model(
    codec_manager.encode(image),
    target_modality=Z,
)
print(f"Predicted redshift logits shape: {preds['tok_z'].shape}")
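One common way to use the extracted embeddings downstream is a linear probe: pool the token embeddings into one vector per object and fit a single linear layer on top. The sketch below illustrates the idea with a random stand-in tensor so that it runs without the model. The embedding shape `(batch, num_tokens, 768)`, the token count, and the synthetic targets are assumptions for illustration, not documented behavior of `model.encode`.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for the `embeddings` returned by model.encode(...).
# Assumed shape: (batch, num_tokens, embed_dim); the token count is arbitrary here.
embeddings = torch.randn(32, 64, 768)
# Synthetic scalar targets, purely illustrative (e.g. one redshift per object).
targets = torch.rand(32, 1)

# Mean-pool over the token axis to get one vector per object.
pooled = embeddings.mean(dim=1)

# Linear probe: a single linear layer trained on the fixed embeddings.
probe = nn.Linear(pooled.shape[-1], 1)
optimizer = torch.optim.SGD(probe.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(probe(pooled), targets).item()
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(probe(pooled), targets)
    loss.backward()
    optimizer.step()
final_loss = loss_fn(probe(pooled), targets).item()

print(f"pooled shape: {tuple(pooled.shape)}")
print(f"probe loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

With real data, the embeddings would typically be computed under `torch.no_grad()` so that only the probe is trained, and performance would be measured on held-out objects rather than on the training batch.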

Supported Data Types

AION-base processes data from major astronomical surveys. The supported categories are summarized below, with the number of modalities in each category given in parentheses:

Category	Description	Token Name(s)
Imaging (2)	Legacy Survey, HSC Wide	`tok_image_ls`, `tok_image_hsc`
Catalog (1)	Legacy Survey catalog entries	`catalog`
Spectra (2)	SDSS, DESI	`tok_spectrum_sdss`, `tok_spectrum_desi`
Gaia (5)	BP/RP spectra, parallax, sky coords	`tok_xp_bp`, `tok_xp_rp`, `tok_parallax`, `tok_ra`, `tok_dec`
Gaia Photometry (3)	G/BP/RP flux	`tok_flux_g_gaia`, `tok_flux_bp_gaia`, `tok_flux_rp_gaia`
Legacy Survey (9)	g,r,i,z bands & WISE W1–W4 flux, E(B–V)	`tok_flux_g`,…,`tok_flux_w4`, `tok_ebv`
Legacy Shape (3)	Ellipticity components & effective radius	`tok_shape_e1`, `tok_shape_e2`, `tok_shape_r`
HSC Photometry (5)	g,r,i,z,y magnitudes	`tok_mag_g`,…,`tok_mag_y`
HSC Extinction (5)	g,r,i,z,y extinctions	`tok_a_g`,…,`tok_a_y`
HSC Shape (3)	Shape components 11,22,12	`tok_shape11`, `tok_shape22`, `tok_shape12`
Other (1)	Spectroscopic redshift	`tok_z`

More details and interactive examples are available in the Colab Tutorial.

Resources

GitHub Repository: https://github.com/PolymathicAI/AION
Interactive Tutorial: https://colab.research.google.com/github/PolymathicAI/AION/blob/main/notebooks/Tutorial.ipynb

License

This project is licensed under the MIT License. See the LICENSE file in the GitHub repository for full details.

Built with ❤️ for the astronomical community by https://polymathic-ai.org/