---
license: apache-2.0
tags:
  - materials
  - multimodal-projector
---
# ALM · structure-to-language projector

This projector maps frozen **OrbV3** machine-learning interatomic potential (MLIP)
features into the Qwen3-8B token-embedding space. It is a small MLP
(`Linear(256→4096) → GELU → Linear(4096→4096)`, ~21M params) whose outputs are
spliced into the input sequence as **soft tokens** at the `<atoms>` position. The
encoder produces one feature vector per atom, and the projector emits one soft
token per atom. The projector is trained in **ALM Core** and kept frozen in the
generation models.

**Input/output:** 256-d per-atom OrbV3 (`orb_v3_direct_20_omat`) features in, 4096-d soft tokens out.
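
The shapes described above can be sketched in PyTorch. This is a minimal illustration, not the released implementation: the names `Projector` and `splice_soft_tokens` are hypothetical, the random tensors stand in for real OrbV3 features and Qwen3-8B prompt embeddings, and it assumes the `<atoms>` placeholder occupies a single position that the soft tokens replace. See the linked GitHub repository for the actual code.

```python
import torch
import torch.nn as nn

FEATURE_DIM = 256   # OrbV3 per-atom feature width (from the card)
HIDDEN_DIM = 4096   # soft-token width (from the card)


class Projector(nn.Module):
    """Hypothetical sketch: Linear(256->4096) -> GELU -> Linear(4096->4096)."""

    def __init__(self, in_dim=FEATURE_DIM, out_dim=HIDDEN_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.GELU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, atom_features):
        # atom_features: (n_atoms, 256) -> soft tokens: (n_atoms, 4096)
        return self.net(atom_features)


def splice_soft_tokens(text_embeds, soft_tokens, atoms_index):
    """Swap the <atoms> placeholder embedding for one soft token per atom.

    Assumption: the placeholder is a single position and is replaced,
    not kept alongside the soft tokens.
    text_embeds: (seq_len, hidden), soft_tokens: (n_atoms, hidden).
    """
    before = text_embeds[:atoms_index]
    after = text_embeds[atoms_index + 1:]
    return torch.cat([before, soft_tokens, after], dim=0)


projector = Projector().eval()
n_params = sum(p.numel() for p in projector.parameters())

atom_features = torch.randn(5, FEATURE_DIM)  # stand-in for features of 5 atoms
text_embeds = torch.randn(12, HIDDEN_DIM)    # stand-in for 12 prompt-token embeddings
with torch.no_grad():
    soft_tokens = projector(atom_features)
spliced = splice_soft_tokens(text_embeds, soft_tokens, atoms_index=3)
print(soft_tokens.shape, spliced.shape, n_params)
```

With 5 atoms and a 12-token prompt, the spliced sequence has 16 positions: the placeholder is removed and 5 soft tokens take its place. As written, with bias terms, this two-layer sketch has 17,833,984 parameters, so the ~21M figure above may reflect parameters in the released projector that the two-layer description does not capture.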

## Links
Paper: [arXiv](https://arxiv.org/abs/2606.21395) · [HuggingFace](https://huggingface.co/papers/2606.21395) · Code: [GitHub](https://github.com/learningmatter-mit/alm)

## License
Apache-2.0.

## Citation
```bibtex
@article{edamadaka2026atomistic,
  title   = {Atomistic Language Models Understand and Generate Materials},
  author  = {Edamadaka, Sathya and Ramesh, Krithik and Li, Ju and G\'omez-Bombarelli, Rafael},
  journal = {arXiv preprint arXiv:2606.21395},
  year    = {2026}
}
```