---
license: apache-2.0
tags:
- materials
- multimodal-projector
---

# ALM · structure-to-language projector

The projector that maps frozen **OrbV3** machine-learning-interatomic-potential features into the Qwen3-8B token space: a small MLP (`Linear(256→4096) → GELU → Linear(4096→4096)`, ~21M params) whose outputs are spliced into the input sequence as **soft tokens** at the `` position. The encoder produces one feature vector per atom; the projector emits one soft token per atom. Frozen in the generation models; trained in **ALM Core**.

**Inputs:** OrbV3 (`orb_v3_direct_20_omat`) 256-d per-atom features → 4096-d soft tokens.

## Links

Paper: [arXiv](https://arxiv.org/abs/2606.21395) · [HuggingFace](https://huggingface.co/papers/2606.21395) · Code: [GitHub](https://github.com/learningmatter-mit/alm)

## License

Apache-2.0.

## Citation

```bibtex
@article{edamadaka2026atomistic,
  title   = {Atomistic Language Models Understand and Generate Materials},
  author  = {Edamadaka, Sathya and Ramesh, Krithik and Li, Ju and G\'omez-Bombarelli, Rafael},
  journal = {arXiv preprint arXiv:2606.21395},
  year    = {2026}
}
```
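
## Usage sketch

To make the data flow described in this card concrete, here is a minimal PyTorch sketch of the stated MLP shape (`Linear(256→4096) → GELU → Linear(4096→4096)`) and of splicing one soft token per atom into a sequence of token embeddings. It is an illustration under assumptions, not the repository's code: the names `StructureProjector` and `splice_soft_tokens` are hypothetical, random tensors stand in for OrbV3 features and Qwen3-8B token embeddings, and the sketch assumes the placeholder occupies a single position that the soft tokens replace.

```python
import torch
from torch import nn


class StructureProjector(nn.Module):
    """Hypothetical name: maps per-atom encoder features to LLM-width soft tokens."""

    def __init__(self, in_dim: int = 256, hidden_dim: int = 4096, out_dim: int = 4096):
        super().__init__()
        # Shape taken from the card: Linear(256->4096) -> GELU -> Linear(4096->4096).
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, atom_features: torch.Tensor) -> torch.Tensor:
        # (n_atoms, in_dim) -> (n_atoms, out_dim): one soft token per atom.
        return self.net(atom_features)


def splice_soft_tokens(
    token_embeds: torch.Tensor, soft_tokens: torch.Tensor, position: int
) -> torch.Tensor:
    """Replace the single placeholder embedding at `position` with the soft tokens.

    Assumption: the placeholder is exactly one token wide and is dropped.
    """
    return torch.cat(
        [token_embeds[:position], soft_tokens, token_embeds[position + 1:]], dim=0
    )


torch.manual_seed(0)
projector = StructureProjector()

atom_features = torch.randn(5, 256)    # stand-in for OrbV3 features of a 5-atom structure
token_embeds = torch.randn(10, 4096)   # stand-in for embedded prompt tokens

with torch.no_grad():
    soft_tokens = projector(atom_features)
    inputs_embeds = splice_soft_tokens(token_embeds, soft_tokens, position=3)

n_params = sum(p.numel() for p in projector.parameters())
print(soft_tokens.shape, inputs_embeds.shape, n_params)
```

With a 5-atom structure and a 10-token prompt, the spliced sequence has 14 positions: the placeholder is removed and five soft tokens take its place. As written, this sketch has about 17.8M parameters, so the ~21M figure quoted in the card may reflect details not captured in the one-line architecture summary.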