metadata
license: apache-2.0
tags:
  - materials
  - multimodal-projector

ALM · structure-to-language projector

This projector maps frozen OrbV3 machine-learning interatomic potential (MLIP) features into the Qwen3-8B token embedding space. It is a small MLP (Linear(256→4096) → GELU → Linear(4096→4096), ~21M params) whose outputs are spliced into the input sequence as soft tokens at the <atoms> position. The encoder produces one feature vector per atom, and the projector emits one soft token per atom. The projector is trained in ALM Core and kept frozen in the generation models.

Inputs → outputs: OrbV3 (orb_v3_direct_20_omat) 256-d per-atom features → 4096-d soft tokens, one per atom.
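
The data flow described above can be sketched in PyTorch. This is a minimal illustration, not the released implementation: the names `AtomProjector` and `splice_soft_tokens` are hypothetical, the random tensors stand in for real OrbV3 features and Qwen3-8B prompt embeddings, and the sketch assumes the soft tokens replace a single `<atoms>` placeholder embedding (the card only says they are spliced in at that position).

```python
import torch
from torch import nn

FEATURE_DIM = 256   # OrbV3 per-atom feature size, as stated in this card
HIDDEN_DIM = 4096   # soft-token width, as stated in this card


class AtomProjector(nn.Module):
    """Hypothetical re-creation of the MLP: Linear -> GELU -> Linear."""

    def __init__(self, in_dim=FEATURE_DIM, out_dim=HIDDEN_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.GELU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, atom_feats):
        # (n_atoms, 256) -> (n_atoms, 4096): one soft token per atom
        return self.net(atom_feats)


def splice_soft_tokens(text_embeds, soft_tokens, atoms_pos):
    """Assumption: the placeholder embedding at atoms_pos is replaced
    by the per-atom soft tokens, in atom order."""
    return torch.cat(
        [text_embeds[:atoms_pos], soft_tokens, text_embeds[atoms_pos + 1:]],
        dim=0,
    )


torch.manual_seed(0)
projector = AtomProjector()
atom_feats = torch.randn(5, FEATURE_DIM)    # stand-in for a 5-atom structure
text_embeds = torch.randn(12, HIDDEN_DIM)   # stand-in for 12 embedded prompt tokens

with torch.no_grad():
    soft_tokens = projector(atom_feats)
    spliced = splice_soft_tokens(text_embeds, soft_tokens, atoms_pos=3)

n_params = sum(p.numel() for p in projector.parameters())
print(soft_tokens.shape, spliced.shape, n_params)
```

With a 5-atom structure and a 12-token prompt, the spliced sequence has 16 positions: the placeholder is dropped and five soft tokens take its place. The two-layer sketch with biases has 17,833,984 parameters, somewhat below the ~21M quoted above, so the released module may contain parameters that this sketch does not capture.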

Links

Paper: arXiv · HuggingFace · Code: GitHub

License

Apache-2.0.

Citation

@article{edamadaka2026atomistic,
  title   = {Atomistic Language Models Understand and Generate Materials},
  author  = {Edamadaka, Sathya and Ramesh, Krithik and Li, Ju and G\'omez-Bombarelli, Rafael},
  journal = {arXiv preprint arXiv:2606.21395},
  year    = {2026}
}