sathyae committed · Commit 4dbee48 · verified · 1 Parent(s): 5619883

Upload folder using huggingface_hub

stage1-projector/README.md ADDED
@@ -0,0 +1,32 @@
+ ---
+ license: apache-2.0
+ tags:
+ - materials
+ - multimodal-projector
+ ---
+ # ALM · structure-to-language projector
+
+ The projector that maps frozen **OrbV3** machine-learning-interatomic-potential
+ features into the Qwen3-8B token space: a small MLP (`Linear(256→4096) → GELU →
+ Linear(4096→4096)`, ~21M params) whose outputs are spliced into the input sequence
+ as **soft tokens** at the `<atoms>` position. The encoder produces one feature
+ vector per atom; the projector emits one soft token per atom. Frozen in the
+ generation models; trained in **ALM Core**.
+
+ **Inputs:** OrbV3 (`orb_v3_direct_20_omat`) 256-d per-atom features → 4096-d soft tokens.
+
+ ## Links
+ Paper: [arXiv](https://arxiv.org/abs/2606.21395) · [HuggingFace](https://huggingface.co/papers/2606.21395) · Code: [GitHub](https://github.com/learningmatter-mit/alm)
+
+ ## License
+ Apache-2.0.
+
+ ## Citation
+ ```bibtex
+ @article{edamadaka2026atomistic,
+ title = {Atomistic Language Models Understand and Generate Materials},
+ author = {Edamadaka, Sathya and Ramesh, Krithik and Li, Ju and G\'omez-Bombarelli, Rafael},
+ journal = {arXiv preprint arXiv:2606.21395},
+ year = {2026}
+ }
+ ```
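
The README describes the projector outputs as soft tokens spliced into the input sequence at the `<atoms>` position, one per atom. The sketch below illustrates that idea with plain Python lists standing in for tensors. It is not taken from the ALM code: the function name `splice_soft_tokens`, the placeholder id `99`, the toy hidden size of 4 (instead of 4096), and the choice to replace the placeholder embedding rather than keep it are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def splice_soft_tokens(
    token_ids: Sequence[int],
    token_embeddings: Sequence[Vector],
    soft_tokens: Sequence[Vector],
    atoms_token_id: int,
) -> List[Vector]:
    """Replace the single <atoms> placeholder embedding with per-atom soft tokens.

    Hypothetical helper: assumes exactly one placeholder per sequence and that
    the placeholder's own embedding is dropped.
    """
    if len(token_ids) != len(token_embeddings):
        raise ValueError("token_ids and token_embeddings must have the same length")
    positions = [i for i, t in enumerate(token_ids) if t == atoms_token_id]
    if len(positions) != 1:
        raise ValueError(f"expected exactly one <atoms> placeholder, found {len(positions)}")
    pos = positions[0]
    # Text before the placeholder, then one soft token per atom, then the rest.
    return (
        list(token_embeddings[:pos])
        + list(soft_tokens)
        + list(token_embeddings[pos + 1:])
    )


# Toy data: 4 prompt tokens (one of them the placeholder) and 3 atoms.
ATOMS_ID = 99  # hypothetical id for the <atoms> token
ids = [11, 12, ATOMS_ID, 13]
text_emb = [[float(t)] * 4 for t in ids]         # stand-in for LLM input embeddings
soft = [[0.1 * k] * 4 for k in range(1, 4)]      # stand-in for projector outputs
spliced = splice_soft_tokens(ids, text_emb, soft, ATOMS_ID)
print(len(spliced))  # 3 text embeddings + 3 soft tokens
```

Under these assumptions the sequence length grows by the number of atoms minus one, so any attention mask or position ids built from the original token ids would need to be extended to match.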
stage1-projector/projector.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:487fc12619fc4d4597af5b7645930f0857ae1b16e95d5f1ec2a172dcd32f5267
+ size 71338481
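
The three lines above are a Git LFS pointer, not the weights themselves. As a rough sanity check, the sketch below parses the pointer and compares its `size` with the parameter count implied by the layer shapes in the README. It assumes both `Linear` layers carry biases and that the checkpoint stores float32 values; neither is confirmed by this commit. The helper names (`parse_lfs_pointer`, `linear_params`, `matches_pointer`) are illustrative.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Parameter count of a dense layer (weights plus optional bias)."""
    return n_in * n_out + (n_out if bias else 0)


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a downloaded file against the pointer's size and SHA-256 digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:487fc12619fc4d4597af5b7645930f0857ae1b16e95d5f1ec2a172dcd32f5267
size 71338481
"""

pointer = parse_lfs_pointer(POINTER)
# Linear(256 -> 4096) followed by Linear(4096 -> 4096), biases assumed.
n_params = linear_params(256, 4096) + linear_params(4096, 4096)
float32_bytes = 4 * n_params                  # assumed float32 storage
overhead = pointer["size"] - float32_bytes    # bytes not explained by raw weights
print(n_params, float32_bytes, overhead)      # 17833984 71335936 2545
```

Under these assumptions the two layers hold 17,833,984 parameters, and the pointer size exceeds the raw float32 payload by 2,545 bytes, which is plausibly serialization overhead. That count is closer to 17.8M than to the "~21M params" quoted in the README; the difference may come from components or a counting convention not described in this commit.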