---
license: cc-by-nc-sa-4.0
tags:
- materials-science
- property-prediction
- modular-learning
- graph-neural-network
- crystal
datasets:
- matminer
language:
- en
library_name: pytorch
pipeline_tag: other
---

# MoMa Hub: Pretrained Modules for Material Property Prediction

<div align="center">

[![arXiv](https://img.shields.io/badge/arXiv-2502.15483-b31b1b.svg)](https://arxiv.org/abs/2502.15483)
[![GitHub](https://img.shields.io/badge/GitHub-MoMa-blue.svg)](https://github.com/Thomaswbt/MoMa)

</div>

This repository hosts the **18 pretrained full modules** of the MoMa Hub, from the paper:

> **MoMa: A Modular Deep Learning Framework for Material Property Prediction**
>
> Botian Wang, Yawen Ouyang, Yaohui Li, Yiqun Wang, Haorui Cui, Jianbing Zhang, Xiaonan Wang, Wei-Ying Ma, Hao Zhou
>
> *ICLR 2026*

## Model Description

MoMa (**Mo**dular learning for **Ma**terials) is a modular deep learning framework that addresses the diversity and disparity challenges in material property prediction. Instead of forcing all tasks into one shared model, MoMa trains specialized modules on diverse high-resource material tasks and adaptively composes synergistic modules for each downstream scenario.

Each module in this repository is a **full module**: a complete GemNet-OC backbone (initialized from the [JMP-L](https://github.com/facebookresearch/JMP) pretrained model) that has been fully fine-tuned on a specific material property prediction task. These modules are designed to be composed via weighted averaging for adaptation to new downstream tasks.
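
The weighted-averaging composition can be sketched in a few lines. This is an illustrative sketch, not the MoMa implementation: `compose_modules` is a hypothetical helper, and plain floats stand in for the torch tensors a real `state_dict` would hold.

```python
def compose_modules(state_dicts, weights):
    """Weighted-average matching parameters across module checkpoints.

    Each state_dict maps parameter names to values; in a real checkpoint
    the values are torch tensors, here plain floats for illustration.
    """
    if len(state_dicts) != len(weights):
        raise ValueError("one weight per module is required")
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so weights sum to 1
    keys = state_dicts[0].keys()
    return {k: sum(w * sd[k] for w, sd in zip(norm, state_dicts)) for k in keys}

# Two toy "modules" sharing a single parameter
a = {"backbone.weight": 1.0}
b = {"backbone.weight": 3.0}
print(compose_modules([a, b], [1.0, 1.0]))  # {'backbone.weight': 2.0}
```

With torch tensors the same loop applies unchanged, since `w * tensor` and `sum` operate elementwise.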

## Modules

This repository contains 18 `.pt` checkpoint files, each trained on a distinct material property prediction task from the [Matminer](https://hackingmaterials.lbl.gov/matminer/) datasets. The modules cover **electronic, thermal, mechanical, and thermoelectric** properties across different material databases:

| File | Source | Property | Category |
|------|--------|----------|----------|
| `mp_eform.pt` | Materials Project | Formation Energy | Thermal |
| `mp_bandgap.pt` | Materials Project | Band Gap | Electronic |
| `mp_gvrh.pt` | Materials Project | Shear Modulus (VRH) | Mechanical |
| `mp_kvrh.pt` | Materials Project | Bulk Modulus (VRH) | Mechanical |
| `castelli_eform.pt` | Castelli et al. | Formation Energy | Thermal |
| `jarvis_eform.pt` | JARVIS-DFT | Formation Energy | Thermal |
| `jarvis_bandgap.pt` | JARVIS-DFT | Band Gap (OPT) | Electronic |
| `jarvis_gvrh.pt` | JARVIS-DFT | Shear Modulus (VRH) | Mechanical |
| `jarvis_kvrh.pt` | JARVIS-DFT | Bulk Modulus (VRH) | Mechanical |
| `jarvis_dielectric_opt.pt` | JARVIS-DFT | Dielectric Constant (OPT) | Electronic |
| `n_Seebeck.pt` | Ricci et al. | n-type Seebeck Coefficient | Thermoelectric |
| `n_avg_eff_mass.pt` | Ricci et al. | n-type Average Effective Mass | Thermoelectric |
| `n_e_cond.pt` | Ricci et al. | n-type Electrical Conductivity | Thermoelectric |
| `n_th_cond.pt` | Ricci et al. | n-type Thermal Conductivity | Thermoelectric |
| `p_Seebeck.pt` | Ricci et al. | p-type Seebeck Coefficient | Thermoelectric |
| `p_avg_eff_mass.pt` | Ricci et al. | p-type Average Effective Mass | Thermoelectric |
| `p_e_cond.pt` | Ricci et al. | p-type Electrical Conductivity | Thermoelectric |
| `p_th_cond.pt` | Ricci et al. | p-type Thermal Conductivity | Thermoelectric |

## Architecture

- **Backbone**: GemNet-OC (Large)
- **Initialization**: [JMP-L](https://github.com/facebookresearch/JMP) pretrained checkpoint
- **Module type**: Full module (all backbone parameters fine-tuned)
- **Parameters per module**: ~165M
- **File size per module**: ~615 MB
- **Total repository size**: ~10.8 GB
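
As a rough sanity check on these numbers (a back-of-the-envelope estimate, assuming the weights are stored as float32), ~165M parameters at 4 bytes each come to roughly 630 MiB, in line with the ~615 MB per checkpoint file:

```python
n_params = 165_000_000           # ~165M parameters per full module
bytes_fp32 = n_params * 4        # 4 bytes per float32 weight
size_mib = bytes_fp32 / 2**20
print(f"{size_mib:.0f} MiB")     # ~629 MiB of raw weights per module
```

The small difference from the listed file size is expected, since the parameter count is approximate and checkpoints carry metadata beyond the raw weights.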

## Usage

### Download

**Option 1: Using `huggingface_hub` (Python)**

```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="GenSI/MoMa-modules-ICLR",
    repo_type="model",
    local_dir="./hub",
)
```

**Option 2: Using the Hugging Face CLI**

```bash
pip install huggingface_hub
hf download GenSI/MoMa-modules-ICLR --repo-type model --local-dir ./hub
```

### Integration with MoMa

After downloading, place the `hub/` directory under the [MoMa codebase](https://github.com/Xiaonan-Wang-AIR/MoMa) root:

```
MoMa/
├── hub/
│   ├── mp_eform.pt
│   ├── mp_bandgap.pt
│   └── ... (18 modules)
├── configs/
├── scripts/
└── ...
```

Then follow the instructions in the [MoMa repository](https://github.com/Xiaonan-Wang-AIR/MoMa) to run Adaptive Module Composition and downstream fine-tuning:

```bash
# Adaptive Module Composition (can be skipped using the precomputed results in json/)
bash scripts/extract_embeddings.sh
python scripts/run_knn.py
python scripts/weight_optimize.py

# Downstream fine-tuning with MoMa
bash scripts/finetune_moma.sh
```

### Loading a Single Module

Each `.pt` file is a standard PyTorch checkpoint containing a `state_dict`:

```python
import torch

ckpt = torch.load("hub/mp_eform.pt", map_location="cpu")
state_dict = ckpt["state_dict"]
```

## Results

MoMa achieves state-of-the-art performance on 17 material property prediction benchmarks, with an average improvement of **14%** over the strongest baseline (JMP fine-tuning). See the full results in our [paper](https://arxiv.org/abs/2502.15483).

| Method | Average Rank |
|--------|:------------:|
| CGCNN | 6.00 |
| MoE-(18) | 4.12 |
| JMP-MT | 3.94 |
| JMP-FT | 2.88 |
| **MoMa (Adapter)** | **2.47** |
| **MoMa (Full)** | **1.35** |

## Citation

```bibtex
@article{wang2025moma,
  title={MoMa: A Modular Deep Learning Framework for Material Property Prediction},
  author={Wang, Botian and Ouyang, Yawen and Li, Yaohui and Wang, Yiqun and Cui, Haorui and Zhang, Jianbing and Wang, Xiaonan and Ma, Wei-Ying and Zhou, Hao},
  journal={arXiv preprint arXiv:2502.15483},
  year={2025}
}
```

```bibtex
@article{shoghi2023molecules,
  title={From molecules to materials: Pre-training large generalizable models for atomic property prediction},
  author={Shoghi, Nima and Kolluru, Adeesh and Kitchin, John R and Ulissi, Zachary W and Zitnick, C Lawrence and Wood, Brandon M},
  journal={arXiv preprint arXiv:2310.16802},
  year={2023}
}
```

## License

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).