add example training script
- 2023-08-14-mace-universal.sbatch +59 -0
- README.md +8 -2
2023-08-14-mace-universal.sbatch
ADDED
```bash
#!/bin/bash
#SBATCH -C gpu
#SBATCH -G 40
#SBATCH -N 10
#SBATCH --ntasks=40
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=4
#SBATCH --time=6:00:00
#SBATCH --time-min=02:00:00
#SBATCH --error=%x-%j.err
#SBATCH --output=%x-%j.out
#SBATCH --requeue
#SBATCH --exclusive
#SBATCH --open-mode=append

exp_name=$(basename "$SLURM_SUBMIT_DIR")

srun python run_train.py \
    --name=$exp_name \
    --train_file="train.h5" \
    --valid_file="valid.h5" \
    --statistics_file="statistics.json" \
    --energy_weight=1 \
    --forces_weight=1 \
    --eval_interval=1 \
    --config_type_weights='{"Default":1.0}' \
    --E0s='average' \
    --error_table='PerAtomMAE' \
    --stress_key='stress' \
    --model="ScaleShiftMACE" \
    --MLP_irreps="64x0e" \
    --interaction_first="RealAgnosticResidualInteractionBlock" \
    --interaction="RealAgnosticResidualInteractionBlock" \
    --num_interactions=2 \
    --num_channels=128 \
    --max_ell=3 \
    --hidden_irreps='64x0e + 64x1o + 64x2e' \
    --num_cutoff_basis=10 \
    --lr=1e-2 \
    --correlation=3 \
    --r_max=6.0 \
    --num_radial_basis=10 \
    --scaling='rms_forces_scaling' \
    --distributed \
    --num_workers=4 \
    --batch_size=10 \
    --valid_batch_size=30 \
    --max_num_epochs=500 \
    --patience=250 \
    --amsgrad \
    --weight_decay=1e-8 \
    --ema \
    --ema_decay=0.999 \
    --default_dtype="float32" \
    --clip_grad=100 \
    --device=cuda \
    --seed=3 \
    --save_cpu \
    --restart_latest &
wait  # keep the batch script alive until the backgrounded srun finishes
```
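The resource request in the script header assumes one training rank per GPU. A quick consistency check of that geometry (our own sketch, not part of the commit; the variable names are illustrative):

```shell
# Slurm geometry from the script header: 10 nodes x 4 GPUs/node = 40 GPUs,
# with --ntasks=40 so each distributed rank drives exactly one GPU.
nodes=10            # SBATCH -N
gpus=40             # SBATCH -G
tasks_per_node=4    # SBATCH --ntasks-per-node
ntasks=40           # SBATCH --ntasks

if [ $((nodes * tasks_per_node)) -eq "$gpus" ] && [ "$ntasks" -eq "$gpus" ]; then
    echo "geometry OK: one rank per GPU"
else
    echo "geometry mismatch" >&2
    exit 1
fi
```

If you scale the job up or down, keep these four values consistent, since `--distributed` training expects one process per GPU.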
|
README.md
CHANGED
````diff
@@ -79,11 +79,17 @@ If you use the pretrained models in this repository, please cite all the followi
 }
 ```
 
-# Training
+# Training Guide
 
 ## Training Data
 
 <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-
-## Training Procedure
+For now, please download MPTrj data from [figshare](https://figshare.com/articles/dataset/Materials_Project_Trjectory_MPtrj_Dataset/23713842). We may upload to HuggingFace Datasets in the future.
+
+## Fine-tuning
+
+<!-- This should link to a Training Procedure Card, perhaps with a short stub of information on what the training procedure is all about as well as documentation related to hyperparameters or additional training details. -->
+
+We provide an example multi-GPU training script [2023-08-14-mace-universal.sbatch](https://huggingface.co/cyrusyc/mace-universal/blob/main/2023-08-14-mace-universal.sbatch), which uses 40 A100s on NERSC Perlmutter. Please see the MACE `multi-gpu` branch for more detailed instructions.
+
````
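Since the script names the run after its submission directory, submitting from a descriptively named directory labels the experiment. A minimal sketch of that workflow (the directory name `mace-mptrj-run` and the commented-out `sbatch` call are illustrative, not from this commit):

```shell
# The sbatch script sets exp_name=$(basename "$SLURM_SUBMIT_DIR");
# the directory you submit from therefore becomes the experiment name.
mkdir -p mace-mptrj-run && cd mace-mptrj-run
exp_name=$(basename "$PWD")
echo "experiment name: $exp_name"
# sbatch ../2023-08-14-mace-universal.sbatch   # submit from here on Perlmutter
```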