Update README.md
README.md CHANGED
@@ -5,9 +5,7 @@ tags:
 ---
 
 
-```markdown
 # 🧬 MoLLaMA-Small
-
 MoLLaMA-Small is a lightweight LLaMA-based causal language model (57.2M parameters) trained from scratch to generate valid chemical molecules using SMILES strings.
 
 This model uses DeepChem's `SmilesTokenizer` and was trained on a combined dataset of ZINC15 and MuMOInstruct. It is designed for unconditional molecule generation.
@@ -31,11 +29,11 @@ A custom, scaled-down LLaMA architecture was used to optimize for chemical language
 * **Max Position Embeddings**: 1024
 
 ## 🚀 How to Use
-
 You can easily load this model using the standard `transformers` library. The model generates SMILES strings by prompting it with the `[bos]` (Beginning of Sequence) token.
 
 ### Prerequisites
 Make sure you have the required libraries installed:
+
 ```bash
 pip install transformers torch deepchem
 
@@ -86,6 +84,4 @@ print(f"Generated SMILES: {generated_smiles}")
 * **Batch Size**: 512 (with gradient accumulation steps of 4)
 * **Learning Rate**: 1e-4 (Cosine scheduler, 10% Warmup)
 * **Precision**: bf16 (Mixed Precision)
-* **Early Stopping Patience**: 5 epochs
-
-```
+* **Early Stopping Patience**: 5 epochs
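The training hyperparameters listed in the hunk above (cosine scheduler with 10% warmup, base LR 1e-4, batch size 512 with 4 gradient-accumulation steps) can be sketched in plain Python. This is an illustrative sketch, not the actual training code: the `lr_at` helper is mine, and the assumption that 512 is the per-step micro-batch (giving an effective batch of 512 × 4) is not stated in the README.

```python
import math

def lr_at(step, total_steps, base_lr=1e-4, warmup_frac=0.10):
    """Cosine learning-rate schedule with linear warmup over the first
    10% of steps, mirroring the hyperparameters listed above."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

# Assumption: if 512 is the per-step micro-batch, 4 accumulation steps
# give 2048 sequences per optimizer update (the README does not say
# whether 512 is already the effective figure).
effective_batch = 512 * 4
```

With a 1000-step run, `lr_at` rises linearly to 1e-4 at step 100 and decays to 0 by step 1000.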
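The "How to Use" section's generation code is elided from the diff (only its final `print(f"Generated SMILES: {generated_smiles}")` line appears as hunk context); a minimal sketch of such a generation call might look like the following. Everything here is hypothetical: the repo id is a placeholder (the diff does not show the real one), loading via `AutoTokenizer` assumes the repository ships a Hugging Face-compatible tokenizer config (the README says training used DeepChem's `SmilesTokenizer`), and the sampling settings (`do_sample`, `top_k`) are illustrative, not the author's.

```python
def generate_smiles(repo_id="user/MoLLaMA-Small", max_new_tokens=128):
    """Hypothetical sketch: load the model and sample one SMILES string.

    `repo_id` is a placeholder; the imports are deferred so the sketch
    can be read without the model weights or libraries present.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    # Prompt with the [bos] token, as the README describes.
    inputs = tokenizer("[bos]", return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=True, top_k=50
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_smiles()` downloads the weights and returns one sampled string, which can then be printed as in the README's `print(f"Generated SMILES: {generated_smiles}")` line.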