@@ -1,121 +1,121 @@
----
-base_model: meta-llama/Meta-Llama-3-8B
-tags:
-- molecular-optimization
-- chemistry
-- llama-3
-- grpo
-- rlhf
-license: apache-2.0
-language:
-- en
-pipeline_tag: text-generation
----
-# MEGA-GRPO
-Post-trained molecular optimization model using Tanimoto-aware GRPO (Group Relative Policy Optimization). Based on **Llama 3 8B**.
-**Paper**: [MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization](https://openreview.net/pdf?id=wzou4rm3Tt)
-**Official Repository**: [https://github.com/nfsrules/MEGA.git](https://github.com/nfsrules/MEGA.git)
-## Installation
-```bash
-pip install unsloth torch
-```
-## Usage
-```python
-from unsloth import FastLanguageModel
-from unsloth.chat_templates import get_chat_template
-# Configuration
-max_seq_length = 1024
-lora_rank = 32
-# Load model
-model, tokenizer = FastLanguageModel.from_pretrained(
-    model_name = "nfsrulesFR/mega-grpo",
-    max_seq_length = max_seq_length,
-    load_in_4bit = True,
-    fast_inference = True,
-    max_lora_rank = lora_rank,
-    gpu_memory_utilization = 0.6,
-)
-# Configure tokenizer
-tokenizer.padding_side = 'left'
-if tokenizer.pad_token is None:
-    tokenizer.pad_token = tokenizer.eos_token
-tokenizer = get_chat_template(
-    tokenizer,
-    chat_template="llama-3",
-    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
-)
-# Generate
-input_smiles = "CCO"
-task = "Can you make molecule CCO more soluble in water? The output molecule should be similar to the input molecule."
-messages = [{"from": "human", "value": task}]
-encoded = tokenizer.apply_chat_template(
-    messages,
-    tokenize=True,
-    add_generation_prompt=True,
-    return_tensors="pt",
-    padding=True,
-)
-outputs = model.generate(
-    input_ids=encoded["input_ids"].cuda(),
-    attention_mask=encoded["attention_mask"].cuda(),
-    max_new_tokens=64,
-    use_cache=True,
-    pad_token_id=tokenizer.pad_token_id,
-)
-response = tokenizer.decode(outputs[0][encoded["input_ids"].shape[1]:], skip_special_tokens=True)
-print(response)
-```
-## Supported Tasks
-| Task ID | Description |
-|---------|-------------|
-| 101 | Increase water solubility |
-| 102 | Decrease water solubility |
-| 103 | Increase drug-likeness |
-| 104 | Decrease drug-likeness |
-| 105 | Increase permeability |
-| 106 | Decrease permeability |
-| 107 | Increase hydrogen bond acceptors |
-| 108 | Increase hydrogen bond donors |
-| 201 | Increase solubility + HBA |
-| 202 | Decrease solubility + increase HBA |
-| 203 | Increase solubility + HBD |
-| 204 | Decrease solubility + increase HBD |
-| 205 | Increase solubility + permeability |
-| 206 | Increase solubility + decrease permeability |
-## Model Details
-- **Base Model**: Meta-Llama-3-8B
-- **Training**: Tanimoto-aware GRPO on 500K molecular transformations
-- **Input**: SMILES string + task description
-- **Output**: Modified SMILES string
-## Citation
-```bibtex
-@article{mega2025,
-  title={MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization},
-  author={Fernandez, Nelson and Illouz, Maxime and Pinto, Luis and Yang, Entao and Amadou Boubacar, Habiboulaye},
-  journal={Under review at International Conference on Learning Representations},
-  year={2025},
-  url={https://openreview.net/forum?id=wzou4rm3Tt}
-}
-```

+---
+base_model: meta-llama/Meta-Llama-3-8B
+tags:
+- molecular-optimization
+- chemistry
+- llama-3
+- grpo
+- rlhf
+license: apache-2.0
+language:
+- en
+pipeline_tag: text-generation
+---
+# MEGA-GRPO
+Fine-tuned molecular optimization model using Tanimoto-aware GRPO (Group Relative Policy Optimization) on 500K molecular transformations. Based on **Llama 3 8B**.
+**Paper**: [MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization](https://openreview.net/pdf?id=wzou4rm3Tt)
+**Official Repository**: [https://github.com/nfsrules/MEGA-moledit](https://github.com/nfsrules/MEGA-moledit)
+## Installation
+```bash
+pip install unsloth torch
+```
+## Usage
+```python
+from unsloth import FastLanguageModel
+from unsloth.chat_templates import get_chat_template
+# Configuration
+max_seq_length = 1024
+lora_rank = 32
+# Load model
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "nfsrulesFR/mega-grpo",
+    max_seq_length = max_seq_length,
+    load_in_4bit = True,
+    fast_inference = True,
+    max_lora_rank = lora_rank,
+    gpu_memory_utilization = 0.6,
+)
+# Configure tokenizer
+tokenizer.padding_side = 'left'
+if tokenizer.pad_token is None:
+    tokenizer.pad_token = tokenizer.eos_token
+tokenizer = get_chat_template(
+    tokenizer,
+    chat_template="llama-3",
+    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
+)
+# Generate
+input_smiles = "CCO"
+task = f"Can you make molecule {input_smiles} more soluble in water? The output molecule should be similar to the input molecule."
+messages = [{"from": "human", "value": task}]
+encoded = tokenizer.apply_chat_template(
+    messages,
+    tokenize=True,
+    add_generation_prompt=True,
+    return_tensors="pt",
+    padding=True,
+    return_dict=True,  # return a dict so input_ids and attention_mask can be indexed below
+)
+outputs = model.generate(
+    input_ids=encoded["input_ids"].cuda(),
+    attention_mask=encoded["attention_mask"].cuda(),
+    max_new_tokens=64,
+    use_cache=True,
+    pad_token_id=tokenizer.pad_token_id,
+)
+response = tokenizer.decode(outputs[0][encoded["input_ids"].shape[1]:], skip_special_tokens=True)
+print(response)
+```
+## Supported Tasks
+| Task ID | Description |
+|---------|-------------|
+| 101 | Increase water solubility |
+| 102 | Decrease water solubility |
+| 103 | Increase drug-likeness |
+| 104 | Decrease drug-likeness |
+| 105 | Increase permeability |
+| 106 | Decrease permeability |
+| 107 | Increase hydrogen bond acceptors |
+| 108 | Increase hydrogen bond donors |
+| 201 | Increase solubility + HBA |
+| 202 | Decrease solubility + increase HBA |
+| 203 | Increase solubility + HBD |
+| 204 | Decrease solubility + increase HBD |
+| 205 | Increase solubility + permeability |
+| 206 | Increase solubility + decrease permeability |
+## Model Details
+- **Base Model**: Meta-Llama-3-8B
+- **Training**: Tanimoto-aware GRPO on 500K molecular transformations
+- **Input**: SMILES string + task description
+- **Output**: Modified SMILES string
+## Citation
+```bibtex
+@article{mega2025,
+  title={MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization},
+  author={Fernandez, Nelson and Illouz, Maxime and Pinto, Luis and Yang, Entao and Amadou Boubacar, Habiboulaye},
+  journal={Under review at International Conference on Learning Representations},
+  year={2025},
+  url={https://openreview.net/forum?id=wzou4rm3Tt}
+}
+```
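
The model card describes the training as "Tanimoto-aware GRPO" without spelling out the reward. The sketch below shows only the two generic ingredients that the name suggests: the Tanimoto coefficient between two fingerprints, and the group-relative advantage normalization used by GRPO. The fingerprint bit sets, the property gains, and the choice to multiply gain by similarity are all hypothetical and are not taken from the paper or the repository.

```python
from statistics import mean, pstdev


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of three candidate edits for one source molecule.
# Each entry is (fingerprint on-bits, property gain); the values are made up.
source_bits = {1, 4, 7, 9}
candidates = [
    ({1, 4, 7, 9, 12}, 0.8),  # small edit, good gain
    ({1, 4, 20, 21}, 1.0),    # larger gain, but far from the source
    ({1, 4, 7, 9}, 0.0),      # unchanged molecule, no gain
]

# Assumed reward shape: property gain scaled by similarity to the source.
rewards = [gain * tanimoto(source_bits, bits) for bits, gain in candidates]
advantages = group_relative_advantages(rewards)

for (bits, gain), r, adv in zip(candidates, rewards, advantages):
    print(f"gain={gain:.2f} reward={r:.3f} advantage={adv:+.3f}")
```

With these made-up numbers the small, similar edit gets the highest advantage in its group even though the dissimilar edit has the larger raw gain, which is the behavior a similarity-aware reward is meant to encourage. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.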