---
base_model: meta-llama/Meta-Llama-3-8B
tags:
- molecular-optimization
- chemistry
- llama-3
- grpo
- rlhf
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# MEGA-GRPO

Fine-tuned molecular optimization model using Tanimoto-aware GRPO (Group Relative Policy Optimization) on 500K molecular transformations. Based on **Llama 3 8B**.

**Paper**: [MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization](https://openreview.net/pdf?id=wzou4rm3Tt)

**Official Repository**: [https://github.com/nfsrules/MEGA-moledit](https://github.com/nfsrules/MEGA-moledit)

## Installation

```bash
pip install unsloth torch
```

## Usage

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Configuration
max_seq_length = 1024
lora_rank = 32

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nfsrulesFR/mega-grpo",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
    fast_inference = True,
    max_lora_rank = lora_rank,
    gpu_memory_utilization = 0.6,
)

# Configure tokenizer
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

tokenizer = get_chat_template(
    tokenizer,
    chat_template="llama-3",
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
)

# Build the task prompt
input_smiles = "CCO"
task = f"Can you make molecule {input_smiles} more soluble in water? The output molecule should be similar to the input molecule."

messages = [{"from": "human", "value": task}]

# return_dict=True is required so encoded["input_ids"] works below;
# without it, apply_chat_template returns a bare tensor.
encoded = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    padding=True,
    return_dict=True,
)

outputs = model.generate(
    input_ids=encoded["input_ids"].cuda(),
    attention_mask=encoded["attention_mask"].cuda(),
    max_new_tokens=64,
    use_cache=True,
    pad_token_id=tokenizer.pad_token_id,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][encoded["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
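
The model replies in free text, and the exact response format is not pinned down here. One simple post-processing heuristic (an assumption, not part of the official pipeline) is to take the last whitespace-separated token of the reply as the candidate SMILES:

```python
def extract_candidate_smiles(response: str) -> str:
    """Heuristic: assume the edited SMILES is the last whitespace-separated
    token of the model's reply, and strip any trailing punctuation."""
    token = response.strip().split()[-1]
    return token.strip(".,;:!\"'")

# Hypothetical reply text; real outputs may need a different parser.
print(extract_candidate_smiles("The optimized molecule is: OCCO."))  # OCCO
```

Validating the extracted string with a cheminformatics toolkit (e.g. RDKit's `Chem.MolFromSmiles`) before downstream use is advisable.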

## Supported Tasks

| Task ID | Description |
|---------|-------------|
| 101 | Increase water solubility |
| 102 | Decrease water solubility |
| 103 | Increase drug-likeness |
| 104 | Decrease drug-likeness |
| 105 | Increase permeability |
| 106 | Decrease permeability |
| 107 | Increase hydrogen bond acceptors |
| 108 | Increase hydrogen bond donors |
| 201 | Increase solubility + HBA |
| 202 | Decrease solubility + increase HBA |
| 203 | Increase solubility + HBD |
| 204 | Decrease solubility + increase HBD |
| 205 | Increase solubility + permeability |
| 206 | Increase solubility + decrease permeability |
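
Each task ID corresponds to a natural-language instruction in the style of the Usage example above. The exact training templates are not reproduced in this card, so the mapping below is an illustrative sketch, not the official wording:

```python
# Illustrative sketch only: the task IDs come from the table above, but
# these template strings are assumptions modeled on the Usage example.
TASK_PROMPTS = {
    101: "Can you make molecule {smiles} more soluble in water? The output molecule should be similar to the input molecule.",
    102: "Can you make molecule {smiles} less soluble in water? The output molecule should be similar to the input molecule.",
    107: "Can you make molecule {smiles} have more hydrogen bond acceptors? The output molecule should be similar to the input molecule.",
}

def build_task_prompt(task_id: int, smiles: str) -> str:
    """Fill a task template with the input SMILES string."""
    return TASK_PROMPTS[task_id].format(smiles=smiles)

print(build_task_prompt(101, "CCO"))
```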

## Model Details

- **Base Model**: Meta-Llama-3-8B
- **Training**: Tanimoto-aware GRPO on 500K molecular transformations
- **Input**: SMILES string + task description
- **Output**: Modified SMILES string
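
The "Tanimoto-aware" reward encourages edits that stay structurally close to the input molecule. For reference, Tanimoto similarity over binary fingerprints is the intersection-over-union of on bits; a minimal sketch using plain Python sets of bit indices (real fingerprints would come from a toolkit such as RDKit):

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto similarity between two fingerprints represented as
    sets of on-bit indices: |A ∩ B| / |A ∪ B|."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Toy on-bit sets standing in for molecular fingerprints.
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
print(tanimoto(a, b))  # 3 shared bits / 5 total bits = 0.6
```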

## Citation

```bibtex
@article{mega2025,
  title={MEGA: A Large-Scale Molecular Editing Dataset for Guided-Action Optimization},
  author={Fernandez, Nelson and Illouz, Maxime and Pinto, Luis and Yang, Entao and Amadou Boubacar, Habiboulaye},
  journal={Under review at International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=wzou4rm3Tt}
}
```