---
license: apache-2.0
base_model: google/txgemma-9b-predict
tags:
- gemma
- txgemma
- ec-number
- biochemical-reactions
- lora
- gemma4ec
---

# Gemma4EC-9B-Predict

This repository contains the **LoRA adapter** for **Gemma4EC**, a model fine-tuned to
predict **Enzyme Commission (EC) numbers** from biochemical reaction SMILES.

## Base model
- google/txgemma-9b-predict

## Task
- Input: biochemical reaction SMILES
- Output: EC number (up to sub-subclass level)
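
To make the input and output formats concrete, the sketch below splits a reaction SMILES string into its parts and truncates a full four-field EC number to the sub-subclass level (the first three fields). It assumes the common `reactants>agents>products` layout for reaction SMILES. The helper names `split_reaction_smiles` and `truncate_ec` are illustrative and are not part of this repository.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into three lists of molecule SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError(f"expected two '>' separators, got: {rxn!r}")
    # Molecules within each part are separated by '.'
    reactants, agents, products = [
        [mol for mol in part.split(".") if mol] for part in parts
    ]
    return reactants, agents, products


def truncate_ec(ec, level=3):
    """Keep the first `level` fields of an EC number (3 = sub-subclass)."""
    fields = ec.strip().split(".")
    if not 1 <= level <= 4 or len(fields) < level:
        raise ValueError(f"cannot truncate {ec!r} to level {level}")
    return ".".join(fields[:level])


# Illustrative reaction string: ethanol + O2 -> acetaldehyde + H2O2
reactants, agents, products = split_reaction_smiles("CCO.O=O>>CC=O.OO")
print(reactants, agents, products)

# EC 1.1.1.1 (alcohol dehydrogenase) truncated to its sub-subclass
print(truncate_ec("1.1.1.1"))
```

Truncation like this is also a convenient way to bring four-field reference labels down to the granularity the model predicts before comparing them.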

## Training
- Parameter-efficient fine-tuning using **LoRA**
- Few-shot prompt format

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "google/txgemma-9b-predict"
adapter_repo = "PlanesLab/Gemma4EC-9B-Predict"

# Load the tokenizer and the base model weights
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the Gemma4EC LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(model, adapter_repo)
model.eval()
```
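
Once the adapter is loaded, a prediction can be obtained with a standard `generate` call. The following is a minimal sketch rather than the repository's inference script: the helper names `predict_ec` and `extract_ec`, the greedy decoding settings, and the regular expression used to pull an EC-like token out of the completion are all assumptions. The prompt passed in should follow the few-shot format the adapter was trained with.

```python
import re

# Assumed pattern: two to four dot-separated numeric fields, e.g. "1.1.1"
EC_PATTERN = re.compile(r"\b\d+(?:\.\d+){1,3}\b")


def extract_ec(text):
    """Return the first EC-like token found in `text`, or None."""
    match = EC_PATTERN.search(text)
    return match.group(0) if match else None


def predict_ec(model, tokenizer, prompt, max_new_tokens=16):
    """Greedy-decode a completion for `prompt` and extract an EC number."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Drop the prompt tokens so only the newly generated text is decoded
    prompt_length = inputs["input_ids"].shape[1]
    completion = tokenizer.decode(
        output_ids[0][prompt_length:], skip_special_tokens=True
    )
    return extract_ec(completion)


# Usage, assuming `model` and `tokenizer` were loaded as in the snippet above
# and `prompt` is a few-shot prompt ending with the query reaction:
# ec = predict_ec(model, tokenizer, prompt)
```

Greedy decoding (`do_sample=False`) is chosen here so that repeated calls give the same prediction; other decoding settings may suit benchmarking better.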

## Code

Full source code, including training, inference, and benchmarking scripts, is available at:

https://github.com/PlanesLab/Gemma4EC

## Citation