---
license: other
license_name: cambrian-open
license_link: https://huggingface.co/Moomboh/ESMC-300M-mutafitup/blob/main/CAMBRIAN_OPEN_LICENSE.md
tags:
- protein-language-model
- onnx
- fine-tuning
- multi-task
- esm
---

# Moomboh/ESMC-300M-mutafitup
|
|
| Multi-task LoRA fine-tuned ONNX models derived from |
| [ESM-C 300M](https://huggingface.co/EvolutionaryScale/esmc-300m-2024-12) |
| by [EvolutionaryScale](https://www.evolutionaryscale.ai/). |
|
|
| Built with ESM. |
|
|
| ## ONNX Models |
|
|
| Model | Section | Tasks | Variant |
|-------|---------|-------|---------|
| `ESMC-300M-mutafitup-accgrad-all-r4-best-overall` | accgrad_lora | disorder, gpsite_atp, gpsite_ca, ... (16 total) | best_overall |
| `ESMC-300M-mutafitup-align-all-r4-best-overall` | align_lora | disorder, gpsite_atp, gpsite_ca, ... (16 total) | best_overall |
|
|
| Each ONNX model directory contains: |
| - `model.onnx` -- merged ONNX model (LoRA weights folded into backbone) |
| - `export_metadata.json` -- task configuration and preprocessing settings |
| - `normalization_stats.json` -- per-task normalization statistics |
| - `tokenizer/` -- HuggingFace tokenizer files |
| - `history.json` -- training history (per-epoch metrics) |
| - `best_checkpoints.json` -- checkpoint selection metadata |
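
As a sketch of how these files fit together, the merged model can be run with
`onnxruntime` and the bundled tokenizer. The local directory path, the example
sequence, and the assumption that the first graph input takes token ids are all
illustrative; inspect `export_metadata.json` and the graph itself for the actual
input/output names and preprocessing settings.

```python
# Hypothetical inference sketch -- assumes the model directory has been
# downloaded locally and that the first graph input accepts token ids.
import onnxruntime as ort
from transformers import AutoTokenizer

model_dir = "ESMC-300M-mutafitup-accgrad-all-r4-best-overall"  # assumed local path
tokenizer = AutoTokenizer.from_pretrained(f"{model_dir}/tokenizer")
session = ort.InferenceSession(f"{model_dir}/model.onnx")

# List the graph's actual input/output names before wiring anything up.
print([i.name for i in session.get_inputs()])
print([o.name for o in session.get_outputs()])

sequence = "MKTAYIAKQRQISFVKSHFSRQ"  # example protein sequence
enc = tokenizer(sequence, return_tensors="np")
outputs = session.run(None, {session.get_inputs()[0].name: enc["input_ids"]})
```

Per-task outputs should then be mapped back to their original scales using the
statistics in `normalization_stats.json`.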
|
|
| ## PyTorch Checkpoints |
|
|
The `checkpoints/` directory contains lightweight PyTorch checkpoints
for **all** training runs (45 runs across 4 training sections). Each
checkpoint stores only the parameters updated during fine-tuning (the
LoRA adapters and task heads), not the frozen backbone weights.
|
|
| Each run directory (`checkpoints/{section}/{run}/`) contains: |
| - `history.json` -- training history |
| - `best_checkpoints.json` -- checkpoint selection metadata |
| - `best_overall_model/model.pt` -- best checkpoint by overall metric |
| - `best_loss_overall_model/model.pt` -- best checkpoint by overall loss |
| - `best_task_models/{task}/model.pt` -- best checkpoint per task metric |
| - `best_loss_task_models/{task}/model.pt` -- best checkpoint per task loss |
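
The `history.json` in each run directory can be used to reproduce checkpoint
selection, e.g. by picking the epoch that minimizes a validation metric. The
schema sketched below (a list of per-epoch records with a `val_loss` field) is
an assumption for illustration only; the actual layout is whatever the
mutafitup trainer writes.

```python
import json

# Assumed-for-illustration excerpt of a history.json: per-epoch records.
history_json = """
[
  {"epoch": 1, "val_loss": 0.52},
  {"epoch": 2, "val_loss": 0.41},
  {"epoch": 3, "val_loss": 0.44}
]
"""
history = json.loads(history_json)

# Pick the epoch with the lowest validation loss.
best = min(history, key=lambda rec: rec["val_loss"])
print(best["epoch"])  # → 2
```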
|
|
| To load a checkpoint, use `MultitaskModel.load_trainable_weights()` from |
| the [mutafitup](https://github.com/Moomboh/mutafitup) training library. |
|
|
| ## License |
|
|
| The ESMC 300M base model is licensed under the |
| [EvolutionaryScale Cambrian Open License Agreement](CAMBRIAN_OPEN_LICENSE.md). |
|
|
| Fine-tuning code and pipeline are licensed under the MIT License. |
|
|
| See [NOTICE](NOTICE) for full attribution details. |
|
|