---
license: other
license_name: cambrian-open
license_link: https://huggingface.co/Moomboh/ESMC-300M-mutafitup/blob/main/CAMBRIAN_OPEN_LICENSE.md
tags:
- protein-language-model
- onnx
- fine-tuning
- multi-task
- esm
---

# Moomboh/ESMC-300M-mutafitup

Multi-task LoRA fine-tuned ONNX models derived from [ESM-C 300M](https://huggingface.co/EvolutionaryScale/esmc-300m-2024-12) by [EvolutionaryScale](https://www.evolutionaryscale.ai/). Built with ESM.

## ONNX Models

| Model | Section | Tasks | Variant |
|-------|---------|-------|---------|
| `ESMC-300M-mutafitup-accgrad-all-r4-best-overall` | accgrad_lora | disorder, gpsite_atp, gpsite_ca, ... (16 total) | best_overall |
| `ESMC-300M-mutafitup-align-all-r4-best-overall` | align_lora | disorder, gpsite_atp, gpsite_ca, ... (16 total) | best_overall |

Each ONNX model directory contains:

- `model.onnx` -- merged ONNX model (LoRA weights folded into the backbone)
- `export_metadata.json` -- task configuration and preprocessing settings
- `normalization_stats.json` -- per-task normalization statistics
- `tokenizer/` -- HuggingFace tokenizer files
- `history.json` -- training history (per-epoch metrics)
- `best_checkpoints.json` -- checkpoint selection metadata

## PyTorch Checkpoints

The `checkpoints/` directory contains minimal trainable-parameter PyTorch checkpoints for **all** training runs (45 runs across 4 training sections). These checkpoints contain only the parameters that were updated during fine-tuning (LoRA adapters and task heads), not the frozen backbone weights.
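To make the trainable-parameter layout concrete, here is a minimal sketch of how such a checkpoint relates to the frozen backbone. Plain dicts stand in for PyTorch state dicts, and the key names are made up for illustration; this is not the actual mutafitup loading code or its real parameter names.

```python
# Hypothetical illustration: a minimal checkpoint stores only the
# trainable entries (LoRA adapters, task heads), so reconstructing the
# full model state means overlaying it on the frozen backbone weights.
# All key names below are invented for this sketch.
frozen_backbone = {
    "encoder.layer0.weight": [1.0, 2.0],
    "encoder.layer1.weight": [3.0, 4.0],
}
trainable_checkpoint = {
    "encoder.layer0.lora_A": [0.1],        # LoRA adapter weights
    "task_heads.disorder.weight": [0.5],   # per-task head weights
}

# Full state = frozen pretrained weights plus the saved trainables.
full_state = {**frozen_backbone, **trainable_checkpoint}
print(len(full_state))  # 4
```

Because the backbone is shared and never duplicated, each of the 45 run directories stays small relative to the full 300M-parameter model.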
Each run directory (`checkpoints/{section}/{run}/`) contains:

- `history.json` -- training history
- `best_checkpoints.json` -- checkpoint selection metadata
- `best_overall_model/model.pt` -- best checkpoint by overall metric
- `best_loss_overall_model/model.pt` -- best checkpoint by overall loss
- `best_task_models/{task}/model.pt` -- best checkpoint per task metric
- `best_loss_task_models/{task}/model.pt` -- best checkpoint per task loss

To load a checkpoint, use `MultitaskModel.load_trainable_weights()` from the [mutafitup](https://github.com/Moomboh/mutafitup) training library.

## License

The ESMC 300M base model is licensed under the [EvolutionaryScale Cambrian Open License Agreement](CAMBRIAN_OPEN_LICENSE.md). Fine-tuning code and pipeline are licensed under the MIT License.

See [NOTICE](NOTICE) for full attribution details.
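## Example: Applying Normalization Statistics

As a worked illustration of how the per-task statistics in `normalization_stats.json` might be used: assuming a simple `{"task": {"mean": ..., "std": ...}}` layout (the actual schema is defined by the mutafitup pipeline and may differ), z-score denormalization of raw regression outputs is a one-liner per task. The task name below is hypothetical.

```python
import json

# Hypothetical stats layout -- the real normalization_stats.json schema
# is defined by the mutafitup export pipeline and may differ.
stats = json.loads('{"my_regression_task": {"mean": 0.5, "std": 2.0}}')

def denormalize(task: str, values: list[float]) -> list[float]:
    """Map z-score-normalized model outputs back to the original scale."""
    mean = stats[task]["mean"]
    std = stats[task]["std"]
    return [v * std + mean for v in values]

print(denormalize("my_regression_task", [0.0, 1.0]))  # [0.5, 2.5]
```

Consult `export_metadata.json` in each model directory for the authoritative per-task preprocessing settings before applying any such transform.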