card: gated-base quickstart + auth + 4-bit (Fable5 review)
README.md
CHANGED
|
@@ -1,17 +1,108 @@
|
|
| 1 |
---
|
| 2 |
-
base_model: meta-llama/Llama-3.1-8B-Instruct
|
| 3 |
library_name: peft
|
| 4 |
-
|
| 5 |
-
- lora
|
| 6 |
-
- molly-os
|
| 7 |
-
- specialist
|
| 8 |
-
- software_engineering
|
| 9 |
license: cc-by-nc-4.0
|
| 10 |
---
|
| 11 |
|
| 12 |
-
# Molly OS
|
| 13 |
|
| 14 |
-
|
| 15 |
-
|
| 16 |
|
| 17 |
-
|
| 1 |
---
|
| 2 |
library_name: peft
|
| 3 |
+
base_model: meta-llama/Llama-3.1-8B-Instruct
|
| 4 |
license: cc-by-nc-4.0
|
| 5 |
+
pipeline_tag: text-generation
|
| 6 |
+
tags:
|
| 7 |
+
- lora
|
| 8 |
+
- peft
|
| 9 |
+
- molly-os
|
| 10 |
+
- software-engineering
|
| 11 |
---
|
| 12 |
|
| 13 |
+
# Molly OS - Specialist Adapter: Software Engineering
|
| 14 |
+
|
| 15 |
+
Frontier-distilled **LoRA specialist** (PEFT, rank 32; target modules
|
| 16 |
+
`q_proj`, `k_proj`, `v_proj`, `o_proj`) for the Molly OS model-agnostic
|
| 17 |
+
orchestration layer. Base model: **[meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)**.
|
| 18 |
+
Domain: **Software Engineering**.
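
To give a rough sense of the adapter's size, the sketch below counts the LoRA parameters implied by rank 32 on the four attention projections. The layer shapes (hidden size 4096, 32 layers, 8 key/value heads of dimension 128) are assumptions taken from the published Llama 3.1 8B configuration rather than from this card, and `lora_param_count` is a hypothetical helper.

```python
# Assumed Llama 3.1 8B attention shapes (from the public base config, not this card).
HIDDEN = 4096
KV_DIM = 8 * 128  # 8 key/value heads x head_dim 128
NUM_LAYERS = 32
RANK = 32  # stated in this card

# (in_features, out_features) for each LoRA target module named in this card
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(rank, targets, num_layers):
    """LoRA adds a (rank x in) matrix and an (out x rank) matrix per target module."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in targets.values())
    return per_layer * num_layers


total = lora_param_count(RANK, TARGETS, NUM_LAYERS)
print(f"{total:,} LoRA parameters")  # 27,262,976 under the assumed shapes
```

Under these assumptions the adapter is well under 1% of the base model's parameter count, which is why it can ship separately from the gated base weights.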
|
| 19 |
+
|
| 20 |
+
Adapter weights are released under **CC BY-NC 4.0**. The base model is governed by
|
| 21 |
+
its own (Llama 3.1) license.
|
| 22 |
+
|
| 23 |
+
## Before you run: the base model is gated
|
| 24 |
+
|
| 25 |
+
This adapter needs the base weights, and the base is **access-gated**. Do this **once**:
|
| 26 |
+
|
| 27 |
+
1. Open the base page and **accept its license**: <https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct>
|
| 28 |
+
2. Create a **read token**: <https://huggingface.co/settings/tokens>
|
| 29 |
+
3. Make the token available to your environment:
|
| 30 |
+
   - **Google Colab:** open the **Secrets** panel (key icon, left sidebar) -> *Add new secret* -> Name `HF_TOKEN`, paste the value, enable **Notebook access**.
|
| 31 |
+
   - **Kaggle:** *Add-ons -> Secrets* -> add `HF_TOKEN`.
|
| 32 |
+
   - **Local:** run `huggingface-cli login` or `export HF_TOKEN=...`.
|
| 33 |
+
|
| 34 |
+
If you skip this, loading the **base** fails with `GatedRepoError` / `401 Unauthorized`.
|
| 35 |
+
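A stored Colab secret is **not** guaranteed to be picked up automatically (it depends on your `huggingface_hub` version and on **Notebook access** being enabled), so authenticate explicitly in code (see below).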
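
The lookup order described above (Colab secret, then environment variable, then nothing) can be sketched without any Hub dependency. This is only an illustration: `resolve_hf_token` and its `colab_get` parameter are hypothetical names, and in a real notebook you would pass `google.colab.userdata.get` as the getter and hand the result to `huggingface_hub.login`.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_hf_token(
    env: Mapping[str, str] = os.environ,
    colab_get: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the first non-empty HF_TOKEN found, or None if no source has one."""
    if colab_get is not None:
        try:
            value = colab_get("HF_TOKEN")
        except Exception:
            # Secret missing or notebook access not granted: fall through to env.
            value = None
        if value and value.strip():
            return value.strip()
    value = env.get("HF_TOKEN", "").strip()
    return value or None


# Example with an explicit mapping instead of the real environment.
print(resolve_hf_token({"HF_TOKEN": "hf_example"}))  # hf_example
```

Returning `None` instead of raising leaves the caller free to fall back to an interactive prompt.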
|
| 36 |
+
|
| 37 |
+
## Quickstart
|
| 38 |
+
|
| 39 |
+
```python
|
| 40 |
+
# pip install -U transformers peft accelerate
|
| 41 |
+
import os
|
| 42 |
+
from huggingface_hub import login
|
| 43 |
+
|
| 44 |
+
# Authenticate (Colab secret -> env var -> interactive prompt)
|
| 45 |
+
try:
|
| 46 |
+
    from google.colab import userdata
|
| 47 |
+
    login(userdata.get("HF_TOKEN"))
|
| 48 |
+
except Exception:
|
| 49 |
+
    hf_token = os.environ.get("HF_TOKEN")
|
| 50 |
+
    login(hf_token) if hf_token else login()
|
| 51 |
+
|
| 52 |
+
import torch
|
| 53 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer
|
| 54 |
+
from peft import PeftModel
|
| 55 |
+
|
| 56 |
+
BASE = "meta-llama/Llama-3.1-8B-Instruct"
|
| 57 |
+
ADAPTER = "BoomJules/molly-software-engineering"
|
| 58 |
+
|
| 59 |
+
tok = AutoTokenizer.from_pretrained(BASE)
|
| 60 |
+
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
|
| 61 |
+
model = PeftModel.from_pretrained(base, ADAPTER).eval()
|
| 62 |
+
|
| 63 |
+
msgs = [{"role": "user", "content": "Your question here"}]
|
| 64 |
+
ids = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(model.device)
|
| 65 |
+
out = model.generate(ids, max_new_tokens=300)
|
| 66 |
+
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
|
| 67 |
+
```
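
The final line of the snippet above decodes `out[0][ids.shape[1]:]` rather than `out[0]` because, for a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones. A minimal sketch of that slicing with plain lists (the token ids are made up and `strip_prompt` is a hypothetical helper):

```python
def strip_prompt(output_ids, prompt_len):
    """Drop the echoed prompt so only newly generated token ids remain."""
    return output_ids[prompt_len:]


prompt_ids = [101, 7, 8, 9]            # made-up prompt token ids
output_ids = prompt_ids + [42, 43, 2]  # generate() output = prompt + new tokens
new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [42, 43, 2]
```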
|
| 68 |
+
|
| 69 |
+
## Low-VRAM (4-bit) - fits a free Colab/Kaggle GPU (~6-7 GB)
|
| 70 |
+
|
| 71 |
+
Use a **GPU runtime** (Colab: *Runtime -> Change runtime type -> T4 GPU*).
|
| 72 |
+
|
| 73 |
+
```python
|
| 74 |
+
# pip install -U transformers peft accelerate bitsandbytes
|
| 75 |
+
import os, torch
|
| 76 |
+
from huggingface_hub import login
|
| 77 |
+
try:
|
| 78 |
+
    from google.colab import userdata
|
| 79 |
+
    login(userdata.get("HF_TOKEN"))
|
| 80 |
+
except Exception:
|
| 81 |
+
    login(os.environ.get("HF_TOKEN"))
|
| 82 |
+
|
| 83 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
|
| 84 |
+
from peft import PeftModel
|
| 85 |
+
|
| 86 |
+
BASE = "meta-llama/Llama-3.1-8B-Instruct"
|
| 87 |
+
ADAPTER = "BoomJules/molly-software-engineering"
|
| 88 |
+
|
| 89 |
+
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
|
| 90 |
+
bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True)
|
| 91 |
+
tok = AutoTokenizer.from_pretrained(BASE)
|
| 92 |
+
base = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
|
| 93 |
+
model = PeftModel.from_pretrained(base, ADAPTER).eval()
|
| 94 |
+
```
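
As a back-of-the-envelope check on why 4-bit loading helps, the sketch below estimates the memory taken by the weights alone. The parameter count of roughly 8 billion is an approximation and `weight_gb` is a hypothetical helper. The estimate ignores quantization metadata, any layers left unquantized, activations, the KV cache, and CUDA overhead, which presumably account for the gap between the 4 GB figure and the ~6-7 GB quoted in the heading.

```python
def weight_gb(num_params, bits_per_param):
    """Approximate weight storage in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 8.0e9  # approximate parameter count for an 8B model (assumption)

bf16_gb = weight_gb(PARAMS, 16)  # bfloat16, as in the Quickstart
nf4_gb = weight_gb(PARAMS, 4)    # 4-bit NF4, as in the snippet above

print(f"bf16 weights: ~{bf16_gb:.0f} GB, 4-bit weights: ~{nf4_gb:.0f} GB")
```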
|
| 95 |
+
|
| 96 |
+
## Troubleshooting
|
| 97 |
+
|
| 98 |
+
- **`GatedRepoError` / `401 Unauthorized`** - base license not accepted, or `HF_TOKEN`
|
| 99 |
+
missing/invalid, or you stored the Colab secret but did not call `login(...)` in code.
|
| 100 |
+
- **CUDA out of memory** - use the 4-bit snippet and a GPU runtime.
|
| 101 |
+
- **Adapter seems to have no effect** - confirm the base id matches `base_model` above.
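
For the last item, one way to confirm the base id is to compare it against the `base_model_name_or_path` field that PEFT writes into an adapter's `adapter_config.json`. The sketch below runs on an illustrative config string, not the actual file from this repository, and `adapter_matches_base` is a hypothetical helper.

```python
import json


def adapter_matches_base(adapter_config_text, expected_base):
    """True if the adapter config names the expected base model id."""
    config = json.loads(adapter_config_text)
    return config.get("base_model_name_or_path") == expected_base


# Illustrative config contents only; read the real adapter_config.json in practice.
example_config = json.dumps(
    {"base_model_name_or_path": "meta-llama/Llama-3.1-8B-Instruct", "r": 32}
)

print(adapter_matches_base(example_config, "meta-llama/Llama-3.1-8B-Instruct"))  # True
```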
|
| 102 |
+
|
| 103 |
+
## License & intended use
|
| 104 |
|
| 105 |
+
Adapter: **CC BY-NC 4.0** (attribution, non-commercial). Base model: Llama 3.1 license.
|
| 106 |
+
Intended for research and evaluation in Software Engineering.
|
| 107 |
|
| 108 |
+
(c) 2026 Core Labs R&D.
|