BoomJules/molly-software-engineering

@@ -1,17 +1,108 @@
 ---
-base_model: meta-llama/Llama-3.1-8B-Instruct
 library_name: peft
-tags:
-- lora
-- molly-os
-- specialist
-- software_engineering
 license: cc-by-nc-4.0
 ---
-# Molly OS — Specialist Adapter: Software Engineering
-Frontier-distilled LoRA specialist for the Molly OS model-agnostic orchestration layer.
-Base `meta-llama/Llama-3.1-8B-Instruct`, LoRA rank 32. Domain: Software Engineering.
-© 2026 Corelabs Group.

 ---
 library_name: peft
+base_model: meta-llama/Llama-3.1-8B-Instruct
 license: cc-by-nc-4.0
+pipeline_tag: text-generation
+tags:
+  - lora
+  - peft
+  - molly-os
+  - software-engineering
 ---
+# Molly OS - Specialist Adapter: Software Engineering
+Frontier-distilled **LoRA specialist** (PEFT, rank 32; target modules
+`q_proj`, `k_proj`, `v_proj`, `o_proj`) for the Molly OS model-agnostic
+orchestration layer. Base model: **[meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)**.
+Domain: **Software Engineering**.
+Adapter weights are released under **CC BY-NC 4.0**. The base model is governed by
+its own license, the Llama 3.1 Community License.
+## Before you run: the base model is gated
+This adapter needs the base weights, and the base is **access-gated**. Do this **once**:
+1. Open the base page and **accept its license**: <https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct>
+2. Create a **read token**: <https://huggingface.co/settings/tokens>
+3. Make the token available to your environment:
+   - **Google Colab:** open the **Secrets** panel (key icon, left sidebar) -> *Add new secret* -> Name `HF_TOKEN`, paste the value, enable **Notebook access**.
+   - **Kaggle:** *Add-ons -> Secrets* -> add `HF_TOKEN`.
+   - **Local:** run `huggingface-cli login` (`hf auth login` in recent `huggingface_hub` releases) or `export HF_TOKEN=...`.
+If you skip this, loading the **base** fails with `GatedRepoError` / `401 Unauthorized`.
+A stored Colab secret may not be picked up automatically (it depends on your `huggingface_hub` version and on **Notebook access** being granted), so authenticate explicitly in code (see below).
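The lookup order used by the snippets in this card (Colab secret, then environment variable, then interactive prompt) can be sketched as a small helper. This is only an illustration: `resolve_hf_token` and its `colab_get` / `env` parameters are hypothetical names, not part of `huggingface_hub`.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_hf_token(
    colab_get: Optional[Callable[[str], Optional[str]]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty HF_TOKEN found, or None to signal 'prompt the user'.

    colab_get is a hypothetical hook; on Colab you would pass
    google.colab.userdata.get, elsewhere leave it as None.
    """
    env = os.environ if env is None else env
    if colab_get is not None:
        try:
            token = colab_get("HF_TOKEN")
            if token:
                return token
        except Exception:
            # Secret missing, or notebook access not granted: fall through.
            pass
    # An empty string is treated the same as "not set".
    return env.get("HF_TOKEN") or None
```

The result can be passed straight to `huggingface_hub.login(...)`, which falls back to an interactive prompt when given `None`.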
+## Quickstart
+```python
+# pip install -U transformers peft accelerate
+import os
+from huggingface_hub import login
+# Authenticate (Colab secret -> env var -> interactive prompt)
+try:
+    from google.colab import userdata
+    login(userdata.get("HF_TOKEN"))
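+except Exception:
+    # Not on Colab, or the secret is unavailable: fall back to the environment.
+    hf_token = os.environ.get("HF_TOKEN")
+    if hf_token:
+        login(hf_token)
+    else:
+        login()  # interactive prompt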
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+BASE = "meta-llama/Llama-3.1-8B-Instruct"
+ADAPTER = "BoomJules/molly-software-engineering"
+tok = AutoTokenizer.from_pretrained(BASE)
+base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
+model = PeftModel.from_pretrained(base, ADAPTER).eval()
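+msgs = [{"role": "user", "content": "Your question here"}]
+# return_dict=True yields input_ids and attention_mask regardless of the library default
+inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_dict=True,
+                                 return_tensors="pt").to(model.device)
+out = model.generate(**inputs, max_new_tokens=300)
+# Decode only the newly generated tokens, skipping the prompt
+print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))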
+```
+## Low-VRAM (4-bit) - fits a free Colab/Kaggle GPU (~6-7 GB)
+Use a **GPU runtime** (Colab: *Runtime -> Change runtime type -> T4 GPU*).
+```python
+# pip install -U transformers peft accelerate bitsandbytes
+import os, torch
+from huggingface_hub import login
+try:
+    from google.colab import userdata
+    login(userdata.get("HF_TOKEN"))
+except Exception:
+    login(os.environ.get("HF_TOKEN"))
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+from peft import PeftModel
+BASE = "meta-llama/Llama-3.1-8B-Instruct"
+ADAPTER = "BoomJules/molly-software-engineering"
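+# NF4 with double quantization. On GPUs without native bfloat16 (e.g. T4),
+# torch.float16 may be the faster compute dtype.
+bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
+                         bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True)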
+tok = AutoTokenizer.from_pretrained(BASE)
+base = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
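+model = PeftModel.from_pretrained(base, ADAPTER).eval()
+# Generate the same way as in the Quickstart
+msgs = [{"role": "user", "content": "Your question here"}]
+inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_dict=True,
+                                 return_tensors="pt").to(model.device)
+out = model.generate(**inputs, max_new_tokens=300)
+print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))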
+```
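As a rough sanity check on the memory figure in the heading, the weights-only footprint can be estimated with back-of-envelope arithmetic. The sketch below assumes about 8.03 billion parameters for the base model (an approximation, not stated in this card) and counts weights only. Actual usage is higher, since embeddings and the output head typically stay unquantized and activations and the KV cache need memory too.

```python
GIB = 2**30
PARAMS = 8.03e9  # assumed approximate parameter count of the 8B base


def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB


for label, bits in [("bf16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_gib(PARAMS, bits):.1f} GiB of weights")
```

Under these assumptions the weights alone come to roughly 15 GiB in bf16 versus under 4 GiB at 4 bits, which is why the quantized load fits on a small GPU once overhead is added.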
+## Troubleshooting
+- **`GatedRepoError` / `401 Unauthorized`** - base license not accepted, or `HF_TOKEN`
+  missing/invalid, or you stored the Colab secret but did not call `login(...)` in code.
+- **CUDA out of memory** - use the 4-bit snippet and a GPU runtime.
+- **Adapter seems to have no effect** - confirm the base id matches `base_model` above.
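The base-id check from the last troubleshooting item can be sketched as follows. PEFT adapters normally ship an `adapter_config.json` with a `base_model_name_or_path` field. This sketch assumes this repository follows that convention and that you have a local copy of the file; `adapter_base_matches` is a hypothetical helper name.

```python
import json
from pathlib import Path

EXPECTED_BASE = "meta-llama/Llama-3.1-8B-Instruct"


def adapter_base_matches(config_path, expected_base=EXPECTED_BASE):
    """Compare the base recorded in a PEFT adapter_config.json with the expected id.

    Returns (matches, recorded_base). recorded_base is None when the field is absent.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    recorded = config.get("base_model_name_or_path")
    return recorded == expected_base, recorded
```

One way to obtain the file is `huggingface_hub.hf_hub_download("BoomJules/molly-software-engineering", "adapter_config.json")`, which returns a local path that can be passed to the helper.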
+## License & intended use
+Adapter: **CC BY-NC 4.0** (attribution, non-commercial). Base model: Llama 3.1 license.
+Intended for research and evaluation in Software Engineering.
+(c) 2026 Core Labs R&D.