BoomJules committed on
Commit 635bf5a · verified · 1 Parent(s): 8565236

card: gated-base quickstart + auth + 4-bit (Fable5 review)

Files changed (1)
  1. README.md +101 -10
README.md CHANGED
@@ -1,17 +1,108 @@
  ---
- base_model: meta-llama/Llama-3.1-8B-Instruct
  library_name: peft
- tags:
- - lora
- - molly-os
- - specialist
- - software_engineering
  license: cc-by-nc-4.0
  ---

- # Molly OS Specialist Adapter: Software Engineering

- Frontier-distilled LoRA specialist for the Molly OS model-agnostic orchestration layer.
- Base `meta-llama/Llama-3.1-8B-Instruct`, LoRA rank 32. Domain: Software Engineering.

- © 2026 Corelabs Group.
 
  ---
  library_name: peft
+ base_model: meta-llama/Llama-3.1-8B-Instruct
  license: cc-by-nc-4.0
+ pipeline_tag: text-generation
+ tags:
+ - lora
+ - peft
+ - molly-os
+ - software-engineering
  ---

+ # Molly OS - Specialist Adapter: Software Engineering
+
+ Frontier-distilled **LoRA specialist** (PEFT, rank 32; target modules
+ `q_proj`, `k_proj`, `v_proj`, `o_proj`) for the Molly OS model-agnostic
+ orchestration layer. Base model: **[meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)**.
+ Domain: **Software Engineering**.
+
+ Adapter weights are released under **CC BY-NC 4.0**. The base model is governed by
+ its own (Llama 3.1) license.
+
+ ## Before you run: the base model is gated
+
+ This adapter needs the base weights, and the base is **access-gated**. Do this **once**:
+
+ 1. Open the base page and **accept its license**: <https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct>
+ 2. Create a **read token**: <https://huggingface.co/settings/tokens>
+ 3. Make the token available to your environment:
+    - **Google Colab:** open the **Secrets** panel (key icon, left sidebar) -> *Add new secret* -> Name `HF_TOKEN`, paste the value, enable **Notebook access**.
+    - **Kaggle:** *Add-ons -> Secrets* -> add `HF_TOKEN`.
+    - **Local:** run `huggingface-cli login` or `export HF_TOKEN=...`.
+
+ If you skip this you will get `GatedRepoError` / `401 Unauthorized` when the **base** loads.
+ A stored Colab secret is **not** used automatically - you must authenticate in code (see below).
+
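Note on the hunk above: the lookup order it describes (Colab secret, then environment variable, then interactive prompt) can be factored into one helper that both snippets could share. A minimal sketch, assuming a hypothetical helper name `resolve_hf_token` and an injectable `env` mapping; neither is part of the README or of `huggingface_hub`. It only locates the token and leaves the `login(...)` call to the caller.

```python
import os


def resolve_hf_token(env=None):
    """Return a Hugging Face token from a Colab secret or the environment, else None.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    env = os.environ if env is None else env
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata

        token = userdata.get("HF_TOKEN")
        if token:
            return token
    except Exception:
        # Not running in Colab, secret missing, or notebook access disabled.
        pass
    # An empty string counts as "no token".
    return env.get("HF_TOKEN") or None
```

The result can then be passed on with `login(token) if token else login()`, which matches what the Quickstart does inline.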
+ ## Quickstart
+
+ ```python
+ # pip install -U transformers peft accelerate
+ import os
+ from huggingface_hub import login
+
+ # Authenticate (Colab secret -> env var -> interactive prompt)
+ try:
+     from google.colab import userdata
+     login(userdata.get("HF_TOKEN"))
+ except Exception:
+     token = os.environ.get("HF_TOKEN")
+     login(token) if token else login()
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ BASE = "meta-llama/Llama-3.1-8B-Instruct"
+ ADAPTER = "BoomJules/molly-software-engineering"
+
+ tok = AutoTokenizer.from_pretrained(BASE)
+ base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
+ model = PeftModel.from_pretrained(base, ADAPTER).eval()
+
+ msgs = [{"role": "user", "content": "Your question here"}]
+ inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True).to(model.device)
+ out = model.generate(**inputs, max_new_tokens=300)
+ print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+
+ ## Low-VRAM (4-bit) - fits a free Colab/Kaggle GPU (~6-7 GB)
+
+ Use a **GPU runtime** (Colab: *Runtime -> Change runtime type -> T4 GPU*).
+
+ ```python
+ # pip install -U transformers peft accelerate bitsandbytes
+ import os, torch
+ from huggingface_hub import login
+ try:
+     from google.colab import userdata
+     login(userdata.get("HF_TOKEN"))
+ except Exception:
+     login(os.environ.get("HF_TOKEN"))
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ from peft import PeftModel
+
+ BASE = "meta-llama/Llama-3.1-8B-Instruct"
+ ADAPTER = "BoomJules/molly-software-engineering"
+
+ bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
+                          bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True)
+ tok = AutoTokenizer.from_pretrained(BASE)
+ base = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
+ model = PeftModel.from_pretrained(base, ADAPTER).eval()
+ ```
+
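Note on the "~6-7 GB" figure in the heading above: most of it is weight storage, which can be sanity-checked with simple arithmetic. A rough sketch, assuming about 8.03 billion parameters for the base model and counting only weight bits. It ignores quantization constants, layers kept in higher precision, activations, the KV cache and CUDA overhead, so real usage lands above the numbers it prints. The function name is hypothetical.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB for n_params at bits_per_param."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: ~8.03e9 parameters for Llama-3.1-8B-Instruct (approximate).
N_PARAMS = 8.03e9

bf16_gib = weight_memory_gib(N_PARAMS, 16)  # Quickstart snippet (bfloat16)
nf4_gib = weight_memory_gib(N_PARAMS, 4)    # 4-bit snippet (NF4)

print(f"bf16 weights: ~{bf16_gib:.1f} GiB")
print(f"4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the bf16 weights alone come to roughly 15 GiB, while the 4-bit weights stay under 4 GiB, which leaves headroom for the overhead listed above.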
+ ## Troubleshooting
+
+ - **`GatedRepoError` / `401 Unauthorized`** - base license not accepted, or `HF_TOKEN`
+   missing/invalid, or you stored the Colab secret but did not call `login(...)` in code.
+ - **CUDA out of memory** - use the 4-bit snippet and a GPU runtime.
+ - **Adapter seems to have no effect** - confirm the base id matches `base_model` above.
+
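Note on the last bullet above: a PEFT adapter repository records the base id it was trained against in `adapter_config.json` under `base_model_name_or_path`, so the match can be checked rather than eyeballed. A minimal sketch, assuming the config has already been downloaded and read as text; the helper name and the sample JSON are hypothetical and only mirror the rank and target modules stated in this README.

```python
import json

BASE = "meta-llama/Llama-3.1-8B-Instruct"


def adapter_matches_base(adapter_config_text: str, base_id: str) -> bool:
    """Return True if the adapter config names base_id as its base model."""
    config = json.loads(adapter_config_text)
    return config.get("base_model_name_or_path") == base_id


# Hypothetical sample, shaped like a PEFT adapter_config.json.
sample = json.dumps({
    "base_model_name_or_path": "meta-llama/Llama-3.1-8B-Instruct",
    "peft_type": "LORA",
    "r": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
})

print(adapter_matches_base(sample, BASE))
```

With network access, `PeftConfig.from_pretrained(ADAPTER).base_model_name_or_path` from `peft` should expose the same field without handling the JSON by hand.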
+ ## License & intended use

+ Adapter: **CC BY-NC 4.0** (attribution, non-commercial). Base model: Llama 3.1 license.
+ Intended for research and evaluation in Software Engineering.

+ (c) 2026 Core Labs R&D.