---
base_model: openai/gpt-oss-20b
library_name: transformers
tags:
- lora
- sft
- tool-use
- gpt-oss
license: apache-2.0
---

# gpt-oss-claude-code

Fine-tuned [`openai/gpt-oss-20b`](https://huggingface.co/openai/gpt-oss-20b) for tool-use and agentic coding tasks. The LoRA adapters have been merged into the base weights.

## Quick start
```python
import re, torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "deburky/gpt-oss-claude-code",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("deburky/gpt-oss-claude-code")

messages = [{"role": "user", "content": "Who is Alan Turing?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:])

# Keep only the harmony `final` channel and strip remaining control tokens
if "<|channel|>final<|message|>" in response:
    response = response.split("<|channel|>final<|message|>")[-1]
print(re.sub(r"<\|[^>]+\|>", "", response).strip())
```
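The post-processing at the end of the snippet can be factored into a small standalone helper (hypothetical, not shipped with the repo) if you call the model repeatedly:

```python
import re

# Marker that opens the harmony `final` channel in gpt-oss output
FINAL_MARKER = "<|channel|>final<|message|>"

def extract_final(response: str) -> str:
    """Return the text of the final channel, stripped of <|...|> control tokens."""
    if FINAL_MARKER in response:
        response = response.split(FINAL_MARKER)[-1]
    return re.sub(r"<\|[^>]+\|>", "", response).strip()

raw = (
    "<|channel|>analysis<|message|>thinking...<|end|>"
    "<|channel|>final<|message|>Alan Turing was a mathematician.<|return|>"
)
print(extract_final(raw))  # → Alan Turing was a mathematician.
```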

## Apple Silicon (MLX)

A fused MLX version is available at [`deburky/gpt-oss-claude-mlx`](https://huggingface.co/deburky/gpt-oss-claude-mlx).

## Training

- **Data:** ~280 tool-use conversation examples in the gpt-oss harmony format
- **Method:** LoRA (rank 8, alpha 16) on attention and MoE expert layers, merged into the base weights after training
- **Learning rate:** 1e-4 with a cosine schedule
- **Final validation loss:** ~0.48
- **Hardware:** Google Colab
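For reference, the rank/alpha numbers above describe a low-rank update that merging folds into the frozen weights; a minimal illustration with stand-in matrices (the real training used standard LoRA tooling):

```python
import numpy as np

d, r, alpha = 64, 8, 16              # hidden size (illustrative), LoRA rank and alpha from above
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))      # frozen base weight
A = rng.standard_normal((r, d))      # low-rank factors (stand-ins for trained adapters)
B = rng.standard_normal((d, r))

delta = (alpha / r) * (B @ A)        # LoRA update, scaled by alpha / r
W_merged = W + delta                 # merging folds the update into the base weight

x = rng.standard_normal(d)
# Merged weights reproduce base + adapter path exactly, so inference needs no extra matmuls
assert np.allclose(W @ x + delta @ x, W_merged @ x)
print(np.linalg.matrix_rank(delta))  # → 8: the update has rank at most r
```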