yizheapple commited on
Commit
f17846d
·
verified ·
1 Parent(s): 313aae9

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apple-amlr
3
+ base_model:
4
+ - Qwen/Qwen3-4B
5
+ tags:
6
+ - self-distillation
7
+ - code-generation
8
+ - ssd
9
+ library_name: transformers
10
+ ---
11
+
12
+ # SSD-Qwen3-4B-Instruct
13
+
14
+ This model was produced using **Simple Self-Distillation (SSD)**, a method that improves code generation by fine-tuning a language model on its own sampled outputs—without rewards, verifiers, teacher models, or reinforcement learning.
15
+
16
+ - **Base model:** [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)
17
+ - **Variant:** instruct
18
+ - **Self-distillation sampling:** temperature=1.1, top_p=0.8, top_k=20
19
+
20
+ ## Method
21
+
22
+ SSD samples solutions from the base model using non-unit temperature and top-k/top-p truncation, then fine-tunes on those samples via standard supervised learning. Despite its simplicity, SSD yields large gains on competitive programming benchmarks, with improvements concentrating on harder problems. The mechanism traces to resolving a *precision–exploration conflict*: SSD reshapes token distributions in a context-dependent way so that a single global decoding configuration becomes far more effective at evaluation time.
23
+
24
+ ## Paper
25
+
26
+ **Embarrassingly Simple Self-Distillation Improves Code Generation**
27
+
28
+ Ruixiang Zhang, Richard He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang
29
+
30
+ ## Usage
31
+
32
+ ```python
33
+ from transformers import AutoModelForCausalLM, AutoTokenizer
34
+
35
+ model = AutoModelForCausalLM.from_pretrained("apple/SSD-Qwen3-4B-Instruct")
36
+ tokenizer = AutoTokenizer.from_pretrained("apple/SSD-Qwen3-4B-Instruct")
37
+ ```
38
+
39
+ ## License
40
+
41
+ This model is released under the [Apple Sample Code License](https://huggingface.co/apple/CLaRa-7B-Instruct/blob/main/LICENSE).