SpermAI commited on
Commit
0449a78
·
verified ·
1 Parent(s): d682890

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +28 -64
README.md CHANGED
@@ -1,81 +1,45 @@
1
  ---
 
2
  language:
3
  - en
4
- license: apache-2.0
5
  tags:
6
- - qwen3
7
  - reasoning
8
  - math
9
- - coding
10
- - autonomous-learning
11
- - chain-of-thought
12
- - distillation
13
- - reinforcement-learning
14
- base_model: Qwen/Qwen3-0.6B
 
 
15
  pipeline_tag: text-generation
16
- datasets:
17
- - microsoft/orca-math-word-problems-200k
18
- - meta-math/MetaMathQA
19
- - theblackcat102/evol-code-alpaca-v1
20
- - nickrosh/Evol-Instruct-Code-80k-v1
21
- library_name: transformers
22
  model-index:
23
- - name: SpermLLM-S1-Qwen3
24
- results:
25
- - task:
26
- type: text-generation
27
- dataset:
28
- name: GSM8K
29
- type: gsm8k
30
- metrics:
31
- - type: accuracy
32
- value: TBD
33
- - task:
34
- type: text-generation
35
- dataset:
36
- name: HumanEval
37
- type: openai_humaneval
38
- metrics:
39
- - type: pass@1
40
- value: TBD
41
  ---
42
 
43
- <div align="center">
44
-
45
- # 🧬 SpermLLM-S1
46
-
47
- ### *Autonomous Learning Meets Small Language Models*
48
-
49
- [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
50
- [![Model Size](https://img.shields.io/badge/Size-0.6B_params-green.svg)]()
51
- [![Base Model](https://img.shields.io/badge/Base-Qwen3--0.6B-orange.svg)](https://huggingface.co/Qwen/Qwen3-0.6B)
52
- [![Distillation](https://img.shields.io/badge/Distilled_from-70B_Teacher-purple.svg)]()
53
-
54
- *A 0.6B parameter model that MIGHT punch above it weight class*
55
 
56
- [🤗 Model](https://huggingface.co/SpermAI/SpermLLM-S1-Qwen3) • [📦 GGUF](https://huggingface.co/SpermAI/SpermLLM-S1-Qwen3-GGUF)
 
 
 
 
 
 
57
 
58
- </div>
59
 
60
- ---
61
-
62
- ## 🎯 What Makes SpermLLM Different?
63
-
64
- **SpermLLM** isn't just another fine-tuned model. It's trained through **autonomous learning**:
65
-
66
- 1. 🌐 **Self-Discovers Problems** - Scrapes math and coding challenges from the web
67
- 2. 🧠 **Learns from 120B Teachers** - Gets solutions from GPT-OSS-120B via distillation
68
- 3. 🔄 **Continuous Self-Improvement** - Trains on its failures, gets smarter over time
69
- 4. 🛡️ **Benchmark Decontaminated** - Zero test set leakage (proven via n-gram analysis)
70
-
71
- Unlike traditional fine-tuning, SpermLLM **generates its own curriculum** and learns **continuously**.
72
-
73
- ---
74
 
75
- ## 🏆 Performance
76
 
77
- Not yet known. We will test it
 
 
78
 
79
- *Note: Benchmarks pending*
80
 
81
- **Key Strength:** Step-by-step reasoning (uses `<think>` tags for chain-of-thought)
 
1
  ---
2
+ license: apache-2.0
3
  language:
4
  - en
 
5
  tags:
6
+ - distillation
7
  - reasoning
8
  - math
9
+ - code
10
+ - science
11
+ - gguf
12
+ - spermllm
13
+ - qwen
14
+ - small-language-model
15
+ base_model:
16
+ - Qwen/Qwen3-0.6B-Instruct
17
  pipeline_tag: text-generation
 
 
 
 
 
 
18
  model-index:
19
+ - name: SpermLLM
20
+ results: []
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
  ---
22
 
23
+ # 🧠 SpermLLM — Distilled Reasoning Model
 
 
 
 
 
 
 
 
 
 
 
24
 
25
+ <p align="center">
26
+ <img src="https://img.shields.io/badge/Parameters-0.5B-blue" alt="Parameters">
27
+ <img src="https://img.shields.io/badge/Teacher-Kimi_K2.5_(70B)-green" alt="Teacher">
28
+ <img src="https://img.shields.io/badge/Method-Auto_Distillation-orange" alt="Method">
29
+ <img src="https://img.shields.io/badge/Format-GGUF-red" alt="Format">
30
+ <img src="https://img.shields.io/badge/License-Apache_2.0-purple" alt="License">
31
+ </p>
32
 
33
+ SpermLLM is a compact distilled reasoning model based on **Qwen3-0.6B-Instruct**, designed to improve performance in math, coding, and structured reasoning while remaining lightweight and efficient.
34
 
35
+ ## Training Method
 
 
 
 
 
 
 
 
 
 
 
 
 
36
 
37
+ The model was fine-tuned on a mixture of curated instruction datasets and further distilled from larger teacher models (Mix of GPT-OSS-120B and Kimi K2.5)
38
 
39
+ ## Training Overview
40
+ - **Base Model**: Qwen3 0.6B Instruct
41
+ - **Training Method**: SFT (Supervised Finetuning) + Distillation
42
 
43
+ ## Notes
44
 
45
+ SpermLLM is an experimental model, We plan on making this larger and better! Currently no benchmarks but benchmarks will be soon!