hzeng412 committed
Commit c4ed9c8 · verified · 1 Parent(s): dd52ff5

Update README.md

Files changed (1)
  1. README.md +11 -50
README.md CHANGED
@@ -1,53 +1,14 @@
- ---
- title: README
- emoji: 🔥
- colorFrom: pink
- colorTo: indigo
- sdk: static
- pinned: false
- ---
-
- Moxin 7B: A Fully Open-Source 7B Language Model with Unprecedented Transparency
-
- We’re thrilled to unveil Moxin 7B, a new milestone in open large language model (LLM) development designed to push the boundaries of performance and openness.
-
- In an era where many "open" LLMs lack true transparency (e.g., missing training code, data, or restrictive licenses), Moxin 7B sets a new gold standard by committing to full disclosure and reproducibility.
- Developed under the Model Openness Framework (MOF), Moxin 7B achieves the top classification level of Open Science, thanks to:
-
- **What we’ve open-sourced**:
-
- - Pre-training code, data, and Moxin Base model.
-
- + Post-training code, data, and Moxin Instruct model.
-
- + RL code with GRPO, data and Moxin Reasoning model.
-
- **Performance Highlights**:
-
- + Zero-shot / Few-shot: Outperforms Mistral, Qwen, and LLaMA on tasks like HellaSwag, ARC, MMLU, and PIQA
-
- + Reasoning: Moxin-Reasoning-7B achieves superior performance on MATH-500, AMC, and OlympiadBench — proving reinforcement learning can work for small 7B models
-
- + Training cost: ~$160K for full pretraining — efficient and reproducible at scale
-
- **Post-training Frameworks**:
-
- + SFT and DPO with Tülu 3
-
- + CoT-enhanced reasoning with GRPO via DeepScaleR
-
- **Get the models and code**:
-
- + Base model: Moxin-LLM-7B
-
- + Instruction model: Moxin-Instruct-7B
-
- + Reasoning model: Moxin-Reasoning-7B
-
- + Code & docs: github.com/moxin-org/Moxin-LLM
-
- + Arxiv paper: https://arxiv.org/abs/2412.06845
-
- We believe this is a step toward a more transparent, reproducible, and innovation-friendly AI ecosystem — especially for researchers, developers, and startups looking to build upon a robust, open foundation.
- Let’s build open AI the right way.
 
 
 
+ ---
+ title: README
+ emoji: 🔥
+ colorFrom: pink
+ colorTo: indigo
+ sdk: static
+ pinned: false
+ ---
+
+ Introducing Moxin 7B: a truly open, state-of-the-art LLM and VLM that redefines transparency.
+
+ We've open-sourced everything: pre-training code, data, and models, including our GRPO-enhanced Reasoning model. It outperforms Mistral, Qwen, and LLaMA on zero-shot and few-shot tasks and delivers superior reasoning on challenging math benchmarks, all at a full-pretraining cost of roughly $160K.
+
+ Unleash the power of reproducible AI 🚀. Interested? Explore the models and code on our [GitHub](https://github.com/moxin-org/Moxin-LLM) and read the full paper on [arXiv](https://arxiv.org/abs/2412.06845).
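The new README highlights the GRPO-enhanced Reasoning model. For orientation, here is a minimal sketch of the core idea behind GRPO (Group Relative Policy Optimization): advantages are computed relative to a group of responses sampled for the same prompt, in place of a learned value function. This is an illustration of the general algorithm, not Moxin's training code, and the 0/1 reward here is a placeholder.

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages: normalize each response's reward
    against the mean/std of the G responses sampled for the same prompt.
    GRPO uses these in place of a learned critic/value function."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Toy example: rewards for G = 4 responses to one math prompt
# (placeholder reward: 1.0 if the final answer is correct, else 0.0).
group_rewards = np.array([1.0, 0.0, 0.0, 1.0])
print(grpo_advantages(group_rewards))  # [ 1. -1. -1.  1.]
# Correct responses get positive advantages and incorrect ones negative,
# so the policy gradient shifts probability toward correct solutions.
```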
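For readers who want to try the released checkpoints, below is a minimal inference sketch using the Hugging Face `transformers` library. The repo ID is an assumption inferred from the model names in this README; the GitHub repository above lists the exact identifiers.

```python
# Minimal inference sketch with Hugging Face `transformers`.
# NOTE: the repo ID below is an assumption based on the model names in
# this README; see https://github.com/moxin-org/Moxin-LLM for the exact
# Hugging Face identifiers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moxin-org/Moxin-7B-Instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model fits on one ~24 GB GPU in bf16
    device_map="auto",
)

prompt = "Explain what makes a language model 'fully open source'."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```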