Update README.md

Files changed (1) hide show

README.md CHANGED Viewed

@@ -37,8 +37,7 @@ For full transparency and reproducibility, please refer to our technical report
 ## Model Details
-The performance of **JT-Math-8B-Thinking** stems from a meticulous, multi-stage training approach aimed at tackling complex mathematical challenges with state-of-the-art accuracy. Building on the **JT-Math-8B-Base** model, its training pipeline involved **Supervised Fine-Tuning (SFT)** using a high-quality, bilingual dataset of intricate math problems. This SFT phase leveraged the model's native **32,768-token context window**, enabling it to comprehend lengthy premises, multi-step instructions, and problems with extensive background information right from the start. Following SFT, an advanced **Reinforcement Learning (RL)** phase further refined its reasoning capabilities. This RL process employed a multi-stage curriculum, gradually introducing problems of increasing difficulty, and was specifically engineered to boost the model's focus and accuracy across the entire 32K context window, ensuring the coherence and precision of even the longest reasoning chains.


37
38	## Model Details
39
40	+ JT-Math-8B-Thinking achieves its cutting-edge performance on complex mathematical challenges through a rigorous, multi-stage training methodology. Starting with the robust JT-Math-8B-Base model, our pipeline first implemented Supervised Fine-Tuning (SFT). This involved training on a high-quality, bilingual dataset of intricate math problems, capitalizing on the model's impressive native 32,768-token context window. This large context allowed the model to grasp lengthy problem descriptions, multi-step instructions, and extensive background details from the outset. Subsequently, an advanced Reinforcement Learning (RL) phase, incorporating a multi-stage curriculum of progressively harder problems, further honed its reasoning abilities.

41
42
43