RedSage employs a multi-stage training pipeline. This model represents the output of **Stage 2**.

1. Stage 1: Continual Pre-Training (CPT) -> [RedSage-Qwen3-8B-CFW](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-CFW) (CyberFineWeb data)
2. **Stage 2: Targeted Pre-Training** -> **`RedSage-Qwen3-8B-Base`** (Current Model)
   * *Data:* RedSage-Seed (\~150M Tokens) + RedSage-Dump (\~700M Tokens)
3. Stage 3: Supervised Fine-Tuning (SFT) -> [RedSage-Qwen3-8B-Ins](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-Ins)
4. Stage 4: Direct Preference Optimization (DPO) -> [RedSage-Qwen3-8B-DPO](https://huggingface.co/RISys-Lab/RedSage-Qwen3-8B-DPO)
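Since this checkpoint is a base model (Stage 2), it is used for plain text continuation rather than chat. A minimal loading sketch with Hugging Face `transformers` is shown below; the repo id `RISys-Lab/RedSage-Qwen3-8B-Base` is inferred from the sibling model links above, and the generation settings are illustrative assumptions, not official defaults.

```python
# Minimal sketch: load the Stage 2 base model and complete a prompt.
# Assumptions: repo id inferred from the links above; transformers and
# torch installed; enough GPU/CPU memory for an 8B model.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "RISys-Lab/RedSage-Qwen3-8B-Base"  # assumed repo id

def complete(prompt: str, max_new_tokens: int = 128) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Base model: no chat template, just free-form continuation.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(complete("Nmap is a network scanning tool that"))
```

For instruction-following or preference-aligned behavior, use the Stage 3 (`-Ins`) or Stage 4 (`-DPO`) checkpoints linked above instead.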
## Training Data: RedSage-Seed & Dump