Upload README.md with huggingface_hub
README.md
CHANGED
@@ -324,13 +324,13 @@ print(response)
 
 This model is the result of systematic exploration across 8 architectural variants. Key insight: **training methodology matters more than parameter count at this scale.** The 3-phase pipeline (Pretrain → Distill → Finetune) enabled a 10M model to outperform 30M models trained with a standard 2-phase approach.
 
-For the full experimental report, see our [technical report](docs/technical_report.md).
+For the full experimental report, see our [technical report](https://github.com/xaskasdf/brandon-tiny/blob/master/docs/technical_report.md).
 
 ## Links
 
 - **Website:** [naranjositos.tech](https://naranjositos.tech/)
 - **Code:** [github.com/xaskasdf/brandon-tiny](https://github.com/xaskasdf/brandon-tiny)
-- **Technical Report:** [
+- **Technical Report:** [technical_report.md](https://github.com/xaskasdf/brandon-tiny/blob/master/docs/technical_report.md)
 - **Instruction Dataset:** [xaskasdf/brandon-tiny-instruct](https://huggingface.co/datasets/xaskasdf/brandon-tiny-instruct)
 - **Synthetic Pretrain Data:** [xaskasdf/brandon-tiny-synthetic-pretrain](https://huggingface.co/datasets/xaskasdf/brandon-tiny-synthetic-pretrain)
 
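
Both changed lines in this diff replace a repository-relative link target (`docs/technical_report.md`) with an absolute GitHub URL. The same rewrite could be applied mechanically before uploading a README. The sketch below is one possible way to do it, not the method used for this commit: the function name `absolutize_links`, the regular expression, and the choice to skip images are all assumptions made for illustration, and `BASE_URL` is simply copied from the absolute links in the diff.

```python
import re

# Assumed base, copied from the absolute links that appear in the diff.
BASE_URL = "https://github.com/xaskasdf/brandon-tiny/blob/master/"

# Inline Markdown links of the form [text](target).
# The negative lookbehind skips images (![alt](src)), which this sketch leaves alone.
LINK_RE = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]+)\)")

# A target that starts with a URL scheme (https:, mailto:, ...) or with "#".
ABSOLUTE_OR_ANCHOR_RE = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*:|#)")


def absolutize_links(markdown: str, base_url: str = BASE_URL) -> str:
    """Rewrite repository-relative link targets to absolute URLs."""

    def fix(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        # Leave absolute URLs and in-page anchors untouched.
        if ABSOLUTE_OR_ANCHOR_RE.match(target):
            return match.group(0)
        # Drop a leading "./" so the joined URL has no stray path segment.
        if target.startswith("./"):
            target = target[2:]
        return f"[{text}]({base_url}{target})"

    return LINK_RE.sub(fix, markdown)


if __name__ == "__main__":
    old_line = "see our [technical report](docs/technical_report.md)."
    print(absolutize_links(old_line))
```

Links that already carry a scheme, such as the website and dataset entries under `## Links`, pass through unchanged, so running the function on an already-fixed README is a no-op.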