Update README.md
README.md
CHANGED
@@ -14,7 +14,8 @@ tags:
 ---
 # MistralThinker Model Card
 
-Please, read this: https://huggingface.co/Undi95/MistralThinker-v1.1/discussions/1
+Please, read this: https://huggingface.co/Undi95/MistralThinker-v1.1/discussions/1 \
+Prefill required for the Assistant: `<think>\n`
 
 ## Model Description
 
@@ -63,7 +64,7 @@ This model is a specialized variant of **Mistral-Small-24B-Base-2501**, adapted
 
 - **Limitations & Bias:**
 - **Hallucination:** It can generate fictitious information in the thinking process, but still end up with a succesfull reply.
-- **Thinking can be dismissed:** Being a distillation of DeepSeek R1 is essence, this model, even trained on Base, could forget to add `<think
+- **Thinking can be dismissed:** Being a distillation of DeepSeek R1 is essence, this model, even trained on Base, could forget to add `<think>\n` in some scenario.
 
 ## Ethical Considerations
 
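
The change above asks for the assistant turn to be prefilled with `<think>\n`. A minimal sketch of what that could look like on the client side follows. The helper names (`add_think_prefill`, `restore_think_prefix`) and the example prompt layout are hypothetical and not taken from the README; the model's actual chat template should be used in place of the placeholder prompt.

```python
# Prefill string quoted in the README change.
THINK_PREFILL = "<think>\n"


def add_think_prefill(prompt: str) -> str:
    """Append the prefill to a prompt that ends at the start of the assistant turn.

    Hypothetical helper: it only does string concatenation and does not
    know anything about the model's real chat template.
    """
    if prompt.endswith(THINK_PREFILL):
        return prompt  # already prefilled, do not add it twice
    return prompt + THINK_PREFILL


def restore_think_prefix(completion: str) -> str:
    """Put the prefill back in front of the generated text.

    When the prompt is prefilled, the model continues after `<think>\n`,
    so the raw completion usually does not contain the opening tag.
    """
    if completion.startswith(THINK_PREFILL):
        return completion
    return THINK_PREFILL + completion


# Placeholder prompt layout (an assumption, not the model's template).
example_prompt = "USER: What is 2 + 2?\nASSISTANT: "
prefilled = add_think_prefill(example_prompt)
```

The prefilled string would then be sent to whatever completion backend is in use, and `restore_think_prefix` applied to the returned text before parsing the thinking block.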