---
license: mit
---

# Mit-ThinkDeeply-3B-GGUF

## Model Description

**Mit-ThinkDeeply** is the advanced version of the Mit series of large language models (LLMs) developed by WinkingFace. Built on the foundation of the Mit base model, **Mit-ThinkDeeply** introduces enhanced reasoning capabilities, superior contextual understanding, and refined function-calling precision. The model is designed to combine intuitive conversational abilities with multi-step reasoning, making it well suited to complex analytical tasks, structured problem-solving, and high-stakes decision-making.

Key features of **Mit-ThinkDeeply** include:

- **Advanced Reasoning**: Generates long chains of thought to analyze problems in depth and provide well-reasoned solutions.
- **Enhanced Contextual Awareness**: Improved coherence across multi-turn conversations and long-form interactions.
- **Function Calling Precision**: Optimized for reliable, accurate execution of tool calls, enabling integration with external APIs and services.
- **Versatile Use Cases**: Adaptable to both standard conversational tasks and complex reasoning scenarios, including mathematical problem-solving, code generation, and structured output generation.
- **Long Context Support**: Context lengths of up to 128K tokens on the 7B variant (32K on the smaller variants; see the table below), for applications requiring extensive input data.

**Mit-ThinkDeeply** has undergone extensive architectural refinement and fine-tuning to align with real-world applications. Our training process emphasizes deeper contextual awareness, response coherence, and reliable function-call execution, making **Mit-ThinkDeeply** a powerful and versatile AI system.

## Quickstart

We recommend using our customized llama.cpp build:

```bash
git clone https://github.com/WinkingFaceAI/lmc-recooked.git
```

In the following demonstration, we assume you are running commands from inside the `lmc-recooked` repository.

Since cloning the entire model repo may be inefficient, you can manually download just the GGUF file you need, or use `huggingface-cli`:

1. Install `huggingface_hub`:

   ```shell
   pip install -U huggingface_hub
   ```

2. Download the file:

   ```shell
   huggingface-cli download WinkingFace/Mit-ThinkDeeply-3B-gguf Mit-ThinkDeeply-3B-q8_0.gguf --local-dir . --local-dir-use-symlinks False
   ```
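
After downloading, it can be worth sanity-checking that the file really is a GGUF container before pointing llama.cpp at it. The sketch below (the `is_gguf` helper is our own illustration, not part of any library) checks the 4-byte ASCII `GGUF` magic and the little-endian version field at the start of the file:

```python
import struct

def is_gguf(path: str) -> bool:
    # A GGUF file begins with the 4-byte ASCII magic "GGUF",
    # followed by a little-endian uint32 format version.
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False
    (version,) = struct.unpack("<I", header[4:8])
    return version >= 1
```

A truncated or interrupted download will typically fail this check immediately, which is cheaper than waiting for llama.cpp to error out at load time.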

For a chatbot-like experience, we recommend starting in conversation mode:

```shell
./llama-cli -m <gguf-file-path> \
    -co -cnv -p "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem." \
    -fa -ngl 80 -n 512
```
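
Because the system prompt above asks the model to wrap its reasoning in `<think_deeply>` tags, downstream code usually wants to separate the reasoning trace from the final answer. A minimal sketch (the `split_reasoning` helper is our own, assuming at most one well-formed tag pair per response):

```python
import re

# Matches one <think_deeply>...</think_deeply> block, including newlines.
_THINK_RE = re.compile(r"<think_deeply>(.*?)</think_deeply>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a model response.

    If no tags are present, reasoning is empty and the whole
    text is treated as the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = _THINK_RE.sub("", text, count=1).strip()
    return reasoning, answer
```

This lets an application log or hide the chain of thought while showing users only the final response.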

## Evaluation & Performance

<div align="center">

| Category | Benchmark (Metric) | Mit-ThinkDeeply-0.5B | Mit-ThinkDeeply-1.5B | Mit-ThinkDeeply-3B | Mit-ThinkDeeply-7B |
|----------|--------------------|----------------------|----------------------|--------------------|--------------------|
| | Context Length | 32K | 32K | 32K | 128K |
| | Generation Length | 8K | 8K | 8K | 8K |
| General | MMLU | 45.4 | 58.9 | 63.8 | 72.6 |
| | MMLU-Pro | 13.8 | 26.6 | 33.0 | 43.7 |
| | MMLU-redux | 43.1 | 56.8 | 62.7 | 70.3 |
| | BBH | 18.3 | 41.7 | 64.9 | 68.1 |
| | ARC-C | 32.9 | 56.0 | 57.5 | 65.8 |
| Code | LiveCodeBench | 11.5 | 21.4 | 25.9 | 36.2 |
| | HumanEval | 25.4 | 44.6 | 51.6 | 69.5 |
| | HumanEval+ | 29.7 | 38.1 | 43.9 | 60.7 |
| | MBPP | 46.3 | 74.2 | 69.9 | 82.9 |
| | MBPP+ | 36.8 | 59.5 | 59.3 | 70.2 |
| | MultiPL-E | 24.9 | 51.7 | 49.6 | 58.1 |
| Mathematics | GPQA | 25.1 | 29.0 | 31.5 | 40.7 |
| | TheoremQA | 18.2 | 23.2 | 27.9 | 39.4 |
| | MATH | 25.4 | 38.1 | 46.7 | 54.8 |
| | MATH-500 | 62.5 | 79.2 | 88.4 | 94.6 |
| | MMLU-STEM | 43.3 | 65.8 | 75.1 | 81.3 |
| | GSM8K | 45.8 | 70.1 | 81.5 | 86.2 |

</div>

## Citation

If you find our work helpful, feel free to cite us:

```bibtex
@misc{mit-thinkdeeply,
  title  = {Mit-ThinkDeeply: Advanced Reasoning and Contextual Awareness in Large Language Models},
  author = {WinkingFace Team},
  year   = {2025},
  url    = {https://huggingface.co/WinkingFace/Mit-ThinkDeeply-7B}
}
```

## Contact

For any questions or inquiries, feel free to [contact us here 📨](mailto:contact@winkingfacehub.com).