---
license: mit
datasets:
- cheapresearch/CheapResearch-DS-33k
---
|
|
|
|
|
|
|
|
|
|
|
# FlashResearch-4B-Thinking |
|
|
|
|
|
<img src='cheap.png' width='700'> |
|
|
|
|
|
[Model: FlashResearch-4B-Thinking](https://huggingface.co/flashresearch/FlashResearch-4B-Thinking)


[License](#license)


[Dataset: CheapResearch-DS-33k](https://huggingface.co/datasets/cheapresearch/CheapResearch-DS-33k)
|
|
|
|
|
**A 4B-parameter Qwen model distilled from Tongyi DeepResearch-30B A3B**, optimized for web-scale “deep research” tasks and inference with **[Alibaba-NLP/DeepResearch](https://github.com/Alibaba-NLP/DeepResearch)**. |
|
|
|
|
|
* **Base**: Qwen 4B (dense) |
|
|
* **Teacher**: Tongyi DeepResearch 30B A3B (MoE) |
|
|
* **Method**: SFT distillation on **33k** curated deep-research examples |
|
|
* **Dataset**: [`cheapresearch/CheapResearch-DS-33k`](https://huggingface.co/datasets/cheapresearch/CheapResearch-DS-33k)
|
|
* **Primary Use**: Fast, low-cost **DeepResearch** agent runs (browsing, multi-step reasoning, source-grounded answers) |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
Results on Humanity's Last Exam (HLE) and SimpleQA:

<img src='hle.png' alt="Humanity's Last Exam (HLE) results" width='500'>


<img src='simpleqa.png' alt="SimpleQA results" width='500'>
|
|
|
|
|
## Training Data |
|
|
|
|
|
* **Primary dataset**: [`flashresearch/FlashResearch-DS-33k`](https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k) |
|
|
|
|
|
## Inference with Alibaba-NLP/DeepResearch (Recommended) |
|
|
|
|
|
This model is intended to be used **directly** with the DeepResearch repo. |
|
|
|
|
|
### 1) Install & set up |
|
|
|
|
|
```bash |
|
|
git clone https://github.com/Alibaba-NLP/DeepResearch |
|
|
cd DeepResearch |
|
|
# Create env (example) |
|
|
python -m venv .venv && source .venv/bin/activate |
|
|
pip install -e . # or pip install -r requirements.txt if provided |
|
|
``` |
|
|
|
|
|
### 2) Point DeepResearch to this model |
|
|
|
|
|
Edit the DeepResearch config so it points at this checkpoint, for example:
|
|
|
|
|
```bash |
|
|
MODEL_PATH=flashresearch/FlashResearch-4B-Thinking |
|
|
``` |
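If your DeepResearch setup talks to an OpenAI-compatible endpoint rather than loading the weights in-process (check the repo's README for which mode applies), the model can first be served with vLLM. This is a sketch; the flag values are illustrative and should be tuned for your GPU and context budget:

```shell
# Requires vLLM (pip install vllm). Exposes an OpenAI-compatible API
# at http://localhost:8000/v1 serving this 4B checkpoint.
vllm serve flashresearch/FlashResearch-4B-Thinking \
  --port 8000 \
  --max-model-len 32768
```

Any OpenAI-compatible client, including an agent stack configured with that base URL, can then query the model.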
|
|
|
|
|
### Hardware notes |
|
|
|
|
|
* **A single 12–16 GB GPU** is enough for the 4B model in FP16; FP8/INT4 quantization reduces VRAM requirements further. If you quantize, the summary model can also run locally.
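The FP16 figure above can be sanity-checked with a back-of-the-envelope weight-memory estimate. This is a sketch: the 20% overhead factor is an assumed allowance for activations and KV cache, not a measured number, and real usage grows with context length:

```python
# Rough VRAM estimate for serving a dense LLM:
# weight memory = parameter count x bytes per parameter,
# plus a fixed fractional overhead for activations / KV cache.

def estimate_vram_gb(n_params: float, bytes_per_param: float,
                     overhead: float = 0.2) -> float:
    """Approximate GiB needed to hold the weights plus runtime overhead."""
    weights_gib = n_params * bytes_per_param / 1024**3
    return round(weights_gib * (1 + overhead), 1)

# 4B parameters at common precisions (FP16 = 2 bytes, FP8 = 1, INT4 = 0.5)
for label, bpp in [("FP16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    print(f"{label}: ~{estimate_vram_gb(4e9, bpp)} GiB")
```

At FP16 this lands around 9 GiB, consistent with the 12–16 GB single-GPU note; INT4 brings the weights under 3 GiB.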
|
|
|
|
|
|
|
|
|
|
|
## Acknowledgements |
|
|
|
|
|
* Qwen team for the base 4B architecture |
|
|
* Alibaba-NLP for **DeepResearch** |
|
|
* CheapResearch contributors for the 33k dataset |
|
|
|
|
|
--- |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex |
|
|
@software{cheapresearch_thinking_2025, |
|
|
title = {CheapResearch 4B Thinking}, |
|
|
author = {Artem Y.}, |
|
|
year = {2025}, |
|
|
url = {https://huggingface.co/flashresearch/FlashResearch-4B-Thinking} |
|
|
} |
|
|
``` |
|
|
|
|
|
And the dataset: |
|
|
|
|
|
```bibtex |
|
|
@dataset{cheapresearch_ds_33k, |
|
|
title = {CheapResearch-DS-33k}, |
|
|
author = {Artem Y.}, |
|
|
year = {2025}, |
|
|
url = {https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k} |
|
|
} |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Changelog |
|
|
|
|
|
* **v1.0.0 (2025-10-04)** — First public release (33k distillation, DeepResearch-ready) |
|
|
|
|
|
|
|
|
|
|
|
### Model Card Metadata (Hugging Face) |
|
|
|
|
|
```yaml |
|
|
--- |
|
|
language: |
|
|
- en |
|
|
license: apache-2.0 |
|
|
library_name: transformers |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- qwen |
|
|
- deep-research |
|
|
- browsing |
|
|
- citation |
|
|
- reasoning |
|
|
- distillation |
|
|
- agent |
|
|
- vllm |
|
|
- cheapresearch |
|
|
datasets: |
|
|
- flashresearch/FlashResearch-DS-33k |
|
|
base_model: |
|
|
- Qwen/Qwen3-4B-Thinking-2507 |
|
|
model-index: |
|
|
- name: FlashResearch-4B-Thinking |
|
|
results: [] |
|
|
--- |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|