Instructions to use IQuestLab/HOTE-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use IQuestLab/HOTE-8B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="IQuestLab/HOTE-8B")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("IQuestLab/HOTE-8B", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use IQuestLab/HOTE-8B with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "IQuestLab/HOTE-8B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IQuestLab/HOTE-8B",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/IQuestLab/HOTE-8B
```
- SGLang
How to use IQuestLab/HOTE-8B with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "IQuestLab/HOTE-8B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IQuestLab/HOTE-8B",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "IQuestLab/HOTE-8B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IQuestLab/HOTE-8B",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use IQuestLab/HOTE-8B with Docker Model Runner:
```shell
docker model run hf.co/IQuestLab/HOTE-8B
```
```yaml
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-8B
datasets:
- rl-research/dr-tulu-sft-data
- rl-research/dr-tulu-rl-data
library_name: transformers
pipeline_tag: text-generation
tags:
- deep-research
- agent
- reinforcement-learning
- tool-use
- open-ended-evolution
- qwen3
model-index:
- name: HOTE-8B
  results:
  - task:
      type: text-generation
      name: Long-form deep research
    dataset:
      name: HealthBench
      type: HealthBench
    metrics:
    - type: score
      value: 54.4
      name: HealthBench score
  - task:
      type: text-generation
      name: Long-form deep research
    dataset:
      name: ResearchQA
      type: ResearchQA
    metrics:
    - type: score
      value: 76.9
      name: ResearchQA score
  - task:
      type: text-generation
      name: Long-form deep research
    dataset:
      name: DeepResearchBench
      type: DeepResearchBench
    metrics:
    - type: score
      value: 45.9
      name: DeepResearchBench score
```
# HOTE-8B

HOTE-8B is an 8B-parameter deep research model trained with **Hybrid Open-Ended Tri-Evolution (HOTE)**, a reinforcement-learning framework for open-ended research agents. The model is introduced in [Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher](https://arxiv.org/abs/2606.13710) (arXiv:2606.13710v2, 2026-06-15).

HOTE trains a deep research system through the co-evolution of three roles:

- **Solver**: plans, searches, integrates retrieved evidence, and writes long-form research reports with citations.
- **Judge**: generates and updates rubrics, evaluates multiple solver responses, and provides rewards beyond deterministic-answer tasks.
- **Proposer**: searches for weaknesses identified by the judge and proposes challenging but learnable research tasks.

The framework uses a dual-mode strategy with both tool-use and no-tool training. According to the paper, this improves training efficiency while allowing the tool-use and no-tool modes to benefit each other.
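
The way the three roles feed each other can be sketched as a single round of data flow. This is an illustrative toy under stated assumptions, not the paper's implementation: `evolution_round` and the `propose`, `solve`, and `judge` callables are hypothetical placeholders, and the policy-update step that a real trainer would run on the rewards is omitted.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces for the three roles described above.
Proposer = Callable[[List[str]], str]  # known weaknesses -> new research task
Solver = Callable[[str], str]          # task -> one long-form response
Judge = Callable[[str, List[str]], Tuple[List[float], str]]
# (task, responses) -> (one reward per response, newly observed weakness)


def evolution_round(
    propose: Proposer,
    solve: Solver,
    judge: Judge,
    weaknesses: List[str],
    samples: int = 4,
) -> Dict[str, object]:
    """Run one proposer -> solver -> judge pass and return what was collected."""
    # The proposer targets weaknesses that the judge reported earlier.
    task = propose(weaknesses)
    # The solver is sampled several times so the judge can compare responses.
    responses = [solve(task) for _ in range(samples)]
    # The judge scores every response and reports a weakness it noticed.
    rewards, weakness = judge(task, responses)
    # A training framework would update the roles from `rewards` here (omitted).
    return {
        "task": task,
        "responses": responses,
        "rewards": rewards,
        "weaknesses": weaknesses + [weakness],
    }
```

Calling `evolution_round` repeatedly while passing the returned `weaknesses` back in gives the basic shape of the loop; everything that makes the method work in practice (rubric generation, reward design, dual-mode training) lives inside the callables and is not modeled here.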
## Repository Contents

This repository contains the following checkpoint folders:

- `step_700/`: HOTE-8B deep research model checkpoint.
- `step_700_query/`: proposer checkpoint used in the HOTE framework.
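
Because the weights live in subfolders, a specific checkpoint can likely be selected with the `subfolder` argument of `from_pretrained`. The sketch below assumes that each folder holds a complete Transformers checkpoint, including tokenizer files, which is not confirmed here; the role names and the helpers `checkpoint_kwargs` and `load_checkpoint` are hypothetical.

```python
REPO_ID = "IQuestLab/HOTE-8B"

# Folder names come from the list above; the role labels are illustrative.
CHECKPOINTS = {
    "researcher": "step_700",
    "proposer": "step_700_query",
}


def checkpoint_kwargs(role: str) -> dict:
    """Build the from_pretrained arguments that point at one checkpoint folder."""
    if role not in CHECKPOINTS:
        raise ValueError(f"unknown role {role!r}; expected one of {sorted(CHECKPOINTS)}")
    return {
        "pretrained_model_name_or_path": REPO_ID,
        "subfolder": CHECKPOINTS[role],
    }


def load_checkpoint(role: str = "researcher"):
    """Download and load tokenizer and model for the chosen checkpoint."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = checkpoint_kwargs(role)
    # Assumption: tokenizer files are stored next to the weights in each folder.
    tokenizer = AutoTokenizer.from_pretrained(**kwargs)
    model = AutoModelForCausalLM.from_pretrained(**kwargs, dtype="auto")
    return tokenizer, model
```

`load_checkpoint("proposer")` would fetch the proposer weights instead. If a folder turns out to lack tokenizer files, loading the tokenizer from the base model `Qwen/Qwen3-8B` may be a workable fallback, though that is also an assumption.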
## Intended Use

HOTE-8B is intended for research on long-form deep research agents, search-augmented report generation, open-ended agent evolution, and reinforcement learning for non-verifiable tasks.

The model is most useful when integrated with a search-enabled agent runtime. In the paper, the solver operates with ReAct-style actions including thinking, tool calls, final answers, and citations. The model weights alone do not provide web search, browsing, paper search, citation validation, or tool execution.
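
To make the division of labor concrete, here is a minimal sketch of the kind of runtime loop that has to surround the weights. It is not the paper's runtime: the `<tool_call>` and `<answer>` tags, the JSON call format, and the `run_agent` function are hypothetical, and the actual action markup used by HOTE-8B should be taken from the paper or the model's chat template.

```python
import json
import re
from typing import Callable, Dict, Optional

# Hypothetical action markup; the tags HOTE-8B actually emits may differ.
TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_agent(
    generate: Callable[[str], str],
    tools: Dict[str, Callable[..., str]],
    question: str,
    max_steps: int = 8,
) -> Optional[str]:
    """Alternate model turns and tool observations until a final answer appears."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        # `generate` wraps the model: it maps the transcript so far to the next turn.
        output = generate("\n".join(transcript))
        transcript.append(output)

        answer = ANSWER.search(output)
        if answer:
            return answer.group(1).strip()

        call = TOOL_CALL.search(output)
        if call:
            try:
                request = json.loads(call.group(1))
                observation = tools[request["name"]](**request.get("arguments", {}))
            except (json.JSONDecodeError, KeyError, TypeError) as err:
                # Feed failures back so the model can retry with a corrected call.
                observation = f"tool error: {err!r}"
            transcript.append(f"Observation: {observation}")
    # No final answer within the step budget.
    return None
```

The `tools` mapping is where web search, paper search, or browsing would be plugged in; the loop itself only routes text between the model and those functions.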
## Limitations

- The model is designed for deep research workflows and should be paired with robust tool execution, citation validation, and source-quality checks.
- The model may generate inaccurate, incomplete, outdated, or unsupported claims, especially without retrieval tools.
- The paper notes that evolution slows as training progresses and that the upper bound may still be constrained by model scale.
- The HOTE method still relies on initial training data; fully data-free open-ended deep research evolution is left for future work.
- Research outputs in sensitive domains such as healthcare, law, finance, or public policy should be reviewed by qualified experts.
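
As one small example of the citation validation mentioned above, the sketch below checks that every citation marker in a report refers to a source the runtime actually retrieved. It assumes a numeric `[n]` citation style and a hypothetical `check_citations` helper; it does not verify that a source supports the claim it is attached to, which needs a separate check.

```python
import re
from typing import Dict, List

# Assumed citation style: numeric markers such as [3] that index retrieved sources.
CITATION = re.compile(r"\[(\d+)\]")


def check_citations(report: str, sources: Dict[int, str]) -> Dict[str, List[int]]:
    """Compare citation markers in a report against the retrieved source ids."""
    cited = {int(number) for number in CITATION.findall(report)}
    known = set(sources)
    return {
        # Markers that point at no retrieved source (possible fabrication).
        "dangling": sorted(cited - known),
        # Retrieved sources that the report never cites.
        "unused": sorted(known - cited),
    }
```

For instance, a report that cites `[1]` and `[3]` against retrieved sources `1` and `2` yields `{"dangling": [3], "unused": [2]}`, which flags `[3]` for review.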
## Citation

```bibtex
@misc{piao2026hybridopenendedtrievolutionmakes,
  title         = {Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher},
  author        = {Hongming Piao and Chi Liu and Mengzhuo Chen and Yan Shu and Xidong Wang and Derek Li and Ying Wei and Bryan Dai},
  year          = {2026},
  eprint        = {2606.13710},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI},
  url           = {https://arxiv.org/abs/2606.13710}
}
```