---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---

# DeepInnovator-14B

<p align="center">
  <a href="https://github.com/HKUDS/DeepInnovator">💻 Code</a> •
  <a href="https://arxiv.org/abs/2602.18920">📄 Paper</a> •
  <a href="https://huggingface.co/T1anyu/DeepInnovator">🤗 Model</a>
</p>

## Model Description

**DeepInnovator** is a large language model trained to possess genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to trigger the innovative capabilities of LLMs.

### Key Features

- 🚀 **Innovative Capability**: Trained specifically to generate novel research ideas
- 📚 **Knowledge-Grounded**: Leverages structured research knowledge extracted from vast scientific literature
- 🔄 **Iterative Refinement**: Employs a "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: Achieves 80.53%–93.81% win rates against untrained baselines

## Training Methodology

The training methodology comprises two core components:

### 1. "Standing on the Shoulders of Giants"
An automated pipeline that extracts structured research knowledge from a vast corpus of unlabeled scientific literature and organizes it for training.
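The extracted knowledge could be organized as structured per-paper records. A minimal sketch of one possible record layout; the field names here are illustrative assumptions, not the schema used in the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ResearchKnowledge:
    """Hypothetical structured record distilled from one paper (illustrative only)."""
    problem: str                                      # research problem addressed
    method: str                                       # core technical approach
    findings: list[str] = field(default_factory=list)     # key results
    limitations: list[str] = field(default_factory=list)  # open gaps that seed new ideas
```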

### 2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models research ideation as an iterative process: continuously predicting, evaluating, and refining plausible, novel next ideas.
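The predict–evaluate–refine loop described above can be sketched in a few lines. Here `propose`, `evaluate`, and `refine` are hypothetical stand-ins for the model's roles, not functions from the released code, and `threshold`/`max_rounds` are illustrative parameters:

```python
def next_idea_prediction(seed_idea, propose, evaluate, refine,
                         max_rounds=3, threshold=0.8):
    """Iteratively predict, judge, and revise candidate research ideas.

    propose/evaluate/refine are placeholders for model calls (hypothetical).
    """
    idea = seed_idea
    for _ in range(max_rounds):
        candidate = propose(idea)        # predict a plausible next idea
        score = evaluate(candidate)      # judge novelty and significance
        if score >= threshold:           # good enough: accept the idea
            return candidate
        idea = refine(candidate, score)  # fold the critique into the next round
    return idea
```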

## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"

messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

### Using vLLM for Faster Inference

```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
```
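Since the base model (Qwen2.5-14B-Instruct) expects ChatML-formatted prompts, results with vLLM are usually better if the raw prompt is wrapped in the chat template first. A minimal sketch of that wrapping, assuming the base model's standard ChatML layout; in practice, prefer `tokenizer.apply_chat_template` so the exact template comes from the tokenizer itself:

```python
def to_chatml(user_message: str,
              system_message: str = "You are a helpful assistant.") -> str:
    """Wrap a single-turn prompt in the ChatML layout used by Qwen2.5-Instruct.

    Sketch only: for production, load the tokenizer and call apply_chat_template.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

The resulting string can then be passed to `llm.generate` in place of the raw prompt above.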

## Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

| Comparison | Win Rate |
|------------|----------|
| vs. Untrained Baselines | 80.53% - 93.81% |
| vs. Leading LLMs | Comparable Performance |

## Citation

If you find DeepInnovator useful in your research, please cite our paper:

```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Links

- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)

## Acknowledgements

This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).