---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---

# DeepInnovator-14B

<p align="center">
  <a href="https://github.com/HKUDS/DeepInnovator">💻 Code</a> •
  <a href="https://arxiv.org/abs/2602.18920">📄 Paper</a> •
  <a href="https://huggingface.co/T1anyu/DeepInnovator">🤗 Model</a>
</p>

## Model Description

**DeepInnovator** is a large language model trained to possess genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to trigger the innovative capabilities of LLMs.

### Key Features

- 🚀 **Innovative Capability**: Trained specifically to generate novel research ideas
- 📚 **Knowledge-Grounded**: Leverages structured research knowledge extracted from vast scientific literature
- 🔄 **Iterative Refinement**: Employs a "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: Achieves 80.53%–93.81% win rates against untrained baselines

## Training Methodology

The training methodology comprises two core components:

### 1. "Standing on the Shoulders of Giants"
An automated pipeline that extracts structured research knowledge from a vast corpus of unlabeled scientific literature and organizes it for training.
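The extracted knowledge could be organized as structured per-paper records. A minimal sketch of one possible record layout; the field names here are illustrative assumptions, not the schema used in the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ResearchKnowledge:
    """Hypothetical structured record distilled from one paper (illustrative only)."""
    problem: str                                      # research problem addressed
    method: str                                       # core technical approach
    findings: list[str] = field(default_factory=list)     # key results
    limitations: list[str] = field(default_factory=list)  # open gaps that seed new ideas
```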

### 2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models research ideation as an iterative process: continuously predicting, evaluating, and refining plausible, novel next ideas.
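The predict–evaluate–refine loop described above can be sketched in a few lines. Here `propose`, `evaluate`, and `refine` are hypothetical stand-ins for the model's roles, not functions from the released code, and `threshold`/`max_rounds` are illustrative parameters:

```python
def next_idea_prediction(seed_idea, propose, evaluate, refine,
                         max_rounds=3, threshold=0.8):
    """Iteratively predict, judge, and revise candidate research ideas.

    propose/evaluate/refine are placeholders for model calls (hypothetical).
    """
    idea = seed_idea
    for _ in range(max_rounds):
        candidate = propose(idea)        # predict a plausible next idea
        score = evaluate(candidate)      # judge novelty and significance
        if score >= threshold:           # good enough: accept the idea
            return candidate
        idea = refine(candidate, score)  # fold the critique into the next round
    return idea
```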

## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"

messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

### Using vLLM for Faster Inference

```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
```
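Since the base model (Qwen2.5-14B-Instruct) expects ChatML-formatted prompts, results with vLLM are usually better if the raw prompt is wrapped in the chat template first. A minimal sketch of that wrapping, assuming the base model's standard ChatML layout; in practice, prefer `tokenizer.apply_chat_template` so the exact template comes from the tokenizer itself:

```python
def to_chatml(user_message: str,
              system_message: str = "You are a helpful assistant.") -> str:
    """Wrap a single-turn prompt in the ChatML layout used by Qwen2.5-Instruct.

    Sketch only: for production, load the tokenizer and call apply_chat_template.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

The resulting string can then be passed to `llm.generate` in place of the raw prompt above.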

## Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

| Comparison | Win Rate |
|------------|----------|
| vs. Untrained Baselines | 80.53% - 93.81% |
| vs. Leading LLMs | Comparable Performance |

## Citation

If you find DeepInnovator useful in your research, please cite our paper:

```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Links

- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)

## Acknowledgements

This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).