---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---
# DeepInnovator-14B
<p align="center">
<a href="https://github.com/HKUDS/DeepInnovator">💻 Code</a> •
<a href="https://arxiv.org/abs/2602.18920">📄 Paper</a> •
<a href="https://huggingface.co/T1anyu/DeepInnovator">🤗 Model</a>
</p>
## Model Description
**DeepInnovator** is a Large Language Model trained to possess genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built upon a systematic training paradigm designed to trigger the innovative capability of LLMs.
### Key Features
- 🚀 **Innovative Capability**: Trained specifically for generating novel research ideas
- 📚 **Knowledge-Grounded**: Leverages structured research knowledge extracted from vast scientific literature
- 🔄 **Iterative Refinement**: Employs the "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: Achieves 80.53%–93.81% win rates against untrained baselines
## Training Methodology
DeepInnovator comprises two core components:
### 1. "Standing on the Shoulders of Giants"
An automated data extraction pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.
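To make the idea concrete, here is a minimal sketch of how such a pipeline might prompt an LLM annotator to turn an unlabeled abstract into a structured record. The `ResearchRecord` schema and `build_extraction_prompt` helper are hypothetical illustrations; the paper's actual pipeline and schema may differ.

```python
from dataclasses import dataclass


# Hypothetical schema for one structured knowledge record extracted
# from a paper; the real DeepInnovator pipeline may use different fields.
@dataclass
class ResearchRecord:
    title: str
    problem: str
    method: str
    findings: str


def build_extraction_prompt(abstract: str) -> str:
    """Format an unlabeled abstract into an extraction prompt for an
    LLM annotator, asking for the fields of ResearchRecord."""
    return (
        "Extract the research problem, method, and key findings from the "
        "abstract below. Answer in three labeled lines.\n\n"
        f"Abstract: {abstract}"
    )


prompt = build_extraction_prompt(
    "We study graph neural networks for molecular property prediction ..."
)
print(prompt)
```

Running the annotator over a large corpus and parsing its answers back into `ResearchRecord` objects would yield the structured knowledge base the training stage builds on.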
### 2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models the generation of research ideas as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.
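The loop above can be sketched as a serialization step: each predict-evaluate-refine round contributes an idea and its critique, and the model is trained to predict the next, refined idea given the trajectory so far. The exact data format is not specified in this card, so `build_next_idea_example` below is only an illustrative assumption.

```python
def build_next_idea_example(context: str, trajectory: list[dict]) -> str:
    """Serialize an iterative predict-evaluate-refine trajectory into a
    single training sequence. Each step holds a proposed idea and its
    critique; the target is the next, refined idea after the final prompt.
    This format is a hypothetical sketch, not DeepInnovator's actual one."""
    parts = [f"Context: {context}"]
    for step in trajectory:
        parts.append(f"Idea: {step['idea']}")
        parts.append(f"Critique: {step['critique']}")
    parts.append("Next idea:")  # the model learns to continue from here
    return "\n".join(parts)


example = build_next_idea_example(
    "Combining graph neural networks with large language models",
    [
        {
            "idea": "Use an LLM to label graph nodes before GNN training.",
            "critique": "Incremental; labeling alone adds little novelty.",
        }
    ],
)
print(example)
```

Under this framing, ordinary next-token prediction on such sequences teaches the model to condition each new idea on prior attempts and their evaluations.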
## Usage
### Installation
```bash
pip install transformers torch
```
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" spreads the model across available devices;
# torch_dtype="auto" loads the weights in their native precision.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
messages = [{"role": "user", "content": prompt}]

# Apply the chat template so the prompt matches the instruct-tuned format.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
### Using vLLM for Faster Inference
```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
# Use the chat interface so vLLM applies the model's chat template,
# matching the instruct-tuned prompt format.
messages = [{"role": "user", "content": prompt}]
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```
## Evaluation Results
Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:
| Comparison | Win Rate |
|------------|----------|
| vs. Untrained Baselines | 80.53%–93.81% |
| vs. Leading LLMs | Comparable Performance |
## Citation
If you find DeepInnovator useful in your research, please cite our paper:
```bibtex
@article{fan2026deepinnovator,
title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
journal={arXiv preprint arXiv:2602.18920},
year={2026}
}
```
## License
This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
## Links
- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)
## Acknowledgements
This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).