---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---

# DeepInnovator-14B

<p align="center">
  <a href="https://github.com/HKUDS/DeepInnovator">💻 Code</a> •
  <a href="https://arxiv.org/abs/2602.18920">📄 Paper</a> •
  <a href="https://huggingface.co/T1anyu/DeepInnovator">🤗 Model</a>
</p>

## Model Description

**DeepInnovator** is a large language model trained for genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to elicit the innovative capability of LLMs.

### Key Features

- 🎯 **Innovative Capability**: trained specifically to generate novel research ideas
- 📚 **Knowledge-Grounded**: leverages structured research knowledge extracted from large-scale scientific literature
- 🔄 **Iterative Refinement**: employs the "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: achieves 80.53%-93.81% win rates against untrained baselines

## Training Methodology

DeepInnovator comprises two core components:

### 1. "Standing on the Shoulders of Giants"
An automated pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.

### 2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models research ideation as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.

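The predict-evaluate-refine loop described above can be sketched as follows. This is a minimal illustration only: `refine_ideas`, `generate`, and `evaluate` are hypothetical names for this sketch, not part of the released model's API or the paper's training code.

```python
def refine_ideas(generate, evaluate, seed_context, steps=3):
    """Iterative 'Next Idea Prediction' sketch (hypothetical interface):
    repeatedly propose a next idea, score it, and keep the best so far."""
    history = []
    best_idea, best_score = None, float("-inf")
    context = seed_context
    for _ in range(steps):
        idea = generate(context)   # propose a plausible next idea from the context
        score = evaluate(idea)     # judge its novelty/significance
        history.append((idea, score))
        if score > best_score:
            best_idea, best_score = idea, score
        # Fold the evaluated idea back into the context for the next prediction.
        context = f"{context}\nPrevious idea (score {score:.2f}): {idea}"
    return best_idea, history
```

In practice, `generate` would be a call to the model and `evaluate` an automatic or expert judgment; the loop structure is the point here.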
## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"

messages = [
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

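For context on the `temperature` and `top_p` arguments above, nucleus (top-p) sampling can be sketched in plain Python. This is an illustrative re-implementation for intuition, not the code `generate` actually runs:

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.9, rng=None):
    """Sample a token index using temperature scaling and nucleus filtering."""
    rng = rng or random.Random(0)
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of highest-probability tokens whose mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept set and draw one token.
    kept_mass = sum(probs[i] for i in kept)
    r = rng.random() * kept_mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

Lowering `temperature` or `top_p` makes generations more focused; raising them increases diversity, which matters for idea generation.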
### Using vLLM for Faster Inference

```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

# Note: generate() takes raw prompt strings; no chat template is applied here.
prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
```

## Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

| Comparison | Win Rate |
|------------|----------|
| vs. Untrained Baselines | 80.53%-93.81% |
| vs. Leading LLMs | Comparable performance |

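The win rates above come from pairwise comparisons between model outputs. As a minimal sketch (illustrative only; the paper's exact evaluation protocol, including tie handling, is an assumption here), a win rate over head-to-head judgments might be computed as:

```python
def win_rate(judgments, ties_count_half=True):
    """Fraction of pairwise comparisons won.
    judgments: iterable of 'win' / 'loss' / 'tie' strings."""
    wins = losses = ties = 0
    for j in judgments:
        if j == "win":
            wins += 1
        elif j == "loss":
            losses += 1
        elif j == "tie":
            ties += 1
        else:
            raise ValueError(f"unknown judgment: {j}")
    total = wins + losses + ties
    if total == 0:
        raise ValueError("no judgments")
    # Optionally award half credit for ties, a common convention.
    score = wins + (0.5 * ties if ties_count_half else 0.0)
    return score / total
```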
## Citation

If you find DeepInnovator useful in your research, please cite our paper:

```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Links

- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)

## Acknowledgements

This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).