Instructions for using T1anyu/DeepInnovator with libraries and local apps.

Transformers

How to use T1anyu/DeepInnovator with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="T1anyu/DeepInnovator")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("T1anyu/DeepInnovator")
model = AutoModelForCausalLM.from_pretrained("T1anyu/DeepInnovator")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

vLLM

How to use T1anyu/DeepInnovator with vLLM:

Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "T1anyu/DeepInnovator"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "T1anyu/DeepInnovator",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```

SGLang

How to use T1anyu/DeepInnovator with SGLang:

Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "T1anyu/DeepInnovator" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "T1anyu/DeepInnovator",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```

Or use the official Docker image:

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "T1anyu/DeepInnovator" \
    --host 0.0.0.0 \
    --port 30000
```

Docker Model Runner

How to use T1anyu/DeepInnovator with Docker Model Runner:

```shell
docker model run hf.co/T1anyu/DeepInnovator
```
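The vLLM and SGLang servers above both expose the same OpenAI-compatible chat-completions API, so one request body works against either endpoint (only the port differs). A minimal sketch of building that body in Python, as an alternative to curl; the helper function here is illustrative, not part of any of these tools:

```python
import json

def build_chat_request(model: str, user_message: str,
                       temperature: float = 0.7, max_tokens: int = 1024) -> str:
    """Build a JSON body for the OpenAI-compatible /v1/chat/completions endpoint.

    The same payload works for vLLM (default port 8000) and SGLang (port 30000).
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_chat_request("T1anyu/DeepInnovator", "What is the capital of France?")
print(body)
```

To send the request without extra dependencies, `urllib.request.urlopen` with this body and a `Content-Type: application/json` header is enough; the `openai` Python client pointed at `base_url="http://localhost:8000/v1"` also works.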
DeepInnovator-14B
Model Description
DeepInnovator is a Large Language Model trained to possess genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to trigger the innovative capability of LLMs.
Key Features
- 🚀 Innovative Capability: Trained specifically for generating novel research ideas
- 📚 Knowledge-Grounded: Leverages structured research knowledge extracted from vast scientific literature
- 🔄 Iterative Refinement: Employs "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 State-of-the-Art Performance: Achieves 80.53%-93.81% win rates against untrained baselines
Training Methodology
DeepInnovator comprises two core components:
1. "Standing on the Shoulders of Giants"
An automated data extraction pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.
2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models the generation of research ideas as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.
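The predict-evaluate-refine loop described above can be sketched in the abstract: generate a candidate next idea conditioned on the current best one, evaluate it, and continue while the evaluation improves. A minimal toy sketch of that control flow; `propose` and `score` are hypothetical stand-ins, not the paper's actual components:

```python
def iterative_refinement(seed_idea, propose, score, max_rounds=5):
    """Predict-evaluate-refine loop: keep the best-scoring idea seen so far."""
    best_idea, best_score = seed_idea, score(seed_idea)
    for _ in range(max_rounds):
        candidate = propose(best_idea)       # predict a plausible next idea
        candidate_score = score(candidate)   # evaluate the conjecture
        if candidate_score <= best_score:    # refutation: no improvement, stop
            break
        best_idea, best_score = candidate, candidate_score
    return best_idea, best_score

# Toy stand-ins: ideas are strings, and the score simply rewards elaboration.
toy_propose = lambda idea: idea + " + refinement"
toy_score = lambda idea: len(idea)

idea, final_score = iterative_refinement(
    "use GNN structure priors in LLM idea generation", toy_propose, toy_score
)
print(idea)
```

In DeepInnovator itself, the model is trained on such trajectories so that proposing and refining the next idea becomes a learned capability rather than an external loop.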
Usage
Installation
```shell
pip install transformers torch
```
Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
Using vLLM for Faster Inference
```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```
Evaluation Results
Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:
| Comparison | Win Rate |
|---|---|
| vs. Untrained Baselines | 80.53% - 93.81% |
| vs. Leading LLMs | Comparable Performance |
Citation
If you find DeepInnovator useful in your research, please cite our paper:
```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```
License
This model is released under the Apache 2.0 License.
Links
- GitHub Repository: https://github.com/HKUDS/DeepInnovator
- Hugging Face Model: https://huggingface.co/T1anyu/DeepInnovator
Acknowledgements
This work is developed by the HKU Data Science Lab (HKUDS).