---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---

# DeepInnovator-14B

<p align="center">
  <a href="https://github.com/HKUDS/DeepInnovator">💻 Code</a> •
  <a href="https://arxiv.org/abs/2602.18920">📄 Paper</a> •
  <a href="https://huggingface.co/T1anyu/DeepInnovator">🤗 Model</a>
</p>

## Model Description

**DeepInnovator** is a large language model trained for genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to elicit the innovative capability of LLMs.

### Key Features

- 🎯 **Innovative Capability**: trained specifically to generate novel research ideas
- 📚 **Knowledge-Grounded**: leverages structured research knowledge extracted from large-scale scientific literature
- 🔄 **Iterative Refinement**: employs the "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: achieves 80.53%-93.81% win rates against untrained baselines

## Training Methodology

DeepInnovator comprises two core components:

### 1. "Standing on the Shoulders of Giants"
An automated pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.

### 2. "Conjectures and Refutations"
A "Next Idea Prediction" training paradigm that models research ideation as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.

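The predict-evaluate-refine loop described above can be sketched as follows. This is a minimal illustration only: `refine_ideas`, `generate`, and `evaluate` are hypothetical names for this sketch, not part of the released model's API or the paper's training code.

```python
def refine_ideas(generate, evaluate, seed_context, steps=3):
    """Iterative 'Next Idea Prediction' sketch (hypothetical interface):
    repeatedly propose a next idea, score it, and keep the best so far."""
    history = []
    best_idea, best_score = None, float("-inf")
    context = seed_context
    for _ in range(steps):
        idea = generate(context)   # propose a plausible next idea from the context
        score = evaluate(idea)     # judge its novelty/significance
        history.append((idea, score))
        if score > best_score:
            best_idea, best_score = idea, score
        # Fold the evaluated idea back into the context for the next prediction.
        context = f"{context}\nPrevious idea (score {score:.2f}): {idea}"
    return best_idea, history
```

In practice, `generate` would be a call to the model and `evaluate` an automatic or expert judgment; the loop structure is the point here.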
## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"

messages = [
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

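For context on the `temperature` and `top_p` arguments above, nucleus (top-p) sampling can be sketched in plain Python. This is an illustrative re-implementation for intuition, not the code `generate` actually runs:

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.9, rng=None):
    """Sample a token index using temperature scaling and nucleus filtering."""
    rng = rng or random.Random(0)
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of highest-probability tokens whose mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept set and draw one token.
    kept_mass = sum(probs[i] for i in kept)
    r = rng.random() * kept_mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

Lowering `temperature` or `top_p` makes generations more focused; raising them increases diversity, which matters for idea generation.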
### Using vLLM for Faster Inference

```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

# Note: generate() takes raw prompt strings; no chat template is applied here.
prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)

print(outputs[0].outputs[0].text)
```

## Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

| Comparison | Win Rate |
|------------|----------|
| vs. Untrained Baselines | 80.53%-93.81% |
| vs. Leading LLMs | Comparable performance |

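The win rates above come from pairwise comparisons between model outputs. As a minimal sketch (illustrative only; the paper's exact evaluation protocol, including tie handling, is an assumption here), a win rate over head-to-head judgments might be computed as:

```python
def win_rate(judgments, ties_count_half=True):
    """Fraction of pairwise comparisons won.
    judgments: iterable of 'win' / 'loss' / 'tie' strings."""
    wins = losses = ties = 0
    for j in judgments:
        if j == "win":
            wins += 1
        elif j == "loss":
            losses += 1
        elif j == "tie":
            ties += 1
        else:
            raise ValueError(f"unknown judgment: {j}")
    total = wins + losses + ties
    if total == 0:
        raise ValueError("no judgments")
    # Optionally award half credit for ties, a common convention.
    score = wins + (0.5 * ties if ties_count_half else 0.0)
    return score / total
```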
## Citation

If you find DeepInnovator useful in your research, please cite our paper:

```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Links

- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)

## Acknowledgements

This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).