---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- research
- scientific-discovery
- idea-generation
- llm
- pytorch
base_model: Qwen/Qwen2.5-14B-Instruct
---

# DeepInnovator-14B

💻 Code · 📄 Paper · 🤗 Model

## Model Description

**DeepInnovator** is a Large Language Model trained to possess genuine innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built upon a systematic training paradigm designed to trigger the innovative capability of LLMs.

### Key Features

- 🚀 **Innovative Capability**: Trained specifically for generating novel research ideas
- 📚 **Knowledge-Grounded**: Leverages structured research knowledge extracted from vast scientific literature
- 🔄 **Iterative Refinement**: Employs a "Next Idea Prediction" paradigm for continuous idea improvement
- 🏆 **State-of-the-Art Performance**: Achieves 80.53%–93.81% win rates against untrained baselines

## Training Methodology

DeepInnovator comprises two core components:

### 1. "Standing on the Shoulders of Giants"

An automated data extraction pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.

### 2. "Conjectures and Refutations"

A "Next Idea Prediction" training paradigm that models the generation of research ideas as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.
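The predict–evaluate–refine loop behind "Next Idea Prediction" can be sketched in a few lines. This is an illustrative outline only: `propose_idea`, `evaluate_idea`, and `refine_idea` are hypothetical stand-ins for the model and its critic, not part of the released code.

```python
# Illustrative sketch of the "Next Idea Prediction" loop.
# All helper functions are hypothetical stand-ins, not the actual pipeline.

def propose_idea(context: str) -> str:
    # Stand-in for the model predicting a plausible next idea.
    return f"idea grounded in: {context[:40]}"

def evaluate_idea(idea: str) -> float:
    # Stand-in for scoring novelty/significance (e.g., an LLM judge).
    return min(1.0, 0.3 + 0.2 * idea.count("refined"))

def refine_idea(idea: str) -> str:
    # Stand-in for revising the idea based on critique.
    return "refined " + idea

def next_idea_prediction(context: str, threshold: float = 0.8, max_rounds: int = 5) -> str:
    """Predict an idea, then evaluate and refine it until it clears the bar."""
    idea = propose_idea(context)
    for _ in range(max_rounds):
        if evaluate_idea(idea) >= threshold:
            break
        idea = refine_idea(idea)
    return idea

print(next_idea_prediction("graph neural networks + large language models"))
```

In the real system, the propose/refine steps are what the model is trained on, so the loop becomes a single model iterating on its own drafts rather than three separate components.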
## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
messages = [{"role": "user", "content": prompt}]

# Format the conversation with the model's chat template before tokenizing.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

### Using vLLM for Faster Inference

```python
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

Note that this vLLM example passes the raw prompt string; since the model is chat-tuned, you will generally get better results by applying the chat template to the prompt first, as in the Quick Start example.

## Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

| Comparison              | Win Rate                |
|-------------------------|-------------------------|
| vs. Untrained Baselines | 80.53% – 93.81%         |
| vs. Leading LLMs        | Comparable Performance  |

## Citation

If you find DeepInnovator useful in your research, please cite our paper:

```bibtex
@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Links

- **GitHub Repository**: [https://github.com/HKUDS/DeepInnovator](https://github.com/HKUDS/DeepInnovator)
- **Hugging Face Model**: [https://huggingface.co/T1anyu/DeepInnovator](https://huggingface.co/T1anyu/DeepInnovator)

## Acknowledgements

This work is developed by the [HKU Data Science Lab (HKUDS)](https://github.com/HKUDS).