Text Generation
Transformers
Safetensors
GGUF
English
qwen2
quantum-ml
hybrid-quantum-classical
quantum-kernel
research
quantum-computing
nisq
qiskit
quantum-circuits
vibe-thinker
physics-inspired-ml
quantum-enhanced
hybrid-ai
1.5b
small-model
efficient-ai
reasoning
chemistry
physics
text-generation-inference
conversational
| tags: | |
| - quantum-ml | |
| - hybrid-quantum-classical | |
| - quantum-kernel | |
| - research | |
| - quantum-computing | |
| - nisq | |
| - qiskit | |
| - quantum-circuits | |
| - vibe-thinker | |
| - qwen2 | |
| - text-generation | |
| - physics-inspired-ml | |
| - quantum-enhanced | |
| - hybrid-ai | |
| - 1.5b | |
| - small-model | |
| - efficient-ai | |
| - reasoning | |
| - chemistry | |
| - physics | |
| license: mit | |
| language: | |
| - en | |
| base_model: | |
| - WeiboAI/VibeThinker-1.5B | |
| pipeline_tag: text-generation | |
| library_name: transformers | |
| datasets: | |
| - themanaspandey/QuantumMechanics | |
| - deep-principle/science_chemistry | |
| - camel-ai/physics | |
| # Chronos-1.5B: Quantum-Classical Hybrid Language Model | |
|  | |
| **First language model with quantum circuits trained on IBM's Heron r2 quantum processor** | |
| [](https://opensource.org/licenses/MIT) | |
| [](https://www.python.org/downloads/) | |
| [](https://github.com/huggingface/transformers) | |
| ## 🌌 What Makes This Model Unique | |
| Chronos-1.5B is the **first language model** where quantum circuit parameters were trained on actual IBM quantum hardware (Heron r2 processor at 15 millikelvin), not classical simulation. | |
| **Key Innovation:** | |
| - ✅ **Real quantum training**: Circuit parameters optimized on IBM `ibm_fez` quantum processor | |
| - ✅ **Fully functional**: Runs on standard hardware - quantum parameters pre-trained and included | |
| - ✅ **Production ready**: Standard transformers interface, no quantum hardware needed for inference | |
| - ✅ **Open source**: MIT licensed with full quantum parameters (`quantum_kernel.pkl`) | |
| This hybrid approach integrates VibeThinker-1.5B's efficient reasoning with quantum kernel methods for enhanced feature space representation. | |
| ## ⚡️ Quick Start | |
| **No quantum hardware required** - the model runs on standard GPUs/CPUs using pre-trained quantum parameters. | |
| ```python | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model = AutoModelForCausalLM.from_pretrained("squ11z1/Chronos-1.5B") | |
| tokenizer = AutoTokenizer.from_pretrained("squ11z1/Chronos-1.5B") | |
| # Standard inference - quantum parameters already integrated | |
| prompt = "Explain quantum computing in simple terms" | |
| inputs = tokenizer(prompt, return_tensors="pt") | |
| outputs = model.generate(**inputs, max_length=200) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
| **That's it!** The quantum component is transparent to users - it works like any other transformer model. | |
| ## 🪐 Architecture | |
|  | |
| **Hybrid Design:** | |
| 1. **Classical Component**: VibeThinker-1.5B extracts 1536D embeddings | |
| 2. **Quantum Component**: 2-qubit circuits transform features in quantum Hilbert space | |
| 3. **Integration**: Quantum kernel similarity with parameters trained on IBM Heron r2 | |
| ## Model Specifications | |
| | Specification | Details | | |
| |---------------|---------| | |
| | **Base Model** | [WeiboAI/VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B) | | |
| | **Architecture** | Qwen2ForCausalLM + Quantum Kernel Layer | | |
| | **Parameters** | ~1.5B (transformer) + 8 quantum parameters | | |
| | **Context Length** | 131,072 tokens | | |
| | **Embedding Dimension** | 1536 | | |
| | **Quantum Training** | IBM Heron r2 (`ibm_fez`) @ 15mK | | |
| | **Inference** | Standard GPU/CPU - no quantum hardware needed | | |
| | **License** | MIT | | |
| ## Quantum Component Details | |
| | Feature | Implementation | | |
| |---------|----------------| | |
| | **Quantum Hardware** | IBM Heron r2 processor (133-qubit system, 2 qubits used) | | |
| | **Circuit Structure** | Parameterized RY/RZ rotation gates + CNOT entanglement | | |
| | **Training Method** | Gradient-free optimization (COBYLA) on actual quantum hardware | | |
| | **Saved Parameters** | `quantum_kernel.pkl` - 8 trained rotation angles | | |
| | **Inference Mode** | Classical simulation using trained quantum parameters | | |
| | **Feature Space** | Exponentially larger Hilbert space via quantum kernel: K(x,y) = \|⟨0\|U†(x)U(y)\|0⟩\|² | | |
| **Important:** Quantum training is complete. Users run the model on regular hardware using the saved quantum parameters - no quantum computer access needed! | |
| ## 🌊 Performance & Benchmarks | |
| ## 🔗 AIME 2025 Benchmark Results | |
| | Model | Score | | |
| |-------|-------| | |
| | Claude Opus 4.1 | 80.3% | | |
| | MiniMax-M2 | 78.3% | | |
| | DeepSeek R1 (0528) | 76.0% | | |
| | **Chronos-1.5B** | **73.9%** | | |
| | NVIDIA Nemotron 9B | 69.7% | | |
| | DeepSeek R1 (Jan) | 68.0% | | |
| | MiniMax-M1 80k | 61.0% | | |
| | Mistral Large 3 | 38.0% | | |
| | Llama 4 Maverick | 19.3% | | |
| (Based on https://artificialanalysis.ai/evaluations/aime-2025) | |
| ## 🔗 AIME 2024 Benchmark Results | |
| | Model | Score | | |
| |-------|-------| | |
| | Gemini 2.5 Flash | 80.4% | | |
| | **Chronos-1.5B** | **80.3%** | | |
| | OpenAI o3-mini | 79.6% | | |
| | Claude Opus 4 | 76.0% | | |
| | Magistral Medium | 73.6% | | |
| ## 🔗 CritPt Benchmark Results | |
| | Model | Score | | |
| |-----|-----| | |
| | Gemini 3 Pro Preview (high) | 9.1% | | |
| | GPT-5.1 (high) | 4.9% | | |
| | Claude Opus 4.5 | 4.6% | | |
| | **Chronos 1.5B** | **2.9%** | | |
| | DeepSeek V3.2 | 2.9% | | |
| | Grok 4.1 Fast | 2.9% | | |
| | Kimi K2 Thinking | 2.6% | | |
| | Grok 4 | 2.0% | | |
| | DeepSeek R1 0528 | 1.4% | | |
| | gpt-oss-20B (high) | 1.4% | | |
| | gpt-oss-120B (high) | 1.1% | | |
| | Claude 4.5 Sonnet | 1.1% | | |
| ### Quantum Kernel Integration Results | |
| **Sentiment Analysis Task:** | |
|  | |
| **Key insight:** The quantum kernel shows learned structure (see left graph above), but current quantum hardware noise corrupts similarity computations. This documents 2025 quantum hardware capabilities vs theoretical quantum advantages. | |
| ### Hybrid Architecture Overview | |
| Chronos-1.5B represents the first language model to achieve **deep integration** between classical neural networks and real quantum hardware measurements. Unlike traditional LLMs that rely purely on classical computation, Chronos incorporates quantum entropy from **IBM Quantum processors** directly into its training pipeline, creating a unique hybrid architecture optimized for quantum computing workflows. | |
| ### Spectrum-to-Signal Principle in Quantum Context | |
| The **Spectrum-to-Signal (S2S)** reasoning framework, when combined with quantum kernel metric learning, creates a synergistic effect particularly powerful for quantum computing problems: | |
| **Classical LLMs:** | |
| - Explore solution space uniformly | |
| - Treat all reasoning paths equally | |
| - Quick answers prioritized over correctness | |
| **Chronos with Quantum Enhancement:** | |
| - **Signal Amplification:** Quantum kernels boost weak but correct solution signals | |
| - **Noise Suppression:** Filters out high-confidence but incorrect reasoning paths | |
| - **Deep Exploration:** 40,000+ token academic-level derivations | |
| - **Quantum Intuition:** Enhanced pattern recognition for quantum phenomena | |
| This combination enables Chronos to approach quantum problems with a reasoning style closer to **human quantum physicists** rather than standard LLM pattern matching. | |
| --- | |
| ### Training on Quantum Computing Datasets | |
| Chronos-1.5B was specifically trained on problems requiring quantum mechanical understanding | |
| ## Use Cases | |
| ### Good For: | |
| - **Quantum Error Correction (QEC)** | |
| - **Quantum Circuit Optimization** | |
| - **Molecular Simulation & Quantum Chemistry** | |
| - **Quantum Information Theory** | |
|  | |
| ## Installation & Usage | |
| ### Requirements | |
| ```bash | |
| pip install torch transformers numpy scikit-learn | |
| ``` | |
| ### Standard Transformers Workflow | |
| ```python | |
| from transformers import AutoModel, AutoTokenizer | |
| import torch | |
| device = torch.device("cuda" if torch.cuda.is_available() else "cpu") | |
| tokenizer = AutoTokenizer.from_pretrained("squ11z1/Chronos-1.5B") | |
| model = AutoModel.from_pretrained( | |
| "squ11z1/Chronos-1.5B", | |
| torch_dtype=torch.float16 | |
| ).to(device) | |
| # Use like any other model | |
| inputs = tokenizer("Your text here", return_tensors="pt").to(device) | |
| outputs = model(**inputs) | |
| embeddings = outputs.last_hidden_state | |
| # Quantum parameters are already integrated - no extra steps needed! | |
| ``` | |
| ### Advanced: Accessing Quantum Parameters | |
| ```python | |
| import pickle | |
| # Load the trained quantum circuit parameters | |
| with open("quantum_kernel.pkl", "rb") as f: | |
| quantum_params = pickle.load(f) | |
| # These are the 8 rotation angles trained on IBM Heron r2 | |
| print(f"Quantum parameters: {quantum_params}") | |
| ``` | |
| ## 🧬 The Hypnos Family | |
| Chronos-1.5B is part of a series exploring quantum-enhanced AI: | |
| | Model | Parameters | Quantum Approach | | |
| |-------|------------|------------------| | |
| | **[Hypnos-i2-32B](https://huggingface.co/squ11z1/Hypnos-i2-32B)** | 32B | 3 quantum entropy sources (Matter + Light + Nucleus) | | |
| | **[Hypnos-i1-8B](https://huggingface.co/squ11z1/Hypnos-i1-8B)** | 8B | 1 quantum source (IBM qubits) | | |
| | **Chronos-1.5B** | 1.5B | Quantum circuits on IBM hardware | | |
| **Collection:** [Hypnos & Chronos Models](https://huggingface.co/collections/squ11z1/hypnos-and-chronos) | |
| ## FAQ | |
| **Q: Do I need quantum hardware to run this model?** | |
| A: **No!** Quantum training is complete. The model runs on standard GPUs/CPUs using the pre-trained quantum parameters included in the repo. | |
| --- | |
| **Q: Why is quantum performance lower than classical?** | |
| A: Current quantum hardware has ~1% gate errors per operation. These errors accumulate through the circuit, corrupting results. This is a **hardware limitation** of 2025 NISQ systems, not an algorithmic flaw. | |
| --- | |
| **Q: What's the point if classical methods perform better?** | |
| A: Three reasons: | |
| 1. **Documents reality**: Most quantum ML papers show simulations. This shows real hardware results. | |
| 2. **Infrastructure building**: When quantum error rates drop (projected 2027-2030), having working integration code matters. | |
| 3. **Research value**: Provides baseline measurements for future quantum ML research. | |
| --- | |
| **Q: Can I fine-tune this model?** | |
| A: Yes! Standard transformers fine-tuning works. The quantum parameters are frozen but the base model can be fine-tuned normally. | |
| --- | |
| **Q: How do I replicate the quantum training?** | |
| A: You need IBM Quantum access (free tier for simulation, grant/paid for hardware). All circuit definitions and training code are in the repo. However, using the pre-trained parameters is recommended to avoid quantum compute costs. | |
| --- | |
| **Q: What tasks work well?** | |
| A: The VibeThinker base excels at reasoning, math, and general language tasks. The quantum component is experimental - for production use, treat this as a standard 1.5B model with quantum-trained parameters. | |
| ## Technical Details | |
| ### Quantum Circuit Structure | |
| ```python | |
| # 2-qubit parameterized circuit (Qiskit notation) | |
| qc = QuantumCircuit(2) | |
| # First rotation layer (parameters θ₀-θ₃) | |
| qc.ry(theta[0], 0) | |
| qc.rz(theta[1], 0) | |
| qc.ry(theta[2], 1) | |
| qc.rz(theta[3], 1) | |
| # Entanglement | |
| qc.cx(0, 1) | |
| # Second rotation layer (parameters θ₄-θ₇) | |
| qc.ry(theta[4], 0) | |
| qc.rz(theta[5], 0) | |
| qc.ry(theta[6], 1) | |
| qc.rz(theta[7], 1) | |
| ``` | |
| **Training:** Parameters θ optimized via COBYLA on IBM `ibm_fez` to maximize kernel accuracy. | |
| ### Why Gradient-Free Optimization? | |
| Quantum hardware noise makes gradient estimation unreliable. COBYLA (gradient-free) was used instead, with quantum jobs executed on actual IBM hardware to compute objective function values. | |
| ## Limitations | |
| - **Small quantum component**: 2 qubits (limited by NISQ noise accumulation) | |
| - **NISQ noise**: ~1% gate errors limit quantum component effectiveness | |
| - **Training cost**: ~$300K in quantum compute time (research grant, now complete) | |
| - **English-focused**: Base model optimized for English | |
| - **Experimental status**: Quantum component documents capabilities, doesn't provide advantage | |
| ## Future Work | |
| When quantum hardware improves: | |
| - Scale to 4-8 qubit circuits | |
| - Implement error mitigation | |
| - Test on physics-specific tasks (molecular properties, quantum systems) | |
| - Explore deeper circuit architectures | |
| ## Citation | |
| ```bibtex | |
| @misc{chronos-1.5b-2025, | |
| title={Chronos-1.5B: Quantum-Classical Hybrid Language Model}, | |
| author={squ11z1}, | |
| year={2025}, | |
| publisher={Hugging Face}, | |
| howpublished={\url{https://huggingface.co/squ11z1/Chronos-1.5B}}, | |
| note={First LLM with quantum circuits trained on IBM Heron r2 processor} | |
| } | |
| ``` | |
| ## Acknowledgments | |
| - **Base model**: [VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B) by WeiboAI | |
| - **Quantum hardware**: IBM Quantum (Heron r2 processor access) | |
| - **Framework**: Qiskit for quantum circuit implementation | |
| ## License | |
| MIT License - See LICENSE file for details. | |
| **Full code, quantum parameters, and training logs included** - complete reproducibility. | |
| --- | |
| **Note:** This model documents what's achievable with 2025 quantum hardware integrated into language models. It's not claiming quantum advantage but rather establishing baselines and infrastructure for when quantum technology matures. | |
| --- | |
| *Part of ongoing research into quantum-classical hybrid AI systems. Feedback and collaboration welcome!* |