QuantumGPT-124M-v2: Quantum Circuit Generation Model

QuantumGPT-124M-v2 is a GPT-2 architecture language model trained from scratch on quantum circuit description → OpenQASM 2.0 pairs. It is the second model in the QuantumGPT scaling series, trained on the expanded quantum-circuits-21k dataset (21,208 samples vs 8,129 in v1).

Compared to QuantumGPT-124M-v1, this model achieves pass@1 syntax validity of 95.8% (up from 67.2%) and pass@5 syntax validity of 100%, as measured on the QuantumGPT Benchmark v1.0. The improvement is statistically significant (Fisher exact test, p=0.0016).


Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("merileijona/quantumgpt-124m-v2")
tokenizer = AutoTokenizer.from_pretrained("merileijona/quantumgpt-124m-v2")

prompt = "<|user|>Create a Bell state with two qubits<|end|>\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)

text = tokenizer.decode(outputs[0], skip_special_tokens=False)
response = text[len(prompt):]  # drop the echoed prompt
if "<|end|>" in response:     # the model emits a literal end marker; cut there
    response = response[:response.index("<|end|>")]
print(response.strip())

Expected Output:

OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];
h q[0];
cx q[0],q[1];
measure q -> c;
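
The expected circuit can be sanity-checked without any quantum SDK. Below is a minimal pure-Python statevector sketch (the little-endian qubit ordering is a convention chosen here, not something the model card specifies) that applies the same h and cx gates and recovers the Bell-state amplitudes:

```python
from math import sqrt

def apply_h(state, q):
    """Apply a Hadamard to qubit q (little-endian bit order)."""
    out = state[:]
    for i in range(len(state)):
        if not (i >> q) & 1:          # visit each |...0_q...>, |...1_q...> pair once
            j = i | (1 << q)
            a, b = state[i], state[j]
            out[i] = (a + b) / sqrt(2)
            out[j] = (a - b) / sqrt(2)
    return out

def apply_cx(state, control, target):
    """Apply CNOT: flip the target bit wherever the control bit is 1."""
    out = state[:]
    for i in range(len(state)):
        if (i >> control) & 1:
            out[i] = state[i ^ (1 << target)]
    return out

state = [1.0, 0.0, 0.0, 0.0]   # |00>
state = apply_h(state, 0)      # h q[0]
state = apply_cx(state, 0, 1)  # cx q[0],q[1]
print(state)                   # amplitudes 1/sqrt(2) on |00> and |11>
```

The result places equal amplitude on |00> and |11> and zero elsewhere, which is exactly the Bell state the prompt asks for.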

Model Details

Architecture

| Parameter | Value |
|---|---|
| Base architecture | GPT-2 |
| Parameters | 123.8M |
| Layers | 12 |
| Attention heads | 12 |
| Embedding dimension | 768 |
| Context length | 256 tokens |
| Dropout (training) | 0.2 |
| Activation function | GELU (standard) |

Implementation Notes

  • Converted from NanoGPT-style training checkpoint
  • All Conv1D weights correctly transposed for HuggingFace compatibility
  • Bias tensors injected as zeros (bias-free architecture → HF GPT2LMHeadModel compatibility)
  • Word embeddings tied with lm_head (tie_word_embeddings: true)
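
As a sanity check, the reported parameter count can be tallied from the architecture table, assuming the bias-free layout and tied embeddings described above (exact totals depend on which bias/LayerNorm conventions are counted, so treat this as an estimate):

```python
# Dimensions from the architecture table above.
vocab, ctx, d, layers = 50257, 256, 768, 12

wte = vocab * d                        # token embeddings (tied with lm_head)
wpe = ctx * d                          # position embeddings (256-token context)
per_layer = (d * 3 * d                 # attention qkv projection
             + d * d                   # attention output projection
             + d * 4 * d + 4 * d * d   # MLP up + down projections
             + 2 * d)                  # two LayerNorm weight vectors (no biases)
total = wte + wpe + layers * per_layer + d   # + final LayerNorm

print(f"{total / 1e6:.1f}M")   # ~123.7M, within rounding of the reported 123.8M
```

Note the shortened 256-token context keeps the model slightly under the usual 124.4M of a stock GPT-2 small, since wpe holds 256 rather than 1024 position vectors.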

Training Configuration

| Parameter | Value |
|---|---|
| Training dataset | quantum-circuits-21k |
| Training samples | 21,208 |
| Unique base circuits | 1,928 |
| Estimated training tokens | ~1.75M |
| Training steps | 2,000 (best checkpoint at step 1700) |
| Learning rate | 3×10⁻⁴ (cosine decay) |
| Effective batch size | 64 sequences (16,384 tokens/step) |
| Hardware | NVIDIA RTX 4070 12GB |
| Best validation loss | 0.2502 |
| v1 validation loss | 0.2691 |
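
The throughput figures are self-consistent; a quick arithmetic check (the epoch count is an estimate derived here, not a reported number):

```python
seqs, ctx = 64, 256                      # effective batch size x context length
tokens_per_step = seqs * ctx             # 16,384 tokens/step, as reported
total_tokens = 2000 * tokens_per_step    # ~32.8M tokens seen over 2,000 steps
epochs = total_tokens / 1_750_000        # passes over the ~1.75M-token dataset
print(tokens_per_step, round(epochs, 1))
```

So the best checkpoint at step 1700 corresponds to roughly 19 passes over the training set, consistent with heavy data reuse on a small corpus.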

Benchmark Results

Evaluated on QuantumGPT Benchmark v1.0: 100 prompts (50 in-distribution, 50 out-of-distribution), 3 difficulty tiers, k=5 samples per prompt, seed=42.

Overall Performance

| Metric | QuantumGPT-124M-v1 | QuantumGPT-124M-v2 | Δ |
|---|---|---|---|
| Validation loss | 0.2691 | 0.2502 | −0.019 |
| Syntax valid pass@1 | 67.2% | 95.8% | +28.6pp |
| Syntax valid pass@3 | 87.2% | 99.9% | +12.7pp |
| Syntax valid pass@5 | 91.0% | 100.0% | +9.0pp |
| Semantic valid pass@1 | 23.4% | 46.2% | +22.8pp |
| Semantic valid pass@5 | 48.0% | 61.0% | +13.0pp |

Overall syntax improvement is statistically significant (Fisher exact, p=0.0016). Benchmark prompt suite hash: ee2da8a57e683af2464eb7a4eada0898.
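
With k=5 samples per prompt, fractional pass@k values like 99.9% suggest the standard unbiased estimator averaged over prompts (whether the benchmark harness uses exactly this formula is an assumption). Per prompt, with n samples of which c are valid:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one prompt: n samples drawn, c of them valid."""
    if n - c < k:   # fewer failures than k: any k-subset contains a valid sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. a prompt where 3 of the 5 samples were syntactically valid:
print(pass_at_k(5, 3, 1))  # 0.6
print(pass_at_k(5, 3, 5))  # 1.0
```

Averaging these per-prompt estimates over the 100 benchmark prompts yields the aggregate pass@k percentages in the table.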

By Category (syntax valid, pass@5)

| Category | v1 | v2 |
|---|---|---|
| Algorithm | 7/10 | 10/10 |
| Arithmetic | 10/10 | 10/10 |
| Error correction | 10/10 | 10/10 |
| Measurement | 10/10 | 10/10 |
| Multi-qubit complex | 9/10 | 10/10 |
| Single gate | 10/10 | 10/10 |
| State preparation | 8/10 | 10/10 |
| Three-qubit | 9/10 | 10/10 |
| Two-qubit basic | 9/10 | 10/10 |
| Variational | 9/10 | 10/10 |

By Difficulty Tier

| Tier | n | v1 | v2 |
|---|---|---|---|
| Easy | 18 | 94.4% | 100.0% |
| Medium | 43 | 93.0% | 100.0% |
| Hard | 39 | 87.2% | 100.0% (p=0.027) |
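
The hard-tier p-value is reproducible with a stdlib-only one-sided Fisher exact test, assuming the percentages above are per-prompt pass@5 counts (34/39 valid for v1 vs 39/39 for v2; that reading is an inference from 87.2% of n=39, not something the card states):

```python
from math import comb

def fisher_exact_less(a, b, c, d):
    """One-sided Fisher exact p-value, P(group-1 successes <= a),
    for the 2x2 table [[a, b], [c, d]] = [[succ1, fail1], [succ2, fail2]]."""
    n = a + b + c + d
    row1 = a + b   # group-1 size
    col1 = a + c   # total successes
    lo = max(0, row1 + col1 - n)
    denom = comb(n, row1)
    return sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(lo, a + 1)) / denom

# Hard tier, read as per-prompt pass@5 counts: v1 34/39, v2 39/39.
p = fisher_exact_less(34, 5, 39, 0)
print(round(p, 4))  # 0.0273
```

Under that reading the computed p ≈ 0.027 matches the table.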

Semantic Gap

Both models show a gap between syntactic and semantic validity: circuits that parse correctly but implement a trivial or incorrect unitary. This remains the primary open challenge at 124M scale.

| Model | Syntax valid | Semantic valid | Gap |
|---|---|---|---|
| v1 | 91% | 48% | 43pp |
| v2 | 100% | 61% | 39pp |

Prompt Format

The model was trained using explicit conversation delimiters:

<|user|>{natural language description}<|end|>
<|assistant|>{OpenQASM 2.0 circuit}<|end|>

These markers are literal text tokens, not special tokenizer tokens. Always include the full prefix including <|assistant|> and stop generation at the first <|end|>.
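
The formatting and stop-marker handling can be wrapped in two small helpers (hypothetical names, mirroring the Quick Start post-processing):

```python
def build_prompt(description: str) -> str:
    """Wrap a natural-language description in the trained delimiters.

    The markers are literal text, not special tokenizer tokens.
    """
    return f"<|user|>{description}<|end|>\n<|assistant|>"

def extract_circuit(generated: str, prompt: str) -> str:
    """Drop the echoed prompt and truncate at the first <|end|> marker."""
    response = generated[len(prompt):]
    end = response.find("<|end|>")
    return (response[:end] if end != -1 else response).strip()

prompt = build_prompt("Create a Bell state with two qubits")
fake = prompt + 'OPENQASM 2.0;\ninclude "qelib1.inc";<|end|>trailing junk'
print(extract_circuit(fake, prompt))
```

Because the markers are plain text, omitting the `<|assistant|>` prefix or the truncation step will produce malformed or run-on generations.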


Limitations

  1. Semantic correctness: 39pp gap between syntactic and semantic validity; circuits may parse correctly but implement incorrect unitaries. Always simulate before use.
  2. Context length: 256-token context limits very deep circuits.
  3. Synthetic training data: All training circuits generated by LLM (xAI Grok), not from real quantum programs.
  4. OOD generalisation: Improvement over v1 concentrates in in-distribution prompts; out-of-distribution generalisation remains a challenge.
  5. No hardware validation: Generated circuits require transpilation and validation before execution on real quantum hardware.

Intended Use

✅ Educational tools and quantum computing demonstrations
✅ Rapid circuit prototyping and exploration
✅ QASM code completion assistance
✅ Benchmarking quantum compilers and simulators
✅ Research baseline for quantum circuit generation

❌ Production quantum computing workflows
❌ Hardware deployment without independent validation
❌ Safety-critical quantum applications


Scaling Series

| Model | Dataset | Samples | pass@1 syntax | pass@5 syntax | Val loss |
|---|---|---|---|---|---|
| QuantumGPT-124M-v1 | quantum-circuits-8k | 8,129 | 67.2% | 91.0% | 0.2691 |
| QuantumGPT-124M-v2 (this model) | quantum-circuits-21k | 21,208 | 95.8% | 100.0% | 0.2502 |
| QuantumGPT-354M | quantum-circuits-21k | 21,208 | 92.2% | 99.0% | 0.2677 |

Citation

@misc{quantumgpt124mv2,
  author    = {Merilehto, Juhani},
  title     = {QuantumGPT-124M-v2: Data Scaling Study for Quantum Circuit Generation},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/merileijona/quantumgpt-124m-v2},
  note      = {GPT-2 124M trained on quantum-circuits-21k (21,208 samples).
               pass@1 syntax 95.8\%, pass@5 100.0\% on QuantumGPT Benchmark v1.0}
}

Model Card Authors

Juhani Merilehto

  • HuggingFace: @merileijona
  • GitHub: @juhanimerilehto
  • Affiliation: University of Vaasa, School of Management; University of Turku, Faculty of Technology

License

MIT License

Acknowledgments

  • Training framework: Andrej Karpathy's nanoGPT / nanochat architecture
  • Data generation: xAI Grok API (grok-4-2)
  • Tokenizer: Standard GPT-2 BPE (HuggingFace GPT2TokenizerFast)
  • Validation: Qiskit OpenQASM 2.0 parser
  • Hardware: NVIDIA RTX 4070 12GB / AMD Ryzen 9 5950X / 128GB RAM

Model Version: 2.0
Release Date: March 2026
