QuantumGPT-124M: Quantum Circuit Generation Model
QuantumGPT-124M is a GPT-2 architecture language model trained specifically for generating quantum circuits in OpenQASM 2.0 format from natural language descriptions.
Model Description
- Model Type: Causal Language Model (GPT-2 architecture)
- Parameters: 124 million
- Training Data: 8,129 quantum circuits across 92 categories (~373K tokens)
- Output Format: OpenQASM 2.0
- Specialty: Generates syntactically valid quantum circuits for 1-4 qubit systems
Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("merileijona/quantumgpt-124m")
tokenizer = AutoTokenizer.from_pretrained("merileijona/quantumgpt-124m")

prompt = "<|user|>Create a Bell state with two qubits<|end|>\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode with special tokens kept so the literal <|assistant|>/<|end|> markers survive,
# then slice out the assistant reply. Note: splitting the raw text on <|end|> alone
# would return the user prompt, since the prompt's own <|end|> comes first.
text = tokenizer.decode(outputs[0], skip_special_tokens=False)
qasm = text.split("<|assistant|>", 1)[1].split("<|end|>", 1)[0].strip()
print(qasm)
```
Expected Output:

```
OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];
h q[0];
cx q[0],q[1];
measure q -> c;
```
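Since generated circuits should be validated before use (see Limitations below), a lightweight structural check can catch obviously malformed output before handing it to a real toolchain. The sketch below is illustrative only, using stdlib regexes; the function name and patterns are our own, and a proper parser (e.g. Qiskit's QASM loader) should be preferred for real validation:

```python
import re

# Minimal structural sanity check for generated OpenQASM 2.0 text.
# Illustrative only -- not a full parser; use a real QASM toolchain for actual validation.
LINE_PATTERNS = [
    re.compile(r"^OPENQASM 2\.0;$"),
    re.compile(r'^include "[\w.]+";$'),
    re.compile(r"^[qc]reg \w+\[\d+\];$"),
    re.compile(r"^\w+(\([^()]*\))? \w+(\[\d+\])?(\s*,\s*\w+(\[\d+\])?)*;$"),  # gate applications
    re.compile(r"^measure \w+(\[\d+\])? -> \w+(\[\d+\])?;$"),
]

def looks_like_qasm(text: str) -> bool:
    """Return True if every non-empty line matches one known QASM 2.0 statement shape."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("OPENQASM 2.0"):
        return False
    return all(any(p.match(ln) for p in LINE_PATTERNS) for ln in lines)

bell = """OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];
h q[0];
cx q[0],q[1];
measure q -> c;"""
print(looks_like_qasm(bell))      # True
print(looks_like_qasm("h q[0"))   # False
```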
Model Details
Architecture
- Base Model: GPT-2 (124M parameters)
- Layers: 12
- Attention Heads: 12
- Embedding Dimension: 768
- Context Length: 256 tokens
- Dropout: 0.2 (training)
- Activation Function: GELU (standard, not gelu_new)
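The "124M" figure follows from the hyperparameters above; a back-of-envelope count (assuming the standard GPT-2 vocabulary of 50,257 tokens and tied embeddings, per the implementation notes below) lands at roughly 124M:

```python
# Back-of-envelope GPT-2 parameter count from the architecture above.
# Assumes the standard GPT-2 vocabulary (50257) and wte tied with lm_head.
vocab_size, n_ctx, d_model, n_layers = 50257, 256, 768, 12

token_embed = vocab_size * d_model          # wte (shared with lm_head)
pos_embed = n_ctx * d_model                 # wpe
# Per transformer block: fused QKV projection + output projection, 4x-wide MLP, biases included.
attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
layer_norms = 2 * 2 * d_model               # two LayerNorms per block, weight + bias each
per_block = attn + mlp + layer_norms
final_ln = 2 * d_model

total = token_embed + pos_embed + n_layers * per_block + final_ln
print(f"{total / 1e6:.1f}M parameters")     # 123.8M parameters
```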
Implementation Notes
- Converted from a NanoGPT-style training checkpoint
- All Conv1D weights correctly transposed for HuggingFace compatibility
- Bias tensors injected for GPT2LMHeadModel compatibility
- Word embeddings are tied with lm_head
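The Conv1D transposition mentioned above matters because nanoGPT's nn.Linear stores weights as (out_features, in_features) while Hugging Face's GPT-2 Conv1D modules expect (in_features, out_features). A toy sketch of the axis swap, using plain lists rather than tensors (the helper name is ours, purely illustrative):

```python
# Illustrative weight-layout fix: nanoGPT nn.Linear stores (out_features, in_features);
# Hugging Face GPT-2 Conv1D expects (in_features, out_features). Shown with nested
# lists instead of tensors to keep the sketch dependency-free.
def transpose(matrix):
    """Swap the two axes of a nested-list matrix."""
    return [list(row) for row in zip(*matrix)]

linear_weight = [[1, 2, 3],   # shape (out=2, in=3), as saved by a nanoGPT checkpoint
                 [4, 5, 6]]
conv1d_weight = transpose(linear_weight)  # shape (in=3, out=2), as Conv1D expects
print(conv1d_weight)  # [[1, 4], [2, 5], [3, 6]]
```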
Training Configuration
- Dataset Size: 8,129 training samples
- Unique Circuits: 739 (with 11x augmentation via paraphrasing)
- Training Tokens: ~373,000
- Training Steps: 1,000 iterations
- Hardware: NVIDIA RTX 4070 12GB
- Training Time: ~0.5 hours
- Final Validation Loss: 0.2691
- Train/Val Gap: 0.044 (excellent generalization)
Dataset Composition
The model was trained on 92 distinct categories of quantum circuits:
Single-Qubit Operations (14 categories):
- Basic gates: H, X, Y, Z, S, T, Sdg, Tdg
- Parametric rotations: RX, RY, RZ
- Universal gates: U1, U2, U3
Two-Qubit Operations (11 categories):
- Bell states (all 4 variants)
- Entanglement: CNOT, CZ, SWAP, iSWAP
- Controlled rotations
Three-Qubit Operations (6 categories):
- GHZ states, W states
- Toffoli, Fredkin gates
Quantum Algorithms (15 categories):
- Deutsch-Jozsa, Grover's search
- Phase estimation, QFT variants
Variational Circuits (15 categories):
- VQE ansatzes, QAOA
- Hardware-efficient ansatzes
Plus: Arithmetic circuits, error correction codes, special states, and more.
Prompt Format
The model was trained using explicit conversation delimiters:
```
<|user|>{description}<|end|>
<|assistant|>{qasm}<|end|>
```
These markers are literal text tokens, not special tokenizer tokens.
Generation should begin from the <|assistant|> marker and stop at the first occurrence of <|end|>. If <|assistant|> is omitted from the prompt, generation quality may degrade.
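The format above can be wrapped in two small helpers, one to build the prompt and one to slice the assistant reply out of raw generated text. These helpers are illustrative (not part of the model repository):

```python
# Illustrative helpers for the training-time prompt format described above.
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"

def build_prompt(description: str) -> str:
    """Wrap a natural-language description in the markers the model was trained on."""
    return f"{USER}{description}{END}\n{ASSISTANT}"

def extract_qasm(generated: str) -> str:
    """Take everything after <|assistant|> and stop at the first <|end|>."""
    reply = generated.split(ASSISTANT, 1)[-1]
    return reply.split(END, 1)[0].strip()

prompt = build_prompt("Create a Bell state with two qubits")
fake_output = prompt + "OPENQASM 2.0;\nqreg q[2];<|end|>trailing text"
print(extract_qasm(fake_output))
```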
Performance
Accuracy by circuit type, from preliminary testing (approximate figures):
| Circuit Type | Accuracy | Notes |
|---|---|---|
| Basic gates (H, X, Y, Z) | 95-100% | Near-perfect on simple gates |
| 2-qubit entanglement | 90-95% | Strong on Bell states, CNOT patterns |
| 3-qubit states (GHZ, W) | 85-90% | Good semantic understanding |
| Arithmetic circuits | 75-85% | Moderate accuracy on adders/incrementers |
| Complex algorithms | 70-80% | Struggles with QFT, Grover's |
| 4+ qubit circuits | 60-70% | Limited training data for large systems |
Example Test Results (Step 600)
Perfect generations:
- Prompt: "Apply Hadamard gate to single qubit"
  Output: OPENQASM 2.0; ... h q[0]; measure q[0] -> c[0];
- Prompt: "Create Bell state with two qubits"
  Output: OPENQASM 2.0; ... h q[0]; cx q[0],q[1]; measure q -> c;
- Prompt: "Generate GHZ state with three qubits"
  Output: OPENQASM 2.0; ... h q[0]; cx q[0],q[1]; cx q[0],q[2]; measure q -> c;
Limitations
Qubit Count: Optimized for 1-3 qubit circuits. Performance degrades for 4+ qubits due to limited training data.
Complex Algorithms: May generate syntactically valid but semantically incorrect circuits for advanced algorithms (e.g., full quantum teleportation, complex QFT implementations).
Parametric Gates: Limited support for gates with specific angle parameters. May substitute similar gates (e.g., RY → Y, S → T).
No Execution Guarantee: Generated circuits are syntactically valid QASM 2.0 but not guaranteed to execute correctly on quantum hardware without validation.
Intended Use
Primary Use Cases
- Educational Tools: Generate example circuits for quantum computing education
- Rapid Prototyping: Quick circuit templates for experimentation
- Code Completion: Assist developers writing QASM code
- Benchmarking: Generate diverse circuits for compiler/simulator testing
Out of Scope
- Production Quantum Computing: Not suitable for critical quantum applications
- Large-Scale Circuits: Limited to small qubit counts (1-4 qubits)
- Hardware Deployment: Requires validation before running on actual quantum hardware
Training Data
The model was trained on a custom dataset of 8,129 quantum circuits:
- Source: Synthetically generated via xAI Grok API with extensive quality control
- Format: Natural language description → QASM 2.0 code pairs
- Quality Control: 100% QASM syntax validation, SHA256 hash deduplication
- Diversity: 11x augmentation via paraphrasing (10 variations + original)
- Dataset Availability: merileijona/quantum-circuits-8k
Data Generation Pipeline
- Master Generation: 739 unique circuits across 92 categories
- Hash Deduplication: SHA256 hashing ensures zero duplicates
- Description Augmentation: 10 paraphrased variations per circuit (11x total including the original)
- Validation: 100% QASM 2.0 syntax compliance
- Train/Val/Test Split: 70/15/15
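The hash-deduplication step above can be sketched with the stdlib hashlib module. This is an illustrative reconstruction, not the project's actual pipeline code: keep a circuit only if the SHA-256 digest of its normalized QASM text has not been seen before.

```python
import hashlib

# Illustrative sketch of the SHA-256 deduplication step: a circuit is kept only
# if the digest of its stripped QASM text has not been seen before.
def dedupe(circuits):
    seen, unique = set(), []
    for qasm in circuits:
        digest = hashlib.sha256(qasm.strip().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(qasm)
    return unique

circuits = ["h q[0];", "x q[0];", "h q[0];"]  # one exact duplicate
print(len(dedupe(circuits)))  # 2
```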
Ethical Considerations
- No Safety Alignment: Model has not undergone safety fine-tuning
- Hallucination Risk: May generate plausible but incorrect quantum circuits
- Educational Purpose: Designed for learning, not production deployment
- Verification Required: Always validate generated circuits before use
Citation
If you use this model in your research, please cite:
```bibtex
@misc{quantumgpt124m,
  author       = {Merilehto, Juhani},
  title        = {QuantumGPT-124M: Quantum Circuit Generation with GPT-2},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/merileijona/quantumgpt-124m}},
  note         = {GPT-2 model trained on 8,129 quantum circuits for OpenQASM 2.0 generation}
}
```
Model Card Authors
Juhani Merilehto
- HuggingFace: @merileijona
- GitHub: @juhanimerilehto
- Affiliation(s): University of Vaasa, School of Management; University of Turku, Faculty of Technology
License
This model is released under the MIT License.
Acknowledgments
- Training Framework: Based on Andrej Karpathy's nanoGPT architecture
- Data Generation: Powered by xAI Grok API
- Tokenizer: Standard GPT-2 tokenizer (HuggingFace GPT2TokenizerFast)
- Infrastructure: Trained on NVIDIA RTX 4070 12GB
Additional Resources
- Dataset: merileijona/quantum-circuits-8k
- Training Code: Available in model repository
- Related Work: See papers on quantum circuit synthesis with LLMs
Model Version: 1.0
Release Date: February 2026
Last Updated: February 27, 2026