# Pokemon Red Strategic Commander (Qwen3-Coder-Next 80B Merged)
An AI-powered strategic brain for Pokemon Red, fine-tuned from Qwen3-Coder-Next (80B total / 3B active MoE) using QLoRA — full-precision merged weights.
This is the full merged model (BFloat16 safetensors). For a quantized version, see the GGUF / 4B variant.
## Model Description
This model is a QLoRA fine-tune of Qwen/Qwen3-Coder-Next with LoRA adapters merged back into the full-precision weights. It provides expert-level Pokemon Red gameplay guidance — analyzing game state and providing actionable strategic recommendations.
Rather than playing the game directly, it acts as an expert advisory system for Gen 1 Pokemon battles, team building, route planning, and overall strategy.
## Architecture
| Parameter | Value |
|---|---|
| Architecture | Qwen3-Coder-Next (Hybrid MoE) |
| Total Parameters | 80B |
| Active Parameters | 3B (MoE routing) |
| Hidden Dimension | 2048 |
| Layers | 48 (Hybrid: Gated DeltaNet + Gated Attention + MoE) |
| Experts | 512 total, 10 active + 1 shared |
| Context Length | 262,144 tokens |
| Precision | BFloat16 |
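The "3B active of 80B" figure can be sanity-checked with some back-of-the-envelope arithmetic from the table above. The numbers below are assumptions derived from the table, not values read from the model config: 10 routed plus 1 shared expert fire per token out of 512, and non-expert weights (attention, embeddings) are always active, which is why the overall active share exceeds the raw expert fraction.

```python
# Figures from the architecture table above (assumed arithmetic, not config values).
total_experts = 512
active_experts = 10 + 1  # 10 routed + 1 shared per token

routed_fraction = active_experts / total_experts
print(f"{routed_fraction:.1%} of experts active per token")

# Attention, embeddings, and other non-expert weights are always active,
# pushing the overall active share up toward the advertised ~3B of 80B.
total_params, active_params = 80e9, 3e9
print(f"{active_params / total_params:.1%} of parameters active overall")
```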
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-Coder-Next (80B total, 3B active MoE) |
| Method | QLoRA (4-bit quantization during training) |
| LoRA Rank | 8 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj |
| Trainable Parameters | 2,064,384 / 79,676,455,680 (0.003%) |
| Training Examples | ~1,000 (903 train / 53 val / 48 test) |
| Epochs | 3 |
| Batch Size | 16 effective (1 per device × 16 gradient accumulation) |
| Learning Rate | 2e-4 (cosine schedule) |
| Optimizer | Paged AdamW 8-bit |
| Precision | BFloat16 |
| Hardware | NVIDIA H100 80GB HBM3 |
| Framework | Unsloth 2026.2.1 + PyTorch 2.6.0 |
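The trainable-parameter fraction in the table follows directly from the two parameter counts; a two-line check using the table's own figures:

```python
# Figures from the training table above
trainable = 2_064_384
total = 79_676_455_680

fraction = trainable / total
print(f"Trainable: {fraction:.4%} of all parameters")
# rounds to the table's "0.003%"
```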
### Loss Curve
| Step | Loss | Epoch |
|---|---|---|
| 50 | 0.3827 | 0.89 |
| 60 | 0.3216 | 1.05 |
| 70 | 0.2321 | 1.23 |
| 80 | 0.2227 | 1.41 |
| 90 | 0.2546 | 1.58 |
| 110 | 0.1795 | 1.94 |
| 120 | 0.2046 | 2.11 |
| 130 | 0.2135 | 2.28 |
| 150 | 0.2212 | 2.64 |
| 160 | 0.1703 | 2.82 |
## Training Data
Trained on ~1,000 curated instruction-response pairs (903 train / 53 val / 48 test) covering:
- Pokedex Knowledge — Stats, types, evolution chains for all 151 Gen 1 Pokemon
- Move Knowledge — Move stats, type effectiveness, PP management
- Battle Strategy — Type matchups, damage calculation, switch decisions
- Team Building — Optimal team compositions, coverage analysis
- Route Planning — Efficient progression through Kanto
- Gym Strategy — Leader teams, weaknesses, recommended counters
- Elite Four — Championship preparation and strategy
- Game Mechanics — Gen 1 quirks (badge boosts, Ghost/Psychic bug, crit formula, etc.)
- Speedrun Tactics — Optimized routing and execution
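The Gen 1 damage calculation mentioned under Game Mechanics follows a heavily floor-based formula documented by the community (including the pret/pokered disassembly). A simplified sketch, ignoring critical hits, the random 217–255/255 roll, and the 997-damage cap; the stat values in the example are illustrative, not taken from the game:

```python
def gen1_damage(level, power, attack, defense, stab=False, type_mult=1.0):
    """Simplified Gen 1 damage formula (no crit, no random roll, no cap)."""
    base = (2 * level // 5 + 2) * power * attack // defense // 50 + 2
    if stab:
        base = base * 3 // 2  # same-type attack bonus = 1.5x, integer math
    return int(base * type_mult)  # type effectiveness: 0, 0.25, 0.5, 1, 2, 4

# Illustrative: a STAB Ember (power 40) into a target that resists Fire (0.5x)
print(gen1_damage(level=22, power=40, attack=60, defense=50,
                  stab=True, type_mult=0.5))
```

Note how every division truncates, which is why low-level damage rolls in Gen 1 cluster around small integers.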
Data was sourced from PokeAPI, Bulbapedia, and the pret/pokered disassembly project, then converted into ChatML-formatted instruction pairs.
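The ChatML conversion step can be illustrated with a minimal sketch; the helper and example strings below are illustrative, not the dataset's actual schema:

```python
def to_chatml(system: str, user: str, assistant: str) -> str:
    """Render one instruction-response pair in ChatML."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

example = to_chatml(
    "You are the Strategic Commander for a Pokemon Red playthrough.",
    "What type is Gyarados and what are its weaknesses?",
    "Gyarados is Water/Flying. It takes 4x damage from Electric moves ...",
)
print(example)
```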
## Usage

### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "clarkkitchen22/Pokemon-Red-Qwen3-80B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("clarkkitchen22/Pokemon-Red-Qwen3-80B")

messages = [
    {"role": "system", "content": "You are the Strategic Commander for a Pokemon Red playthrough. Analyze the game state and provide optimal decisions."},
    {"role": "user", "content": "I'm about to fight Misty. My team is Charmeleon (lv 22) with Ember, Slash, Leer, Rage. What should I do?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With vLLM

```bash
vllm serve clarkkitchen22/Pokemon-Red-Qwen3-80B \
  --port 8000 \
  --tensor-parallel-size 2 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder
```
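Once the server is up, it exposes an OpenAI-compatible API on the chosen port. A minimal client sketch using only the standard library (the prompt text is illustrative):

```python
import json
from urllib import request

payload = {
    "model": "clarkkitchen22/Pokemon-Red-Qwen3-80B",
    "messages": [
        {"role": "user", "content": "Best counter for Brock's Onix at level 12?"}
    ],
    "temperature": 0.3,
    "max_tokens": 256,
}

req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Requires the `vllm serve` command above to be running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```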
### With SGLang

```bash
python -m sglang.launch_server \
  --model clarkkitchen22/Pokemon-Red-Qwen3-80B \
  --port 30000 \
  --tp-size 2 \
  --tool-call-parser qwen3_coder
```
### Convert to GGUF

You can quantize this model yourself using llama.cpp. Note that `convert_hf_to_gguf.py` does not emit K-quants directly; convert to a full-precision GGUF first, then quantize with `llama-quantize`:

```bash
# Pull the model
git lfs install
git clone https://huggingface.co/clarkkitchen22/Pokemon-Red-Qwen3-80B

# Build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build -j

# Convert to a BF16 GGUF
python convert_hf_to_gguf.py ../Pokemon-Red-Qwen3-80B \
  --outtype bf16 --outfile ../pokemon-red-bf16.gguf

# Quantize to Q4_K_M
./build/bin/llama-quantize ../pokemon-red-bf16.gguf ../pokemon-red-q4_k_m.gguf Q4_K_M
```
## Related Models
| Variant | Description | Link |
|---|---|---|
| Full Merged (this) | BFloat16 safetensors, 80B params | Pokemon-Red-Qwen3-80B |
| 4B + GGUF | Smaller model + Q4_K_M quantized | pokemon-red-commander-qwen3-4b |
## Intended Use
- Pokemon Red gameplay advisory and strategy analysis
- Educational demonstration of QLoRA fine-tuning on large MoE models
- Game AI research
## Limitations
- Trained exclusively on Gen 1 (Pokemon Red/Blue) data — may hallucinate about later generations
- Small training set (~1,000 examples) — responses may lack diversity
- Strategic advice quality depends on accurate game state description
- Not designed for direct game control — provides text-based recommendations only
- Full model requires significant VRAM (~160 GB for BF16, or use quantization/offloading)
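The VRAM figure follows directly from the parameter count (weights only, excluding KV cache and activations):

```python
params = 79_676_455_680  # from the training table
bytes_per_param = 2      # BFloat16

gb = params * bytes_per_param / 1e9
print(f"~{gb:.0f} GB of weights in BF16")  # ~159 GB, i.e. the ~160 GB figure
```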
## Citation

```bibtex
@misc{pokemon-red-commander-2026,
  title={Pokemon Red Strategic Commander},
  author={clarkkitchen22},
  year={2026},
  url={https://huggingface.co/clarkkitchen22/Pokemon-Red-Qwen3-80B}
}
```
## License
MIT