Poker & Blackjack AI: Gemma 4 E4B LoRA

Fine-tuned Gemma 4 E4B (7.5B dense) for poker and blackjack decision-making.

What This Model Does

Given a poker or blackjack game state, the model outputs the optimal action as JSON.
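For illustration, here is what such an exchange might look like. The card does not publish the exact input/output schema, so the field names below are hypothetical; only the "optimal action as JSON" behavior comes from the description above.

```python
import json

# Hypothetical game state -- field names are illustrative, not the
# model's documented schema.
game_state = {
    "game": "blackjack",
    "player_hand": ["K♠", "6♦"],   # hard 16
    "dealer_upcard": "10♥",
}

# The model replies with a single JSON object naming the chosen action,
# which the caller can parse directly.
raw_model_output = '{"action": "hit", "reason": "hard 16 vs dealer 10"}'

decision = json.loads(raw_model_output)
print(decision["action"])  # -> hit
```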

Training Details

  • Base model: unsloth/gemma-4-E4B-it (7.5B dense)
  • Method: LoRA (r=16, alpha=32)
  • Data: 12,848 examples (3,072 poker + 9,776 blackjack)
  • Training: 3 epochs on NVIDIA RTX 3090 24GB
  • Final metrics: Loss 0.099, Token accuracy 96.4%
  • Cost: ~$1.32 on RunPod
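The LoRA hyperparameters above imply a scaling factor of alpha/r = 32/16 = 2.0 on the adapter update. A minimal NumPy sketch of that update (dimensions are illustrative; the actual training used the Unsloth/PEFT stack):

```python
import numpy as np

# LoRA update as used in this run: the frozen base weight W is perturbed
# by a low-rank product B @ A, scaled by alpha / r (= 2.0 for r=16, alpha=32).
d_out, d_in, r, alpha = 64, 64, 16, 32

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, rank r
B = np.zeros((d_out, r))                   # trainable, initialised to zero

W_effective = W + (alpha / r) * (B @ A)

# With B initialised to zero, the adapter starts as an exact no-op.
print(np.allclose(W_effective, W))  # -> True
```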

Arena Results (1000 hands poker)

  • BB/100: -0.1 (breakeven over 1000 hands)
  • VPIP: 80.5% (plays too many hands; GRPO fix planned)
  • Beats CallingStation, survives against ExploitBot/NitBot
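For readers unfamiliar with the metrics: BB/100 is net winnings measured in big blinds, normalised per 100 hands, and VPIP is the percentage of hands in which the player voluntarily put money in the pot. A small sketch with illustrative numbers:

```python
def bb_per_100(net_big_blinds: float, hands: int) -> float:
    """Net winnings in big blinds, normalised per 100 hands."""
    return net_big_blinds / hands * 100

def vpip(hands_voluntarily_played: int, hands_dealt: int) -> float:
    """Percentage of dealt hands the player voluntarily entered."""
    return hands_voluntarily_played / hands_dealt * 100

# -0.1 BB/100 over 1000 hands corresponds to losing 1 big blind in total.
print(round(bb_per_100(-1.0, 1000), 2))  # -> -0.1
# A VPIP of 80.5% means entering 805 of 1000 dealt hands.
print(round(vpip(805, 1000), 1))  # -> 80.5
```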

Usage

# Quantize to GGUF Q4_K_M (~5GB), serve with llama.cpp:
llama-server --model gemma4-poker-q4_k_m.gguf --port 8080 --n-gpu-layers 999 --jinja

To disable thinking mode, include the following in the request body: {"chat_template_kwargs": {"enable_thinking": false}}
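Once llama-server is running, it exposes an OpenAI-compatible endpoint at http://localhost:8080/v1/chat/completions. A sketch of a request payload carrying the chat_template_kwargs field above (the prompt wording is illustrative; the payload is built but not sent here, since it assumes a running server):

```python
import json

# Request body for llama-server's OpenAI-compatible chat endpoint.
# Only the chat_template_kwargs field comes from the card; the model
# name and prompt are illustrative.
payload = {
    "model": "gemma4-poker-q4_k_m",
    "messages": [
        {
            "role": "user",
            "content": "Blackjack: player has hard 16, dealer shows 10. "
                       "Respond with a JSON action.",
        }
    ],
    "chat_template_kwargs": {"enable_thinking": False},
}

body = json.dumps(payload)
print("enable_thinking" in body)  # -> True
```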
