nph4rd/tiny-hanabi

nph4rd committed on Feb 21

Commit

ad995e5

1 Parent(s): 41ab4e6

update model

Files changed (2)

README.md

@@ -32,4 +32,4 @@ Play a simplified version of Hanabi with a trained AI model!
 - **Hint:** `1HR`, `1HG`, `1H1`, `1H2`, `1H3` - Tell the AI about their Red/Green cards or their 1s/2s/3s
 ## Model
-The AI uses [nph4rd/Qwen3-0.6B-Tiny-Hanabi-RL-300](https://huggingface.co/nph4rd/Qwen3-0.6B-Tiny-Hanabi-RL-300), a Qwen3-0.6B model fine-tuned with reinforcement learning on this Tiny Hanabi environment.
+The AI uses [nph4rd/Qwen3-1.7B-Tiny-Hanabi-XML-RL-12-2](https://huggingface.co/nph4rd/Qwen3-1.7B-Tiny-Hanabi-XML-RL-12-2), a Qwen3-1.7B model fine-tuned with reinforcement learning on this Tiny Hanabi environment.
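
The hint codes listed in the README context line (`1HR`, `1HG`, `1H1`, `1H2`, `1H3`) can be read as three-character strings. The sketch below is a hypothetical parser, not code from `app.py`: it assumes the leading digit is the target player's index, `H` marks a hint, and the last character is either a color key or a rank. `COLOR_NAMES` and `RANKS` are copied from the constants visible in the `app.py` diff; `parse_hint` is an illustrative name.

```python
# Constants as they appear in the app.py diff context.
COLOR_NAMES = {"R": "Red", "G": "Green"}
RANKS = (1, 2, 3)


def parse_hint(action: str):
    """Parse a hint code such as '1HR' or '1H2'.

    Returns (target_player, kind, value), where kind is "color" or "rank".
    ASSUMPTION: the first character is the target player's index; the
    README does not spell this out.
    """
    if len(action) != 3 or not action[0].isdigit() or action[1] != "H":
        raise ValueError(f"not a hint code: {action!r}")
    target = int(action[0])
    symbol = action[2]
    if symbol in COLOR_NAMES:
        return target, "color", COLOR_NAMES[symbol]
    if symbol.isdigit() and int(symbol) in RANKS:
        return target, "rank", int(symbol)
    raise ValueError(f"unknown hint value in {action!r}")


print(parse_hint("1HR"))
print(parse_hint("1H2"))
```

Rejecting anything outside `COLOR_NAMES` and `RANKS` keeps the parser in step with the game constants, so a code such as `1HB` or `1H4` fails loudly instead of producing an illegal hint.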

app.py

@@ -15,7 +15,7 @@ RANKS = (1, 2, 3)
 HAND_SIZE = 2
 MAX_INFO_TOKENS = 8
 MAX_LIFE_TOKENS = 3
-MODEL_ID = "nph4rd/Qwen3-0.6B-Tiny-Hanabi-RL-300"
+MODEL_ID = "nph4rd/Qwen3-1.7B-Tiny-Hanabi-XML-RL-12-2"
 COLOR_NAMES = {"R": "Red", "G": "Green"}
 COLOR_HEX = {"R": "#e63946", "G": "#2a9d8f"}
@@ -669,7 +669,7 @@ def load_model():
         tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
         model = AutoModelForCausalLM.from_pretrained(
             MODEL_ID,
-            torch_dtype=torch.float16,
+            torch_dtype=torch.float32,
             device_map="auto",
         )
         model.eval()
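
This commit changes both the model size and the load precision, so the memory needed for the weights grows on two counts. A rough back-of-the-envelope estimate, assuming the nominal parameter counts in the model names (0.6B and 1.7B) and counting only raw weights (no activations, KV cache, or framework overhead), might look like this. `weight_footprint_gb` is a hypothetical helper, not part of `app.py`.

```python
# Bytes per element for the two dtypes that appear in the diff.
BYTES_PER_PARAM = {"float16": 2, "float32": 4}


def weight_footprint_gb(num_params: float, dtype: str) -> float:
    """Approximate size of the raw weights in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# ASSUMPTION: nominal parameter counts taken from the model names,
# not measured from the checkpoints.
old_gb = weight_footprint_gb(0.6e9, "float16")  # before this commit
new_gb = weight_footprint_gb(1.7e9, "float32")  # after this commit

print(f"old: ~{old_gb:.1f} GB, new: ~{new_gb:.1f} GB")
```

Under these assumptions the weights go from roughly 1.2 GB to roughly 6.8 GB, which is worth checking against the memory available on the hardware the Space runs on.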