Commit c783a58 by nuriyev · 1 Parent(s): b629477

Add README and initial implementation for Chess Reasoner app

Files changed (3)
  1. README.md +43 -2
  2. app.py +481 -0
  3. requirements.txt +7 -0
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: Chess
- emoji: 🦀
+ emoji: ♟️
  colorFrom: purple
  colorTo: gray
  sdk: gradio
@@ -9,6 +9,47 @@ app_file: app.py
  pinned: false
  license: mit
  short_description: Play against chess-playing reasoning LLM
+ models:
+ - nuriyev/chess-reasoner
+ datasets:
+ - nuriyev/chess-reasoning
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Chess Reasoner
+
+ Play chess against a reasoning LLM! This demo showcases **[nuriyev/chess-reasoner](https://huggingface.co/nuriyev/chess-reasoner)**, a Qwen3-4B model fine-tuned to output structured reasoning before selecting moves.
+
+ ## 🎮 How to Play
+
+ 1. **You play as White** - click on pieces to move them
+ 2. **AI plays as Black** - the model will respond with its move
+ 3. **View AI Reasoning** - expand the "🧠 AI Reasoning" accordion to see the model's thought process
+ 4. **AI First** - click this button if you want the AI to play White instead
+
+ ## 🧠 Model Details
+
+ | Attribute | Value |
+ |-----------|-------|
+ | Base Model | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) |
+ | Training | SFT with LoRA (r=32) on reasoning traces |
+ | Dataset | [nuriyev/chess-reasoning](https://huggingface.co/datasets/nuriyev/chess-reasoning) |
+ | Output Format | `<think>reasoning</think><uci_move>move</uci_move>` |
+
+ ## 📋 Output Format
+
+ The model outputs structured reasoning:
+ ```
+ <think>The opponent left their queen undefended. Taking it wins material.</think>
+ <uci_move>d4d8</uci_move>
+ ```
+
+ ## ⚠️ Limitations
+
+ This is an **SFT checkpoint** focused on format alignment. The model outputs valid reasoning but hasn't been optimized for chess strength via reinforcement learning yet. A GRPO stage using Stockfish rewards is planned.
+
+ ## 🔗 Links
+
+ - [Model Card](https://huggingface.co/nuriyev/chess-reasoner)
+ - [LoRA Adapter](https://huggingface.co/nuriyev/chess-reasoner-lora)
+ - [Training Dataset](https://huggingface.co/datasets/nuriyev/chess-reasoning)
+ - [Training Code](https://colab.research.google.com/drive/1koRx4Aa8AzA1HGwvEFYll9dWmw0hyVzo)
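
Review note on the README's "Output Format" section: the chat template in app.py ends the generation prompt with `<think>\n`, so a completion may start directly with the reasoning text and contain only the closing `</think>`. Below is a minimal sketch of a parser that accepts both shapes, using only the standard library. The helper name `parse_model_output` and the stricter UCI pattern are illustrative assumptions, not part of this commit.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper (not in the commit): parse one model completion.
# The opening <think> tag is optional because the prompt may already contain it.
THINK_RE = re.compile(r"^\s*(?:<think>)?(.*?)</think>", re.DOTALL)
# Assumption: moves are plain UCI such as "e2e4" or "e7e8q" (promotion suffix).
MOVE_RE = re.compile(r"<uci_move>\s*([a-h][1-8][a-h][1-8][qrbn]?)\s*</uci_move>")


def parse_model_output(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning, uci_move); either element is None when missing."""
    think = THINK_RE.search(text)
    move = MOVE_RE.search(text)
    reasoning = think.group(1).strip() if think else None
    return reasoning, (move.group(1) if move else None)


with_tag = "<think>Queen is loose.</think>\n<uci_move>d4d8</uci_move>"
without_tag = "Queen is loose.</think>\n<uci_move>d4d8</uci_move><|im_end|>"
print(parse_model_output(with_tag))
print(parse_model_output(without_tag))
```

The move pattern only checks the shape of the string. Whether the move is legal in the current position still has to be checked against the board, as app.py does with `board.legal_moves`.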
app.py ADDED
@@ -0,0 +1,481 @@
+ import re
+ import chess
+ import gradio as gr
+ from jinja2 import Template
+ from gradio_chessboard import Chessboard
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # ============================================================================
+ # Model Loading
+ # ============================================================================
+
+ MODEL_ID = "nuriyev/chess-reasoner"
+
+ print("Loading model...")
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+ model = AutoModelForCausalLM.from_pretrained(
+     MODEL_ID,
+     torch_dtype=torch.float16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+ model.eval()
+ print("Model loaded!")
+
+ # Custom chat template (matching training)
+ CHAT_TEMPLATE = """{%- if messages[0].role == 'system' %}
+ {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+ {%- endif %}
+ {%- for message in messages %}
+ {%- if message.role == 'user' or (message.role == 'system' and not loop.first) %}
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' }}
+ {%- elif message.role == 'assistant' %}
+ {{- '<|im_start|>assistant\n' + message.content + '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n<think>\n' }}
+ {%- endif %}"""
+
+ tokenizer.chat_template = CHAT_TEMPLATE
+
+ # ============================================================================
+ # Chess Rendering (matching training exactly)
+ # ============================================================================
+
+ UNICODE_PIECES = {
+     'P': '♙', 'R': '♖', 'N': '♘', 'B': '♗', 'Q': '♕', 'K': '♔',
+     'p': '♟', 'r': '♜', 'n': '♞', 'b': '♝', 'q': '♛', 'k': '♚',
+ }
+
+
+ def render_board_unicode(board: chess.Board) -> str:
+     """Render the chess board using Unicode pieces (matching training format)."""
+     lines = []
+     files = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
+     ranks = ['8', '7', '6', '5', '4', '3', '2', '1']
+
+     coord_parts = [f" {file} " for file in files]
+     coord_line = " " + "".join(coord_parts) + " "
+     lines.append(coord_line)
+
+     border_width = len(files) * 3
+     lines.append(" +" + "-" * border_width + "+")
+
+     for rank_idx, rank in enumerate(ranks):
+         line_parts = [f"{rank} |"]
+         for file_idx, file in enumerate(files):
+             square = chess.parse_square(file + rank)
+             piece = board.piece_at(square)
+             piece_char = ("·" if piece is None
+                           else UNICODE_PIECES[piece.symbol()])
+             line_parts.append(f" {piece_char} ")
+         line_parts.append(f"| {rank}")
+         lines.append("".join(line_parts))
+
+     lines.append(" +" + "-" * border_width + "+")
+     lines.append(coord_line)
+     return "\n".join(lines)
+
+
+ # ============================================================================
+ # Prompts (matching training exactly)
+ # ============================================================================
+
+ SYSTEM_PROMPT = """You are an expert chess player.
+
+ Given a current game state, you must select the best next move. Think in 1-2 sentences, then output your chosen move.
+
+ Output format:
+ <think>brief thinking (2 sentences max)</think>
+ <uci_move>your_move</uci_move>"""
+
+ USER_PROMPT = Template("""Here is the current game state
+ Board (Fen): {{ fen }}
+ Turn: It is your turn ({{ turn }})
+ Board (Unicode):
+ {{ board_utf }}""")
+
+
+ # ============================================================================
+ # Model Inference
+ # ============================================================================
+
+ def get_model_move(fen: str) -> tuple[str | None, str, str]:
+     """Get model's move for the given position. Returns (uci_move, reasoning, raw_output)."""
+     board = chess.Board(fen)
+     turn = "white" if board.turn else "black"
+
+     messages = [
+         {"role": "system", "content": SYSTEM_PROMPT},
+         {"role": "user", "content": USER_PROMPT.render(
+             fen=fen,
+             board_utf=render_board_unicode(board),
+             turn=turn,
+         )},
+     ]
+
+     text = tokenizer.apply_chat_template(
+         messages,
+         tokenize=False,
+         add_generation_prompt=True,
+     )
+
+     inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+     with torch.no_grad():
+         outputs = model.generate(
+             **inputs,
+             max_new_tokens=256,
+             temperature=0.7,
+             top_p=0.8,
+             top_k=20,
+             do_sample=True,
+             pad_token_id=tokenizer.pad_token_id,
+         )
+
+     generated = tokenizer.decode(
+         outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
+
+     # Parse the output; the generation prompt already ends with "<think>", so the opening tag may be absent here
+     think_match = re.search(r'^(?:\s*<think>)?(.*?)</think>', generated, re.DOTALL)
+     move_match = re.search(r'<uci_move>(.*?)</uci_move>', generated)
+
+     reasoning = (think_match.group(1).strip()
+                  if think_match else "No reasoning provided")
+     uci_move = move_match.group(1).strip() if move_match else None
+
+     # Clean up raw output for display
+     raw_output = generated.split('<|im_end|>')[0].strip()
+
+     return uci_move, reasoning, raw_output
+
+
+ # ============================================================================
+ # Game State
+ # ============================================================================
+
+ def create_initial_state():
+     return {
+         "board": chess.Board(),
+         "history": [],
+         "last_reasoning": "",
+         "last_raw_output": "",
+         "game_over": False,
+         "result": "",
+     }
+
+
+ # ============================================================================
+ # Game Logic
+ # ============================================================================
+
+ def make_player_move(fen: str, state: dict) -> tuple[str, dict, str, str, str]:
+     """Handle a board change; ignore updates that merely echo the tracked position."""
+     if state["game_over"] or chess.Board(fen) == state["board"]:
+         return state["board"].fen(), state, get_status(state), state["last_reasoning"], state["last_raw_output"]
+
+     board = chess.Board(fen)
+     state["board"] = board
+     state["history"].append(fen)
+
+     # Check if game is over after player move
+     if board.is_game_over():
+         state["game_over"] = True
+         state["result"] = get_game_result(board)
+         return board.fen(), state, get_status(state), state["last_reasoning"], state["last_raw_output"]
+
+     # AI's turn (Black by default, White after "AI First")
+     if board.turn == state.get("ai_color", chess.BLACK):
+         uci_move, reasoning, raw_output = get_model_move(board.fen())
+         state["last_reasoning"] = reasoning
+         state["last_raw_output"] = raw_output
+
+         if uci_move:
+             try:
+                 move = chess.Move.from_uci(uci_move)
+                 if move in board.legal_moves:
+                     board.push(move)
+                     state["board"] = board
+                     state["history"].append(board.fen())
+                 else:
+                     # No fallback is implemented: report the illegal move and keep the position
+                     state["last_reasoning"] = f"Model suggested illegal move: {uci_move}. " + reasoning
+             except ValueError:
+                 state["last_reasoning"] = f"Model output invalid move format: {uci_move}. " + reasoning
+
+         # Check if game is over after AI move
+         if board.is_game_over():
+             state["game_over"] = True
+             state["result"] = get_game_result(board)
+
+     return board.fen(), state, get_status(state), state["last_reasoning"], state["last_raw_output"]
+
+
+ def get_game_result(board: chess.Board) -> str:
+     """Get the game result string."""
+     if board.is_checkmate():
+         winner = "Black" if board.turn else "White"
+         return f"Checkmate! {winner} wins!"
+     elif board.is_stalemate():
+         return "Stalemate! It's a draw."
+     elif board.is_insufficient_material():
+         return "Draw by insufficient material."
+     elif board.is_fifty_moves():
+         return "Draw by fifty-move rule."
+     elif board.is_repetition():
+         return "Draw by repetition."
+     return "Game Over"
+
+
+ def get_status(state: dict) -> str:
+     """Get current game status."""
+     if state["game_over"]:
+         return f"🏁 {state['result']}"
+
+     board, ai_color = state["board"], state.get("ai_color", chess.BLACK)
+     turn = f"{'White' if board.turn else 'Black'} ({'AI' if board.turn == ai_color else 'You'})"
+
+     status = f"**Turn:** {turn}"
+     if board.is_check():
+         status += " ⚠️ **CHECK!**"
+
+     move_count = len(state["history"])
+     status += f"\n**Move:** {move_count // 2 + 1}"
+
+     return status
+
+
+ def new_game() -> tuple[str, dict, str, str, str]:
+     """Start a new game."""
+     state = create_initial_state()
+     return state["board"].fen(), state, get_status(state), "", ""
+
+
+ def ai_first_move(state: dict) -> tuple[str, dict, str, str, str]:
+     """Let the AI make the first move (the AI then plays White)."""
+     board = state["board"]
+
+     if len(state["history"]) > 0:
+         return board.fen(), state, get_status(state) + "\n⚠️ Game already started!", state["last_reasoning"], state["last_raw_output"]
+
+     uci_move, reasoning, raw_output = get_model_move(board.fen())
+     state["last_reasoning"] = reasoning
+     state["last_raw_output"] = raw_output
+
+     if uci_move:
+         try:
+             move = chess.Move.from_uci(uci_move)
+             if move in board.legal_moves:
+                 board.push(move)
+                 state["ai_color"] = chess.WHITE
+                 state["history"].append(board.fen())
+         except ValueError:
+             pass
+
+     return board.fen(), state, get_status(state), reasoning, raw_output
+
+
+ # ============================================================================
+ # Custom CSS for chess.com-like appearance
+ # ============================================================================
+
+ CUSTOM_CSS = """
+ /* Main container */
+ .gradio-container {
+     max-width: 1200px !important;
+     margin: auto !important;
+     font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif !important;
+ }
+
+ /* Header styling */
+ .header-title {
+     text-align: center;
+     color: #769656;
+     font-size: 2.5em;
+     font-weight: bold;
+     margin-bottom: 0.2em;
+     text-shadow: 2px 2px 4px rgba(0,0,0,0.1);
+ }
+
+ .header-subtitle {
+     text-align: center;
+     color: #666;
+     font-size: 1.1em;
+     margin-bottom: 1em;
+ }
+
+ /* Game panel */
+ .game-panel {
+     background: linear-gradient(145deg, #312e2b, #272522);
+     border-radius: 12px;
+     padding: 20px;
+     box-shadow: 0 8px 32px rgba(0,0,0,0.3);
+ }
+
+ /* Status box */
+ .status-box {
+     background: #1a1916;
+     border-radius: 8px;
+     padding: 15px;
+     color: #fff;
+     font-size: 1.1em;
+     border-left: 4px solid #769656;
+ }
+
+ /* Reasoning box */
+ .reasoning-box {
+     background: #262421;
+     border-radius: 8px;
+     padding: 15px;
+     color: #bababa;
+     font-family: 'Courier New', monospace;
+     font-size: 0.95em;
+     max-height: 200px;
+     overflow-y: auto;
+ }
+
+ /* Buttons */
+ .game-button {
+     background: #769656 !important;
+     color: white !important;
+     border: none !important;
+     border-radius: 6px !important;
+     padding: 10px 20px !important;
+     font-weight: bold !important;
+     transition: all 0.2s ease !important;
+ }
+
+ .game-button:hover {
+     background: #8bac6a !important;
+     transform: translateY(-1px) !important;
+ }
+
+ .secondary-button {
+     background: #4a4745 !important;
+     color: #bababa !important;
+ }
+
+ .secondary-button:hover {
+     background: #5a5755 !important;
+ }
+
+ /* Accordion */
+ .reasoning-accordion {
+     background: #1a1916 !important;
+     border: 1px solid #333 !important;
+     border-radius: 8px !important;
+ }
+
+ /* Footer */
+ .footer-text {
+     text-align: center;
+     color: #666;
+     font-size: 0.9em;
+     margin-top: 1em;
+ }
+ """
+
+ # ============================================================================
+ # Gradio Interface
+ # ============================================================================
+
+ with gr.Blocks(css=CUSTOM_CSS, title="Chess Reasoner", theme=gr.themes.Soft(
+     primary_hue="green",
+     secondary_hue="gray",
+     neutral_hue="gray",
+ )) as demo:
+
+     # State
+     game_state = gr.State(create_initial_state)
+
+     # Header
+     gr.HTML("""
+     <div class="header-title">♟️ Chess Reasoner</div>
+     <div class="header-subtitle">Play chess against a reasoning AI • You play as White</div>
+     """)
+
+     with gr.Row():
+         # Left: Chessboard
+         with gr.Column(scale=3):
+             chessboard = Chessboard(
+                 value=chess.STARTING_FEN,
+                 label="",
+                 interactive=True,
+                 game_mode=True,
+             )
+
+         # Right: Game controls and info
+         with gr.Column(scale=2):
+             with gr.Group(elem_classes="game-panel"):
+                 # Status
+                 gr.Markdown("### 📊 Game Status")
+                 status_display = gr.Markdown(
+                     value="**Turn:** White (You)\n**Move:** 1",
+                     elem_classes="status-box"
+                 )
+
+                 gr.Markdown("---")
+
+                 # Controls
+                 with gr.Row():
+                     new_game_btn = gr.Button(
+                         "🔄 New Game", elem_classes="game-button", size="lg")
+                     ai_first_btn = gr.Button(
+                         "🤖 AI First", elem_classes="secondary-button", size="lg")
+
+                 gr.Markdown("---")
+
+                 # AI Reasoning (collapsible)
+                 with gr.Accordion("🧠 AI Reasoning", open=True, elem_classes="reasoning-accordion"):
+                     reasoning_display = gr.Textbox(
+                         value="",
+                         label="Thinking",
+                         lines=3,
+                         interactive=False,
+                         elem_classes="reasoning-box"
+                     )
+
+                 with gr.Accordion("📝 Raw Output", open=False):
+                     raw_output_display = gr.Textbox(
+                         value="",
+                         label="Model Output",
+                         lines=5,
+                         interactive=False,
+                         elem_classes="reasoning-box"
+                     )
+
+     # Footer
+     gr.HTML("""
+     <div class="footer-text">
+         Model: <a href="https://huggingface.co/nuriyev/chess-reasoner" target="_blank">nuriyev/chess-reasoner</a>
+         • Fine-tuned from Qwen3-4B-Instruct • SFT Phase 1
+     </div>
+     """)
+
+     # Event handlers
+     chessboard.change(
+         fn=make_player_move,
+         inputs=[chessboard, game_state],
+         outputs=[chessboard, game_state, status_display,
+                  reasoning_display, raw_output_display],
+     )
+
+     new_game_btn.click(
+         fn=new_game,
+         inputs=[],
+         outputs=[chessboard, game_state, status_display,
+                  reasoning_display, raw_output_display],
+     )
+
+     ai_first_btn.click(
+         fn=ai_first_move,
+         inputs=[game_state],
+         outputs=[chessboard, game_state, status_display,
+                  reasoning_display, raw_output_display],
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch()
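
Review note on illegal model moves: `make_player_move` reports an illegal or malformed move but leaves the position unchanged, so it stays the AI's turn. One possible fallback, sketched here under assumptions, prefers a legal move from the same origin square and otherwise picks a deterministic legal move. `resolve_move` is a hypothetical helper, not part of this commit. It works on plain UCI strings so it can be tried without loading the model or python-chess.

```python
from typing import Optional, Sequence


def resolve_move(candidate: Optional[str], legal_uci: Sequence[str]) -> Optional[str]:
    """Return a legal UCI move, preferring the model's own suggestion.

    Hypothetical helper: `legal_uci` is the list of legal moves in UCI
    notation for the current position.
    """
    if not legal_uci:
        return None  # no legal moves: the game is already over
    if candidate in legal_uci:
        return candidate  # the suggestion is legal, play it as-is
    # Fallback 1: keep the piece the model wanted to move (same origin square).
    if candidate and len(candidate) >= 2:
        same_origin = sorted(m for m in legal_uci if m[:2] == candidate[:2])
        if same_origin:
            return same_origin[0]
    # Fallback 2: any legal move, chosen deterministically for reproducibility.
    return sorted(legal_uci)[0]


legal = ["e7e5", "e7e6", "g8f6", "b8c6"]
print(resolve_move("e7e5", legal))  # legal suggestion is kept
print(resolve_move("e7e4", legal))  # illegal, falls back to a move from e7
print(resolve_move(None, legal))    # nothing parsed, falls back to any legal move
```

In app.py the list could be built with `[m.uci() for m in board.legal_moves]`. Any substituted move should probably be shown in the reasoning panel so the player can tell it was not the model's own choice.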
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ gradio>=4.0.0
+ transformers>=4.56.2
+ torch>=2.9.0
+ accelerate>=1.12.0
+ chess>=1.11.2
+ jinja2>=3.1.6
+ gradio_chessboard==0.0.10