---
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
library_name: peft
tags:
- chess
- tinyllama
- lora
- json
- alpaca-format
- ai-tournament
- aura
---

# Konvah's Chess TinyLlama

This model is a LoRA fine-tune of [`TinyLlama/TinyLlama-1.1B-Chat-v1.0`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) built for the **Aura Chess AI Tournament**. Given a move history, the side to move, and a list of legal moves, it returns a chess move with a short rationale in JSON format.

---

## Model Objective

The model learns to:

- Choose the best legal move (`move`)
- Give a short explanation (`reasoning`) in ≤10 words
- Format responses as valid JSON
- Follow prompts wrapped in `[INST] ... [/INST]`

---

## Input Format

The model uses structured prompts:

```text
[INST]
You are a chess player.
{"moveHistory": ["e4", "e5", "Nf3"], "possibleMoves": ["Nc3", "Bc4", "d4"], "color": "w"}
[/INST]
```
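
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not the author's code: the helper name `build_prompt` is hypothetical, and accepting `"b"` for Black is an assumption, since the card only shows `"w"`.

```python
import json

# Instruction line taken from the example prompt in this card.
SYSTEM_LINE = "You are a chess player."


def build_prompt(move_history, possible_moves, color):
    """Wrap a position payload in the [INST] ... [/INST] template.

    `build_prompt` is a hypothetical helper; "b" as the value for Black
    is an assumption (the card only documents "w").
    """
    if color not in ("w", "b"):
        raise ValueError("color must be 'w' or 'b'")
    payload = json.dumps(
        {
            "moveHistory": list(move_history),
            "possibleMoves": list(possible_moves),
            "color": color,
        }
    )
    return f"[INST]\n{SYSTEM_LINE}\n{payload}\n[/INST]"
```

Calling `build_prompt(["e4", "e5", "Nf3"], ["Nc3", "Bc4", "d4"], "w")` should reproduce the example prompt, because `json.dumps` uses the same `", "` and `": "` separators by default.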

## Output Format

Always a single-line JSON object:

```json
{"move": "Bc4", "reasoning": "Develops bishop and targets f7"}
```

- The `move` must be from `possibleMoves`
- The `reasoning` is free-form but short
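
A small language model may not always honor these constraints, so callers will likely want to validate the output before playing the move. The sketch below shows one way to do that. The function name `parse_move` and the `(None, None)` failure convention are illustrative assumptions, not part of the model's interface.

```python
import json


def parse_move(generated_text, possible_moves):
    """Return (move, reasoning) from model output, or (None, None) if invalid.

    Hypothetical helper: extracts the first JSON object after the prompt
    and checks that its "move" is one of the legal moves.
    """
    # Drop the echoed prompt, if the decoded text still contains it.
    tail = generated_text.rsplit("[/INST]", 1)[-1]
    start = tail.find("{")
    if start == -1:
        return None, None
    try:
        # raw_decode parses one JSON value and ignores any trailing text.
        obj, _ = json.JSONDecoder().raw_decode(tail[start:])
    except json.JSONDecodeError:
        return None, None
    if not isinstance(obj, dict):
        return None, None
    move = obj.get("move")
    if move not in possible_moves:
        return None, None
    return move, obj.get("reasoning", "")
```

When the result is `(None, None)`, a caller could retry generation or fall back to any move from `possibleMoves`. Which fallback is appropriate depends on the tournament rules.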

## Training Details

- Base: TinyLlama-1.1B-Chat
- LoRA (8-bit): `q_proj`, `k_proj`, `v_proj`, `o_proj`
- Epochs: 3
- Dataset: ~70 samples from master-level PGNs
- Format: instruction-style examples, trained with `transformers.Trainer`
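
To illustrate what an instruction-style example could look like, here is a sketch that joins a prompt and its target JSON into one training string. The exact serialization used for this model is not documented in the card. The helper name `format_training_example`, the single space after `[/INST]`, and the word-count check are all assumptions.

```python
import json


def format_training_example(move_history, possible_moves, color, move, reasoning):
    """Build one prompt-plus-answer training string (hypothetical layout)."""
    if move not in possible_moves:
        raise ValueError("target move must be one of the legal moves")
    # The card asks for explanations of at most 10 words.
    if len(reasoning.split()) > 10:
        raise ValueError("reasoning must be at most 10 words")
    prompt_payload = json.dumps(
        {
            "moveHistory": list(move_history),
            "possibleMoves": list(possible_moves),
            "color": color,
        }
    )
    target = json.dumps({"move": move, "reasoning": reasoning})
    # Assumption: the answer follows the closing tag after a single space.
    return f"[INST]\nYou are a chess player.\n{prompt_payload}\n[/INST] {target}"
```

Rejecting illegal targets and over-long explanations when the data is built keeps the training set consistent with the output rules stated in this card.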

## Performance

| Metric      | Value |
| ----------- | ----- |
| Final loss  | 1.08  |
| Epochs      | 3     |
| Batch size  | 1     |
| Total steps | 51    |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer from the Hub.
# If the repository stores only LoRA adapter weights, `peft` must be installed.
model = AutoModelForCausalLM.from_pretrained("Konvah/chess-tinyllama")
tokenizer = AutoTokenizer.from_pretrained("Konvah/chess-tinyllama")

prompt = """[INST]
You are a chess player.
{"moveHistory": ["e4", "e5", "Nf3"], "possibleMoves": ["Nc3", "Bc4", "d4"], "color": "w"}
[/INST]"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

## License

Open for research and tournament evaluation. Not intended for production without additional safety testing.

## Author

Ismail Abubakar (@boringcrypto_)

Contact: abuismail842@gmail.com

## Aura Tournament

This model was created for the Aura Chess LLM Tournament to demonstrate reasoning and strategy prediction using open-source LLMs.

---