affine-l / README.md
penva's picture
Upload folder using huggingface_hub
e8be33f verified
metadata
license: apache-2.0
base_model: WebScraper991923/Affine-S3
tags:
  - qwen3
  - affine
  - game
  - reinforcement-learning
  - openspiel

Affine-S3-GAME-Improved

Fine-tuned version of WebScraper991923/Affine-S3 with improved GAME (OpenSpiel) performance for Bittensor Subnet 120 (Affine).

Model Details

  • Base Model: WebScraper991923/Affine-S3 (Qwen3-4B)
  • Training: LoRA fine-tuning on 7,071 MCTS-generated game examples
  • Target: Improved strategic game-playing for Affine evaluation

Training Details

  • Method: LoRA (r=32, alpha=32)
  • Data: 7,071 examples from MCTS self-play across 9 games:
    • checkers (2,702 examples)
    • gin_rummy (1,896 examples)
    • othello (1,209 examples)
    • quoridor, phantom_ttt, hex, dots_and_boxes, leduc_poker, liars_dice
  • Epochs: 2
  • Final Loss: 0.024

Performance

Benchmark Base Model This Model
GAME Accuracy ~30% 76%
LGC 99.9% 99.9% (preserved)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("altro/Affine-S3-GAME", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("altro/Affine-S3-GAME")

Affine Competition

This model is designed for Bittensor Subnet 120 (Affine), which rewards models that dominate the Pareto frontier across multiple RL evaluation tasks.