Magnus LoRA Adapter
A Low-Rank Adaptation (LoRA) adapter for the Llama-3.1-8B-Instruct model, fine-tuned for chess move prediction on a dataset of Magnus Carlsen's games. The adapter was trained on top of a base model quantized to 4 bits with bitsandbytes (BNB) and teaches it to predict moves from chess positions.
Model Details
- Base Model: unsloth/Llama-3.1-8B-Instruct-bnb-4bit
- Adapter Type: LoRA (Low-Rank Adaptation)
- Library: PEFT
- Training Framework: TRL + Unsloth
- License: Llama 3.1
Dataset
This adapter was trained on a specialized dataset compiled from Magnus Carlsen chess matches. The dataset contains:
First Training Attempt
- Total Moves: 1,123 moves from Magnus Carlsen's games
Second Training Attempt
- Total Moves: 2,145 moves from Magnus Carlsen's games
Dataset Features
- Instruction-based chess analysis tasks - Predicting moves from FEN (Forsyth-Edwards Notation) positions
- Real match positions - Game states from actual matches played by Magnus Carlsen
- Training examples - Position-move pairs representing Magnus's playing style and strategies
The dataset format follows the supervised fine-tuning (SFT) structure:
{
"instruction": "Predict Magnus Carlsen's next move from the given chess position.",
"input": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
"output": "e2e3"
}
This approach allows the model to learn Magnus Carlsen's distinctive playing patterns, move preferences, and strategic insights from his games.
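A record in this format has to be rendered as a single text prompt before it reaches the model. The sketch below is a minimal illustration, assuming an Alpaca-style template; the template text and the helper name `format_example` are hypothetical and are not taken from the original training script.

```python
# Hypothetical Alpaca-style template; the template actually used for
# training is not documented in this card.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)


def format_example(record: dict, include_output: bool = True) -> str:
    """Render one dataset record as a prompt string.

    With include_output=False the response slot is left empty, which is
    the form to use at inference time.
    """
    output = record["output"] if include_output else ""
    return PROMPT_TEMPLATE.format(
        instruction=record["instruction"],
        input=record["input"],
        output=output,
    )


record = {
    "instruction": "Predict Magnus Carlsen's next move from the given chess position.",
    "input": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    "output": "e2e3",
}

train_text = format_example(record)                        # full example for SFT
infer_text = format_example(record, include_output=False)  # prompt only
print(train_text)
```

Whatever template is used, the same one should be applied at training and inference time so the model sees prompts in the form it was tuned on.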
Why LoRA? - Benefits of This Approach
LoRA (Low-Rank Adaptation) offers significant advantages for fine-tuning large language models:
Efficiency
- Reduced Parameters: Only ~0.5-2% of the base model parameters need to be trained, dramatically reducing memory requirements
- Faster Training: Significantly faster training times compared to full fine-tuning
- Lower Cost: Enables fine-tuning on consumer-grade hardware (4-bit quantization compatible)
Flexibility & Modularity
- Composable Adapters: Multiple LoRA adapters can be applied or switched easily without retraining
- Storage Efficient: Adapter files are typically tens of megabytes, growing with rank and the number of target modules, vs. multi-gigabyte full model checkpoints
- Easy Distribution: Lightweight adapters can be easily shared and deployed
Performance
- Quality Retention: Maintains the base model's general capabilities while specializing for specific tasks
- Domain Adaptation: Effectively transfers knowledge from chess game data to instruction-following context
- Minimal Degradation: The base weights stay frozen and only low-rank updates are learned, which limits (but does not eliminate) catastrophic forgetting
Practical Advantages
- Multi-GPU Friendly: Works seamlessly with distributed training and inference
- Inference Speed: Small overhead during inference compared to the base model alone, and none if the adapter is merged into the base weights
- Compatibility: Works with existing PEFT infrastructure and Hugging Face ecosystem
Training Hyperparameters
Precision & Optimization
- Training Regime: bf16/fp16 mixed precision
- Optimizer: AdamW with scheduler
Training Schedule
- Steps: 269
- Epochs: 1
- Train Batch Size: 2 (per device)
- Learning Rate: Dynamic scheduler peaking at 2e-4
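The step count follows from the dataset size and the effective batch size. The sketch below is a rough consistency check, assuming one training example per move, a single device, and a gradient accumulation value of 4. The accumulation value is not stated in this card; it is chosen only because it reproduces the reported 269 steps. Trainers also differ in how they handle a final partial batch.

```python
import math


def steps_per_epoch(num_examples: int, per_device_batch: int,
                    grad_accum: int = 1, num_devices: int = 1) -> int:
    """Optimizer steps in one epoch, counting a final partial batch as a step."""
    effective_batch = per_device_batch * grad_accum * num_devices
    return math.ceil(num_examples / effective_batch)


# 2,145 examples and batch size 2 come from this card;
# grad_accum=4 and num_devices=1 are assumptions.
steps = steps_per_epoch(2145, per_device_batch=2, grad_accum=4)
print(steps)
```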
LoRA Configuration
- LoRA Rank (r): 16
- LoRA Alpha: 16
- LoRA Dropout: 0.0
- Target Modules:
  - q_proj (Query projection)
  - k_proj (Key projection)
  - v_proj (Value projection)
  - o_proj (Output projection)
  - gate_proj (Gate projection)
  - up_proj (Up projection)
  - down_proj (Down projection)
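How rank and alpha enter the computation can be shown on a toy example. The sketch below is a plain-Python illustration of the standard LoRA update, W' = W + (alpha / r) * B * A, on tiny matrices. The matrix sizes and values are made up, and actual training goes through PEFT rather than code like this. With r = 16 and alpha = 16 as listed above, the scaling factor alpha / r is 1.0.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, lora_a, lora_b, r, alpha):
    """Return W + (alpha / r) * B @ A, the weight after merging the adapter."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy shapes: a 2x3 frozen weight with a rank-1 adapter (illustrative only).
w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
lora_a = [[0.5, 0.5, 0.0]]   # r x in  = 1 x 3
lora_b = [[2.0], [0.0]]      # out x r = 2 x 1

merged = lora_effective_weight(w, lora_a, lora_b, r=1, alpha=1)
print(merged)
```

Only `lora_a` and `lora_b` are trained; `w` stays frozen, which is why the adapter is small relative to the base model.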
Training Results
First Training Attempt (Initial)
- Total Moves: 1,123
- Loss Progression: Started at 0.673 (step 1) → 0.782 (step 2) → 0.697 (step 3) through convergence
Second Training Attempt (Reused Weights)
- Total Moves: 2,145
- Trainable Parameters: 41,943,040 of 8,072,204,288 (0.52% trained)
- Steps: 269 total steps completed (1 epoch)
- Loss Progression: Step 266 (0.495) → Step 267 (0.462) → Step 268 (0.565) → Step 269 (0.795)
- Training Completion: All 269 steps completed with reused adapter weights; the last logged losses range from 0.462 to 0.795, so these values alone do not establish convergence
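The trainable-parameter figure can be reproduced from the LoRA configuration. The sketch below is a back-of-the-envelope check, assuming the published Llama 3.1 8B dimensions (hidden size 4096, MLP size 14336, 32 layers, key/value projection width 1024). These dimensions are not stated in this card.

```python
# Assumed Llama 3.1 8B shapes as (in_features, out_features); not from this card.
HIDDEN, KV_DIM, MLP, NUM_LAYERS = 4096, 1024, 14336, 32
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}
RANK = 16  # LoRA rank from this card


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A is (rank x in) and B is (out x rank), so rank * (in + out) weights."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
trainable = per_layer * NUM_LAYERS
total = 8_072_204_288  # total parameter count reported in this card

print(trainable)                          # 41943040
print(f"{100 * trainable / total:.2f}%")  # 0.52%
```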
Usage
Loading the Adapter
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model named in adapter_config.json and attach the LoRA weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    "path/to/magnus_lora_adapter",
    device_map="auto",
    torch_dtype="auto",
)
# The tokenizer is loaded from the base model.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Llama-3.1-8B-Instruct-bnb-4bit")
Inference
def generate_response(prompt):
    # Tokenize and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    # The decoded text contains the prompt followed by the completion.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

response = generate_response("Your prompt here")
print(response)
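For chess prompts, the generated text usually needs to be reduced to a single move. The sketch below is a minimal post-processing step, assuming the model answers in the coordinate notation used by the dataset example (such as e2e3). The helper name `extract_move` and the regular expression are illustrative and not part of this repository. It checks only the move's format, not its legality in the position.

```python
import re
from typing import Optional

# Coordinate move: from-square, to-square, optional promotion piece (e.g. e7e8q).
MOVE_PATTERN = re.compile(r"\b([a-h][1-8][a-h][1-8][qrbn]?)\b")


def extract_move(generated_text: str, prompt: str = "") -> Optional[str]:
    """Return the first coordinate-notation move found after the prompt."""
    # Decoded generate() output usually echoes the prompt, so drop it first.
    if prompt and generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    match = MOVE_PATTERN.search(generated_text)
    return match.group(1) if match else None


print(extract_move("The best move is e2e4."))
print(extract_move("No move here"))
```

A chess library could then verify that the extracted move is legal in the given position before it is used.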
Files in This Repository
- adapter_config.json - LoRA adapter configuration
- adapter_model.safetensors - Adapter weights in safetensors format
- tokenizer.json - Tokenizer vocabulary
- tokenizer_config.json - Tokenizer configuration
- special_tokens_map.json - Special tokens mapping
- training_args.bin - Training arguments
- chat_template.jinja - Chat template for inference
Requirements
torch>=2.0.0
transformers>=4.36.0
peft>=0.7.0
bitsandbytes>=0.41.0
unsloth
trl
Future Improvements & Optimization Potential
The adapter has been successfully trained on Magnus Carlsen's chess game dataset. Future enhancements could include:
Potential Enhancement Areas
- Extended Training: Training with multiple epochs or additional datasets could improve move prediction accuracy
- Larger Datasets: Incorporating additional Magnus Carlsen games or broader chess datasets could enhance pattern recognition
- Hyperparameter Tuning: Experimenting with different LoRA ranks (r), alpha values, and learning rates may yield better results
- Increased Batch Size: Training with larger batch sizes could improve convergence and model stability
- Multi-Phase Training: Implementing curriculum learning or progressive fine-tuning strategies
- Domain-Specific Evaluation: Using chess-specific metrics to validate and iteratively improve move prediction accuracy
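One simple chess-specific metric is exact-match accuracy between predicted and played moves. The sketch below is an illustration only: the function name `move_accuracy` and the sample data are hypothetical, and no evaluation results are reported for this adapter.

```python
from typing import Optional, Sequence


def move_accuracy(predicted: Sequence[Optional[str]],
                  played: Sequence[str]) -> float:
    """Fraction of positions where the predicted move equals the played move."""
    if len(predicted) != len(played):
        raise ValueError("predicted and played must have the same length")
    if not played:
        return 0.0
    hits = sum(p is not None and p.strip().lower() == t.strip().lower()
               for p, t in zip(predicted, played))
    return hits / len(played)


# Made-up example values, not results from this adapter.
predicted = ["e2e4", "g1f3", None, "d2d4"]
played = ["e2e4", "g1f3", "c2c4", "e2e3"]
print(move_accuracy(predicted, played))
```

Computing this on a held-out set of positions, rather than on training positions, gives a more honest picture than training loss alone.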
Recommendations for Further Development
- Train for multiple epochs with validation monitoring to assess convergence improvements
- Implement early stopping based on move prediction accuracy metrics
- Experiment with different learning rate schedules and warmup strategies
- Fine-tune on a curated dataset of high-rated games for better strategic learning
- Evaluate performance gains from retraining with different random seeds or augmented chess positions
Users interested in further improving these weights are encouraged to continue training with their own datasets and hyperparameters.
Inference Tips
- Use device_map="auto" for automatic device placement with quantized models
- The adapter is optimized for chess move prediction tasks
- Supports both CPU and GPU inference (GPU recommended for performance)
License
This adapter is licensed under the Llama 3.1 License. See the base model's license for details.