---
language:
  - en
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
  - lora
  - fine-tuned
  - rpg
  - basic-fantasy
  - bfrpg
  - tabletop
datasets:
  - custom
pipeline_tag: text-generation
---

# Phi-3 Mini 4K Instruct — BFRPG Fine-Tune

A fine-tuned version of Microsoft Phi-3 Mini 4K Instruct, trained on question-and-answer pairs covering the Thief class ability rules from the Basic Fantasy Role-Playing Game (BFRPG).

## Model Details

| Property | Value |
|---|---|
| Base Model | Microsoft Phi-3 Mini 4K Instruct |
| Parameters | ~3.8B |
| Fine-Tuning Method | LoRA SFT (merged) |
| Precision | bfloat16 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Epochs | 5 |
| Batch Size | 4 |
| Learning Rate | 2e-4 |
| Hardware | NVIDIA DGX Spark (GB10 Blackwell) |

## Training Data

Eight synthetic Q&A pairs generated from the Basic Fantasy RPG rulebook, focused on Thief class abilities (Open Locks, Pick Pockets, Move Silently, etc.). The data was generated with an LLM-based synthetic data generation pipeline that included a faithfulness-judging step to check answers against the source rules.
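Each training example follows the standard chat-messages schema used for supervised fine-tuning. The record below is purely illustrative — the real dataset's contents and exact schema are not published in this card:

```python
# Hypothetical SFT record in the "messages" schema; the actual
# dataset entries are not shown in this card.
example = {
    "messages": [
        {"role": "system", "content": "You are a rules expert for the Basic Fantasy Role-Playing Game. ..."},
        {"role": "user", "content": "What abilities does the Thief class have?"},
        {"role": "assistant", "content": "..."},
    ]
}
```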

The model uses the following system prompt:

```
You are a rules expert for the Basic Fantasy Role-Playing Game. Answer questions accurately based on the official rules. Be specific and cite page references or table values where possible.
```

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in bfloat16 to match the training precision.
model = AutoModelForCausalLM.from_pretrained(
    "FrankDigsData/phi3-mini-rhai-finetuned",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("FrankDigsData/phi3-mini-rhai-finetuned")

messages = [
    {"role": "system", "content": "You are a rules expert for the Basic Fantasy Role-Playing Game. Answer questions accurately based on the official rules."},
    {"role": "user", "content": "What is a level 5 Thief's Pick Pockets score?"}
]

inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Context

This model was fine-tuned as part of a Red Hat AI workshop comparing small model adaptation techniques across multiple architectures.