---
language:
  - en
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
  - lora
  - fine-tuned
  - rpg
  - basic-fantasy
  - bfrpg
  - tabletop
datasets:
  - custom
pipeline_tag: text-generation
---

# Phi-3 Mini 4K Instruct — BFRPG Fine-Tune

A fine-tuned version of Microsoft Phi-3 Mini 4K Instruct, trained on question-and-answer pairs covering the Thief class ability rules from the Basic Fantasy Role-Playing Game (BFRPG).

## Model Details

| Property | Value |
|---|---|
| Base Model | Microsoft Phi-3 Mini 4K Instruct |
| Parameters | ~3.8B |
| Fine-Tuning Method | LoRA SFT (merged) |
| Precision | bfloat16 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Epochs | 5 |
| Batch Size | 4 |
| Learning Rate | 2e-4 |
| Hardware | NVIDIA DGX Spark (GB10 Blackwell) |

## Training Data

Eight synthetic Q&A pairs generated from the Basic Fantasy RPG rulebook, focused on Thief class abilities (Open Locks, Pick Pockets, Move Silently, etc.). The data was generated with an LLM-based synthetic data generation pipeline that included a faithfulness-judging step to check answers against the source rules.
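Each training example follows the standard chat-messages schema used for supervised fine-tuning. The record below is purely illustrative — the real dataset's contents and exact schema are not published in this card:

```python
# Hypothetical SFT record in the "messages" schema; the actual
# dataset entries are not shown in this card.
example = {
    "messages": [
        {"role": "system", "content": "You are a rules expert for the Basic Fantasy Role-Playing Game. ..."},
        {"role": "user", "content": "What abilities does the Thief class have?"},
        {"role": "assistant", "content": "..."},
    ]
}
```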

The model uses the following system prompt:

```
You are a rules expert for the Basic Fantasy Role-Playing Game. Answer questions accurately based on the official rules. Be specific and cite page references or table values where possible.
```

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in bfloat16 to match the training precision.
model = AutoModelForCausalLM.from_pretrained(
    "FrankDigsData/phi3-mini-rhai-finetuned",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("FrankDigsData/phi3-mini-rhai-finetuned")

messages = [
    {"role": "system", "content": "You are a rules expert for the Basic Fantasy Role-Playing Game. Answer questions accurately based on the official rules."},
    {"role": "user", "content": "What is a level 5 Thief's Pick Pockets score?"}
]

inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Context

This model was fine-tuned as part of a Red Hat AI workshop comparing small model adaptation techniques across multiple architectures.