---
library_name: peft
tags:
- elden-ring
- question-answering
- gaming
- domain-specific
- qlora
- lora
- phi-2
base_model: microsoft/phi-2
license: cc-by-sa-4.0
language:
- en
pipeline_tag: text-generation
---
# πŸ—‘οΈ Elden Ring QA β€” Phi-2 QLoRA Adapter
A QLoRA fine-tuned adapter for [Microsoft Phi-2](https://huggingface.co/microsoft/phi-2) (2.7B) trained on a custom Elden Ring question-answering dataset. The model answers questions about weapons, bosses, spells, NPCs, locations, armor, and creatures β€” including boss vulnerability analysis and per-build weapon recommendations.
## Model Details
- **Base model:** [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) (2.7B parameters)
- **Fine-tuning method:** QLoRA (4-bit NF4 quantization + LoRA adapters)
- **LoRA rank:** 8
- **LoRA alpha:** 16
- **LoRA target modules:** `q_proj`, `k_proj`, `v_proj`, `dense`
- **Trainable parameters:** ~5.2M (0.34% of total)
- **Adapter size:** 21 MB
- **Training data:** [ArenaRune/elden-ring-qa-dataset](https://huggingface.co/datasets/ArenaRune/elden-ring-qa-dataset)
- **Language:** English
- **Developed by:** ArenaRune
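The ~5.2M trainable-parameter figure can be sanity-checked from Phi-2's architecture. This is a back-of-the-envelope sketch assuming a hidden size of 2560, 32 decoder layers, and that each of the four target modules is a 2560 × 2560 linear layer (assumptions about Phi-2's internals, not stated in this card):

```python
# Rough check of the trainable-parameter count under the assumptions above.
hidden = 2560          # assumed Phi-2 hidden size
layers = 32            # assumed number of decoder layers
rank = 8               # LoRA rank from this card
target_modules = 4     # q_proj, k_proj, v_proj, dense

# Each LoRA-wrapped linear layer adds two low-rank matrices:
# A (rank x d_in) and B (d_out x rank).
params_per_module = rank * (hidden + hidden)
total_trainable = params_per_module * target_modules * layers
print(total_trainable)  # 5242880, i.e. ~5.2M, matching the figure above
```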
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Quantization config (must match training)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load base model + adapter
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "ArenaRune/elden-ring-phi2-qlora")
tokenizer = AutoTokenizer.from_pretrained("ArenaRune/elden-ring-phi2-qlora")
model.eval()

# Ask a question
prompt = """### Instruction:
What weapons are good against Mohg, Lord of Blood?
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        repetition_penalty=1.5,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
    )
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```
## Prompt Format
The model expects this instruction template:
```
### Instruction:
{your question about Elden Ring}
### Response:
```
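A minimal helper for filling the template (the function name is illustrative, not part of this repo):

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the instruction template the adapter was trained on."""
    return f"### Instruction:\n{question}\n### Response:\n"

prompt = build_prompt("What weapons are good against Mohg, Lord of Blood?")
print(prompt)
```

Deviating from this template tends to degrade output quality, since the adapter only ever saw this exact framing during training.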
## Training Details
### Training Data
Custom dataset built from 3 public sources:
- **Kaggle** β€” Ultimate Elden Ring with Shadow of the Erdtree DLC (12 structured CSVs)
- **GitHub** β€” [Impalers-Archive](https://github.com/ividyon/Impalers-Archive) (DLC text dump)
- **GitHub** β€” [Carian-Archive](https://github.com/AsteriskAmpersand/Carian-Archive) (base game text dump)
Dataset covers 10 entity types (weapons, bosses, armors, spells, NPCs, locations, creatures, skills, ashes of war) with 20+ question categories including cross-entity boss vulnerability analysis and per-build weapon recommendations.
Full dataset: [ArenaRune/elden-ring-qa-dataset](https://huggingface.co/datasets/ArenaRune/elden-ring-qa-dataset)
### Training Procedure
- **Framework:** HuggingFace Transformers + PEFT
- **Method:** QLoRA (4-bit NF4 quantization + LoRA)
- **Precision:** FP16 mixed precision
- **Optimizer:** Paged AdamW 8-bit
- **LR schedule:** Cosine with 10% warmup
- **GPU:** NVIDIA A100 (80GB)
- **Platform:** Google Colab
### Training Hyperparameters
| Parameter | Value |
|-----------|-------|
| Learning rate | 2e-4 |
| LoRA rank (r) | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.1 |
| Epochs | 3 |
| Batch size (effective) | 16 (8 Γ— 2 grad accum) |
| Max sequence length | 512 |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
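The table above corresponds roughly to the following PEFT/Transformers configuration. This is a sketch reconstructed from the listed hyperparameters, not the published training script:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="elden-ring-phi2-qlora",  # illustrative path
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,   # effective batch size 16
    weight_decay=0.01,
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
    fp16=True,
    optim="paged_adamw_8bit",
)
```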
### Hyperparameter Search
Three configurations were tested:
| Config | LR | Rank | Alpha | Description |
|--------|-----|------|-------|-------------|
| **A (selected)** | **2e-4** | **8** | **16** | **Conservative β€” fast convergence** |
| B | 1e-4 | 16 | 32 | Balanced |
| C | 5e-5 | 32 | 64 | Aggressive β€” high capacity |
Config A achieved the lowest validation loss. The higher-rank configs (B and C) appeared to underfit: at their lower learning rates, three epochs did not provide enough training steps to converge.
## Evaluation
### Metrics
Evaluated on 100 held-out test examples against unmodified Phi-2 baseline using:
- **ROUGE-1/2/L** β€” n-gram overlap (lexical similarity)
- **BERTScore F1** β€” semantic similarity via RoBERTa-Large embeddings
Key finding: significant ROUGE-2 improvement over baseline, confirming domain vocabulary acquisition. The model learned Elden Ring terminology and response structure. See the training notebook for exact metrics and visualizations.
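ROUGE-2, the metric with the largest gain, measures bigram overlap between a candidate answer and a reference. A minimal pure-Python sketch of its F1 score (the actual evaluation presumably used a library implementation; the example strings are illustrative):

```python
def bigrams(text):
    """Lowercased, whitespace-tokenized bigrams of a string."""
    toks = text.lower().split()
    return [tuple(toks[i:i + 2]) for i in range(len(toks) - 1)]

def rouge2_f1(candidate: str, reference: str) -> float:
    """F1 over bigram overlap, with each reference bigram matched at most once."""
    cand, ref = bigrams(candidate), bigrams(reference)
    if not cand or not ref:
        return 0.0
    ref_counts = {}
    for bg in ref:
        ref_counts[bg] = ref_counts.get(bg, 0) + 1
    overlap = 0
    for bg in cand:
        if ref_counts.get(bg, 0) > 0:
            overlap += 1
            ref_counts[bg] -= 1
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = rouge2_f1(
    "Rivers of Blood scales with arcane and causes hemorrhage",
    "Rivers of Blood scales with arcane and inflicts hemorrhage buildup",
)
print(round(score, 3))  # 0.706
```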
### What the Model Learned
- Elden Ring domain vocabulary (Hemorrhage, Scarlet Rot, Frostbite, damage negation, FP cost)
- Entity type awareness (distinguishes weapons, bosses, spells, NPCs)
- Structured response formatting ("The {weapon} requires {X} Str, {Y} Dex to wield")
- Build archetype understanding (strength, dexterity, intelligence, faith, arcane)
### Known Limitations
- **Factual hallucination:** The model learned the correct output format but hallucinates specific values (wrong stat numbers, incorrect skill names, approximate weights). This is likely because a rank-8 adapter lacks the capacity to memorize entity-specific facts across hundreds of items.
- **Repetitive generation:** Some outputs may loop despite anti-repetition measures. Use `repetition_penalty=1.5` and `no_repeat_ngram_size=3`.
- **Cross-entity confusion:** May attribute one entity's properties to another similar entity.
### Recommended Improvement: RAG
The combination of strong domain fluency and factual hallucination makes this model a natural fit for **Retrieval-Augmented Generation**: retrieve entity data from the enriched dataset at inference time and inject it as context. The model already knows how to format the data; RAG ensures it has the correct facts.
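A minimal sketch of that pattern, assuming the instruction template above. The keyword "retriever", the store, and its entry are illustrative placeholders, not the actual dataset schema (a real setup would use embedding or keyword search over the dataset):

```python
def retrieve(question: str, store: dict) -> str:
    """Naive keyword retrieval: return facts for any entity named in the question."""
    q = question.lower()
    hits = [facts for name, facts in store.items() if name in q]
    return "\n".join(hits)

def build_rag_prompt(question: str, store: dict) -> str:
    """Inject retrieved facts into the instruction template as context."""
    context = retrieve(question, store)
    return (
        "### Instruction:\n"
        f"Using the following facts:\n{context}\n"
        f"{question}\n"
        "### Response:\n"
    )

# Placeholder store entry, not verified game data
store = {"moonveil": "Moonveil: katana scaling with dexterity and intelligence."}
print(build_rag_prompt("What build suits Moonveil?", store))
```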
## Uses
### Intended Uses
- Elden Ring game knowledge QA
- Demonstrating QLoRA fine-tuning on domain-specific data
- Base for RAG-augmented game assistant systems
- Educational reference for parameter-efficient fine-tuning
### Out-of-Scope Uses
- Factual reference without verification (values may be hallucinated)
- Commercial game guide products
- General-purpose question answering outside Elden Ring
## Environmental Impact
- **Hardware:** NVIDIA A100 (40GB)
- **Training time:** ~48 minutes (3 configs Γ— ~16 min each)
- **Cloud provider:** Google Colab
## Citation
```bibtex
@misc{eldenring-phi2-qlora-2026,
author = {ArenaRune},
title = {Elden Ring QA β€” Phi-2 QLoRA Adapter},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/ArenaRune/elden-ring-phi2-qlora}
}
```