---
license: mit
datasets:
- hash-map/got_qa_pairs
language:
- en
base_model:
- google/gemma-2-2b-it
pipeline_tag: question-answering
library_name: peft
tags:
- got
- q&a
- RAG
- transformers
- peft
- bitsandbytes
---
# Game of Thrones Q&A Model (PEFT / QLoRA fine-tuned)
## 🧠 Model Overview
**Model name:** hash-map/got_model
**Base model:** `google/gemma-2-2b-it`
**Fine-tuning method:** QLoRA (via PEFT)
**Task:** Contextual Question Answering on *Game of Thrones*
**Summary:** A lightweight instruction-tuned question-answering model specialized in the *Game of Thrones* / *A Song of Ice and Fire* universe. It generates concise, faithful answers when given relevant context + a question.
**Description:**
This model was fine-tuned on the `hash-map/got_qa_pairs` dataset using QLoRA (4-bit quantization + Low-Rank Adaptation) to keep memory usage low while adapting the powerful `gemma-2-2b-it` model to answer questions about characters, events, houses, lore, battles, and plot points — **only when provided with relevant context**.
It is **not** a general-purpose LLM and performs poorly on questions without appropriate context or outside the GoT domain.
## 🧩 Intended Use
### Direct Use
- Answering factual questions about *Game of Thrones* when supplied with relevant book/show text chunks
- Building simple RAG-style (Retrieval-Augmented Generation) applications for GoT fans, wikis, quizzes, chatbots, etc.
- Educational tools, reading comprehension demos, or lore-exploration apps
### Out-of-Scope Use
- General-purpose chat or open-domain QA
- Questions about real history, other fictional universes, current events, politics, etc.
- High-stakes applications (legal, medical, safety-critical decisions)
- Generating creative fan-fiction or long-form narrative text (it is optimized for short factual answers)
## 📥 Context Retrieval Strategy (included in repo)
A simple **keyword-based lexical retrieval** system is provided to help select relevant context chunks:
```python
import re
import json
from collections import defaultdict, Counter

CHUNKS_FILE = "/kaggle/input/got-dataset/contexts.json"  # list of {text, source, chunk_id}

def tokenize(text):
    # Lowercase and keep alphabetic tokens of length >= 3
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

# Build an inverted index: token -> list of chunk ids containing it
contexts = []
token_to_ctx = defaultdict(list)

with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
    data = json.load(f)

for idx, item in enumerate(data):
    contexts.append(item)
    for tok in tokenize(item["text"]):
        token_to_ctx[tok].append(idx)

print(f"Indexed {len(contexts)} chunks")

def retrieve_2_contexts(question, token_to_ctx, contexts):
    # Score each chunk by how many question tokens it contains
    q_tokens = tokenize(question)
    scores = Counter()
    for tok in q_tokens:
        for ctx_id in token_to_ctx.get(tok, []):
            scores[ctx_id] += 1
    if not scores:
        return ""
    top_ids = [cid for cid, _ in scores.most_common(2)]
    return " ".join(contexts[cid]["text"] for cid in top_ids)
```
This is a basic sparse retrieval method (term-frequency scoring, i.e. TF-IDF without the IDF term).
For better retrieval, you can build a FAISS index over these same contexts and use dense (embedding-based) search instead.
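Before reaching for FAISS, a first improvement is to add IDF weighting so that rare, informative tokens outweigh common ones. The sketch below is illustrative (it is not part of the repo); `build_index` and `retrieve_top_k` are hypothetical names, and it uses only the standard library:

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

def build_index(chunks):
    """Build an inverted index and an IDF table over {"text": ...} chunks."""
    token_to_ctx = defaultdict(set)
    for idx, item in enumerate(chunks):
        for tok in tokenize(item["text"]):
            token_to_ctx[tok].add(idx)
    n = len(chunks)
    # Smoothed IDF: tokens appearing in fewer chunks get higher weight
    idf = {tok: math.log(1 + n / len(ids)) for tok, ids in token_to_ctx.items()}
    return token_to_ctx, idf

def retrieve_top_k(question, chunks, token_to_ctx, idf, k=2):
    scores = Counter()
    for tok in tokenize(question):
        for ctx_id in token_to_ctx.get(tok, ()):
            scores[ctx_id] += idf.get(tok, 0.0)
    top_ids = [cid for cid, _ in scores.most_common(k)]
    return " ".join(chunks[cid]["text"] for cid in top_ids)

# Toy example
chunks = [
    {"text": "Joffrey Baratheon was poisoned at his wedding feast."},
    {"text": "The wedding of Edmure Tully became known as the Red Wedding."},
    {"text": "Olenna Tyrell admitted to poisoning Joffrey."},
]
token_to_ctx, idf = build_index(chunks)
print(retrieve_top_k("Who poisoned Joffrey?", chunks, token_to_ctx, idf))
```

The retriever's interface stays the same, so it can be swapped in for `retrieve_2_contexts` without touching the generation code.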
## 🧑💻 How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Adapter repo (replace with your own if you forked it)
model_name = "hash-map/got_model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_name)

def answer_question(context: str, question: str, max_new_tokens=96) -> str:
    prompt = f"""Context:
{context}
Question:
{question}
Answer:"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for short factual answers
            eos_token_id=tokenizer.eos_token_id,
        )
    answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text after the final "Answer:" marker
    return answer.split("Answer:")[-1].strip()

# Example (uses the retriever defined above)
question = "Who killed Joffrey Baratheon?"
context = retrieve_2_contexts(question, token_to_ctx, contexts)
print(answer_question(context, question))
```
## ⚠️ Bias, Risks & Limitations
- **Domain limitation:** Extremely poor performance on non-GoT topics
- **Retrieval dependency:** Answers are only as good as the retrieved context — lexical method can miss semantically similar but lexically different passages
- **Hallucinations:** Can still invent facts when context is ambiguous, incomplete or contradictory
- **Toxicity & bias:** Inherits biases present in the base Gemma model + any biases in the GoT dataset (e.g. gender roles, violence portrayal typical of the series)
- **No safety tuning:** No built-in refusal or content filtering
- **Hugging Face token required:** the base `google/gemma-2-2b-it` repo is gated, so you need a Hugging Face access token with approved access to Gemma in order to download it
**Recommendations:**
- The provided contexts work as-is; you may swap in a different retriever, but keep the total context length under ~200 tokens
- Manually verify outputs for important use cases
- Consider adding a guardrail/moderation step in applications
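The ~200-token budget can be enforced roughly before prompting. The helper below is an illustrative sketch that uses whitespace word count as a cheap proxy for tokenizer tokens; for an exact count, tokenize the context with the model's own tokenizer instead:

```python
def truncate_context(context: str, max_tokens: int = 200) -> str:
    """Trim context to roughly max_tokens, using words as a proxy for tokens."""
    words = context.split()
    if len(words) <= max_tokens:
        return context
    return " ".join(words[:max_tokens])

# A 500-word context is cut down to the first 200 words
long_ctx = "word " * 500
print(len(truncate_context(long_ctx).split()))  # 200
```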
## 📚 Citation
```bibtex
@misc{got-qa-gemma2-2026,
author = {Appala Sai Sumanth},
title = {Gemma-2-2b-it Fine-tuned for Game of Thrones Question Answering},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/hash-map/got_model}}
}
```
## Framework versions
- `transformers` >= 4.42
- `peft` 0.13.2
- `torch` >= 2.1
- `bitsandbytes` >= 0.43 (for 4-bit inference if desired)
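If GPU memory is tight, the base model can also be loaded in 4-bit via `bitsandbytes` before attaching the adapter. This is an untested configuration sketch, assuming a CUDA-capable GPU and the versions listed above:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 quantization with bf16 compute, mirroring a typical QLoRA setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
# Then attach the adapter as shown in "How to Use":
# model = PeftModel.from_pretrained(base_model, "hash-map/got_model")
```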
---