|
|
--- |
|
|
license: mit |
|
|
datasets: |
|
|
- hash-map/got_qa_pairs |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- google/gemma-2-2b-it |
|
|
pipeline_tag: question-answering |
|
|
library_name: peft |
|
|
tags: |
|
|
- got |
|
|
- q&a |
|
|
- RAG |
|
|
- transformers |
|
|
- peft |
|
|
- bitsandbytes |
|
|
--- |
|
|
|
|
|
# Game of Thrones Q&A Model (PEFT / QLoRA fine-tuned) |
|
|
|
|
|
## 🧠 Model Overview |
|
|
|
|
|
**Model name:** hash-map/got_model |
|
|
**Base model:** `google/gemma-2-2b-it` |
|
|
**Fine-tuning method:** QLoRA (via PEFT) |
|
|
**Task:** Contextual Question Answering on *Game of Thrones* |
|
|
**Summary:** A lightweight instruction-tuned question-answering model specialized in the *Game of Thrones* / *A Song of Ice and Fire* universe. It generates concise, faithful answers when given relevant context + a question. |
|
|
|
|
|
**Description:** |
|
|
This model was fine-tuned on the `hash-map/got_qa_pairs` dataset using QLoRA (4-bit quantization + Low-Rank Adaptation) to keep memory usage low while adapting the powerful `gemma-2-2b-it` model to answer questions about characters, events, houses, lore, battles, and plot points — **only when provided with relevant context**. |
|
|
|
|
|
It is **not** a general-purpose LLM and performs poorly on questions without appropriate context or outside the GoT domain. |
|
|
|
|
|
## 🧩 Intended Use |
|
|
|
|
|
### Direct Use |
|
|
- Answering factual questions about *Game of Thrones* when supplied with relevant book/show text chunks |
|
|
- Building simple RAG-style (Retrieval-Augmented Generation) applications for GoT fans, wikis, quizzes, chatbots, etc. |
|
|
- Educational tools, reading comprehension demos, or lore-exploration apps |
|
|
|
|
|
### Out-of-Scope Use |
|
|
- General-purpose chat or open-domain QA |
|
|
- Questions about real history, other fictional universes, current events, politics, etc. |
|
|
- High-stakes applications (legal, medical, safety-critical decisions) |
|
|
- Generating creative fan-fiction or long-form narrative text (it is optimized for short factual answers) |
|
|
|
|
|
## 📥 Context Retrieval Strategy (included in repo) |
|
|
|
|
|
A simple **keyword-based lexical retrieval** system is provided to help select relevant context chunks: |
|
|
|
|
|
```python
import re
import json
from collections import defaultdict, Counter

CHUNKS_FILE = "/kaggle/input/got-dataset/contexts.json"  # list of {text, source, chunk_id}

def tokenize(text):
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

contexts = []
token_to_ctx = defaultdict(list)

with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
    data = json.load(f)

# Build an inverted index: token -> ids of chunks containing it
# (repeated tokens append repeated ids, so scoring reflects term frequency)
for idx, item in enumerate(data):
    text = item["text"]
    contexts.append(item)
    for tok in tokenize(text):
        token_to_ctx[tok].append(idx)

print(f"Indexed {len(contexts)} chunks")

def retrieve_2_contexts(question, token_to_ctx, contexts):
    # Score each chunk by how many question tokens it contains
    q_tokens = tokenize(question)
    scores = Counter()
    for tok in q_tokens:
        for ctx_id in token_to_ctx.get(tok, []):
            scores[ctx_id] += 1
    if not scores:
        return ""
    top_ids = [cid for cid, _ in scores.most_common(2)]
    return " ".join(contexts[cid]["text"] for cid in top_ids)
```
|
|
|
|
|
This is a basic sparse retrieval method (similar to TF-IDF without IDF). |
|
|
For better retrieval quality, you can build a dense (e.g. FAISS-based) index over these same contexts.
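As a lightweight improvement before moving to a dense index, the missing IDF weighting can be added with the standard library alone, so rare tokens (character names, place names) count for more than common words. The helper names below (`build_idf`, `retrieve_2_contexts_idf`) are illustrative and not part of this repo:

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

def build_idf(contexts):
    """Inverse document frequency over the chunk collection."""
    n = len(contexts)
    df = Counter()
    for item in contexts:
        for tok in set(tokenize(item["text"])):
            df[tok] += 1
    return {tok: math.log(n / count) for tok, count in df.items()}

def retrieve_2_contexts_idf(question, token_to_ctx, contexts, idf):
    """Like retrieve_2_contexts, but weights each match by its IDF."""
    scores = Counter()
    for tok in tokenize(question):
        for ctx_id in token_to_ctx.get(tok, []):
            scores[ctx_id] += idf.get(tok, 0.0)
    if not scores:
        return ""
    top_ids = [cid for cid, _ in scores.most_common(2)]
    return " ".join(contexts[cid]["text"] for cid in top_ids)
```

Tokens that appear in every chunk get an IDF of zero and stop dominating the score.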
|
|
|
|
|
## 🧑‍💻 How to Use
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Replace with your actual repo
model_name = "hash-map/got_model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_name)

def answer_question(context: str, question: str, max_new_tokens=96) -> str:
    prompt = f"""Context:
{context}

Question:
{question}

Answer:"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for short factual answers
            eos_token_id=tokenizer.eos_token_id,
        )
    answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text after the final "Answer:" marker
    return answer.split("Answer:")[-1].strip()

# Example (uses the retriever defined above)
question = "Who killed Joffrey Baratheon?"
context = retrieve_2_contexts(question, token_to_ctx, contexts)
print(answer_question(context, question))
```
|
|
|
|
|
## ⚠️ Bias, Risks & Limitations |
|
|
|
|
|
- **Domain limitation:** Extremely poor performance on non-GoT topics |
|
|
- **Retrieval dependency:** Answers are only as good as the retrieved context — lexical method can miss semantically similar but lexically different passages |
|
|
- **Hallucinations:** Can still invent facts when context is ambiguous, incomplete or contradictory |
|
|
- **Toxicity & bias:** Inherits biases present in the base Gemma model + any biases in the GoT dataset (e.g. gender roles, violence portrayal typical of the series) |
|
|
- **No safety tuning:** No built-in refusal or content filtering |
|
|
- **Hugging Face token required:** `google/gemma-2-2b-it` is a gated model, so you need a Hugging Face access token with approved access to the Gemma repository
|
|
|
|
|
**Recommendations:** |
|
|
- The provided contexts work as-is; you can swap in a different retriever, but keep the total context length under roughly 200 tokens
|
|
- Manually verify outputs for important use cases |
|
|
- Consider adding a guardrail/moderation step in applications |
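The ~200-token budget above can be enforced with a small helper before building the prompt. This sketch approximates tokens with whitespace-separated words; for an exact count, tokenize the context with the model's tokenizer and measure `input_ids`. `truncate_context` is an illustrative helper, not part of the repo:

```python
def truncate_context(context: str, max_tokens: int = 200) -> str:
    """Rough token budget via whitespace words. For an exact count, use
    len(tokenizer(context)["input_ids"]) with the model's tokenizer."""
    words = context.split()
    if len(words) <= max_tokens:
        return context
    return " ".join(words[:max_tokens])
```

Call it on the retrieved context before passing it to `answer_question`.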
|
|
|
|
|
## 📚 Citation |
|
|
|
|
|
```bibtex
@misc{got-qa-gemma2-2026,
  author       = {Appala Sai Sumanth},
  title        = {Gemma-2-2b-it Fine-tuned for Game of Thrones Question Answering},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/hash-map/got_model}}
}
```
|
|
|
|
|
## Framework versions |
|
|
|
|
|
- `transformers` >= 4.42 |
|
|
- `peft` 0.13.2 |
|
|
- `torch` >= 2.1 |
|
|
- `bitsandbytes` >= 0.43 (for 4-bit inference if desired) |
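To mirror the QLoRA training setup at inference time, the base model can be loaded in 4-bit via `bitsandbytes` before attaching the adapter. The exact quantization settings used during fine-tuning are not stated in this card; NF4 with double quantization and bfloat16 compute is a common QLoRA default and is shown here as an assumption:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Assumed QLoRA-style settings (NF4 + bf16 compute); adjust if your
# training used a different quantization config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("hash-map/got_model")
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "hash-map/got_model")
```

This trades a small amount of answer quality for a much lower VRAM footprint than the bfloat16 loading shown earlier.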
|
|
|
|
|
--- |