---
license: mit
datasets:
- hash-map/got_qa_pairs
language:
- en
base_model:
- google/gemma-2-2b-it
pipeline_tag: question-answering
library_name: peft
tags:
- got
- q&a
- RAG
- transformers
- peft
- bitsandbytes
---

# Game of Thrones Q&A Model (PEFT / QLoRA fine-tuned)

## 🧠 Model Overview

**Model name:** hash-map/got_model  
**Base model:** `google/gemma-2-2b-it`  
**Fine-tuning method:** QLoRA (via PEFT)  
**Task:** Contextual Question Answering on *Game of Thrones*  
**Summary:** A lightweight instruction-tuned question-answering model specialized in the *Game of Thrones* / *A Song of Ice and Fire* universe. It generates concise, faithful answers when given relevant context + a question.

**Description:**  
This model was fine-tuned on the `hash-map/got_qa_pairs` dataset using QLoRA (4-bit quantization + Low-Rank Adaptation) to keep memory usage low while adapting the powerful `gemma-2-2b-it` model to answer questions about characters, events, houses, lore, battles, and plot points — **only when provided with relevant context**.

It is **not** a general-purpose LLM and performs poorly on questions without appropriate context or outside the GoT domain.

## 🧩 Intended Use

### Direct Use
- Answering factual questions about *Game of Thrones* when supplied with relevant book/show text chunks
- Building simple RAG-style (Retrieval-Augmented Generation) applications for GoT fans, wikis, quizzes, chatbots, etc.
- Educational tools, reading comprehension demos, or lore-exploration apps

### Out-of-Scope Use
- General-purpose chat or open-domain QA
- Questions about real history, other fictional universes, current events, politics, etc.
- High-stakes applications (legal, medical, safety-critical decisions)
- Generating creative fan-fiction or long-form narrative text (it is optimized for short factual answers)

## 📥 Context Retrieval Strategy (included in repo)

A simple **keyword-based lexical retrieval** system is provided to help select relevant context chunks:

```python
import re
import json
from collections import defaultdict, Counter

CHUNKS_FILE = "/kaggle/input/got-dataset/contexts.json"   # list of {text, source, chunk_id}

def tokenize(text):
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

contexts = []
token_to_ctx = defaultdict(list)

with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
    data = json.load(f)

for idx, item in enumerate(data):
    text = item["text"]
    contexts.append(item)

    for tok in tokenize(text):
        token_to_ctx[tok].append(idx)

print(f"Indexed {len(contexts)} chunks")

def retrieve_2_contexts(question, token_to_ctx, contexts):
    q_tokens = tokenize(question)
    scores = Counter()
    for tok in q_tokens:
        for ctx_id in token_to_ctx.get(tok, []):
            scores[ctx_id] += 1
    if not scores:
        return ""
    top_ids = [cid for cid, _ in scores.most_common(2)]
    return " ".join([contexts[cid]["text"] for cid in top_ids])
```

This is a basic sparse retrieval method: pure term-frequency matching, essentially TF-IDF without the IDF term. For better retrieval you can build a dense index (e.g. with FAISS and a sentence-embedding model) over the same context chunks.
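Before moving to a dense FAISS index, the term-frequency scoring above can be extended with an IDF weight so that common words contribute less than rare ones. A minimal self-contained sketch (the toy chunk texts below are invented for illustration; in practice you would build the index from `contexts.json` as shown earlier):

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"\b[a-zA-Z]{3,}\b", text.lower())

# Toy corpus standing in for contexts.json (invented for illustration)
contexts = [
    {"text": "Joffrey Baratheon was poisoned at his own wedding feast."},
    {"text": "The wedding of Joffrey and Margaery was held in King's Landing."},
    {"text": "Jon Snow was named Lord Commander of the Night's Watch."},
]

# Inverted index; set() ensures each chunk is counted once per token
token_to_ctx = defaultdict(list)
for idx, item in enumerate(contexts):
    for tok in set(tokenize(item["text"])):
        token_to_ctx[tok].append(idx)

N = len(contexts)

def idf(tok):
    # Smoothed inverse document frequency: rare tokens score higher
    return math.log(1 + N / (1 + len(token_to_ctx.get(tok, []))))

def retrieve_2_contexts_idf(question):
    scores = Counter()
    for tok in tokenize(question):
        for ctx_id in token_to_ctx.get(tok, []):
            scores[ctx_id] += idf(tok)
    top_ids = [cid for cid, _ in scores.most_common(2)]
    return " ".join(contexts[cid]["text"] for cid in top_ids)

print(retrieve_2_contexts_idf("Who poisoned Joffrey at the wedding?"))
```

With IDF weighting, a chunk matching the rare token "poisoned" outranks one that only matches frequent tokens like "the" or "wedding".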

## 🧑‍💻 How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# LoRA adapter repo for this model card
model_name = "hash-map/got_model"

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, model_name)

def answer_question(context: str, question: str, max_new_tokens=96) -> str:
    prompt = f"""Context:
{context}

Question:
{question}

Answer:"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding; temperature has no effect when sampling is off
            eos_token_id=tokenizer.eos_token_id
        )
    answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Extract only the answer part after "Answer:"
    return answer.split("Answer:")[-1].strip()

# Example
context = retrieve_2_contexts("Who killed Joffrey Baratheon?", token_to_ctx, contexts)
print(answer_question(context, "Who killed Joffrey Baratheon?"))
```

## ⚠️ Bias, Risks & Limitations

- **Domain limitation:** Extremely poor performance on non-GoT topics
- **Retrieval dependency:** Answers are only as good as the retrieved context — lexical method can miss semantically similar but lexically different passages
- **Hallucinations:** Can still invent facts when context is ambiguous, incomplete or contradictory
- **Toxicity & bias:** Inherits biases present in the base Gemma model + any biases in the GoT dataset (e.g. gender roles, violence portrayal typical of the series)
- **No safety tuning:** No built-in refusal or content filtering
- **Hugging Face token required:** the base `google/gemma-2-2b-it` repo is gated, so you need a Hugging Face access token with Gemma access to download the weights

**Recommendations:**
- The bundled retriever works, but you can try other retrievers; just make sure the total context length stays under ~200 tokens
- Manually verify outputs for important use cases
- Consider adding a guardrail/moderation step in applications
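The sub-200-token recommendation can be enforced programmatically before prompting the model. A minimal sketch, using whitespace word counts as a rough stand-in for real tokenizer tokens (subword counts are usually higher, so pick a conservative budget, or count with `tokenizer(...)` directly if the tokenizer is loaded):

```python
def trim_context(context: str, max_tokens: int = 200) -> str:
    """Rough token-budget guard: whitespace-separated words approximate
    tokenizer tokens (real subword counts are usually higher)."""
    words = context.split()
    if len(words) <= max_tokens:
        return context
    return " ".join(words[:max_tokens])

print(len(trim_context("word " * 500).split()))  # → 200
```

Call `trim_context` on the output of the retriever before building the prompt, e.g. `answer_question(trim_context(context), question)`.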

## 📚 Citation

```bibtex
@misc{got-qa-gemma2-2026,
  author       = {Appala Sai Sumanth},
  title        = {Gemma-2-2b-it Fine-tuned for Game of Thrones Question Answering},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/hash-map/got_model}}
}
```

## Framework versions

- `transformers` >= 4.42
- `peft` 0.13.2
- `torch` >= 2.1
- `bitsandbytes` >= 0.43 (for 4-bit inference if desired)
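Since `bitsandbytes` is listed for optional 4-bit inference, here is a hedged configuration sketch for loading the base model in 4-bit with the NF4 settings typical of QLoRA. It requires a CUDA GPU, `bitsandbytes` installed, and an access token for the gated Gemma weights, so treat it as a configuration fragment rather than something to run as-is:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NormalFloat4, as used by QLoRA
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 for stability
    bnb_4bit_use_double_quant=True,          # quantize the quantization constants
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "hash-map/got_model")
```

Loaded this way, the 2B base model fits comfortably in well under 8 GB of GPU memory, at the cost of slightly slower generation than bf16.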

---