File size: 5,658 Bytes
44eb5d7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
---
license: apache-2.0
datasets:
- TinyModels/jjk-wiki-corpus
language:
- en
pipeline_tag: text-generation
tags:
- RAG
- Qwen2.5
- Jujutsu-Kaisen
- Anime
- Knowledge-Bot
- Retrieval-Augmented-Generation
---

<div align="center">

# ๐ŸŸฃ JujutsuKaiserver

### *The Cursed Intelligence. The Canon Oracle.*

[![Model](https://img.shields.io/badge/Base-Qwen2.5--1.5B--Instruct-blueviolet?style=for-the-badge)](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
[![Quantization](https://img.shields.io/badge/Quantization-4--bit-purple?style=for-the-badge)](https://huggingface.co/TinyModels/JujutsuKaiserver)
[![RAG](https://img.shields.io/badge/RAG-FAISS%20Powered-darkviolet?style=for-the-badge)](https://huggingface.co/TinyModels/JujutsuKaiserver)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue?style=for-the-badge)](LICENSE)
[![Dataset](https://img.shields.io/badge/Dataset-jjk--wiki--corpus-orange?style=for-the-badge)](https://huggingface.co/datasets/TinyModels/jjk-wiki-corpus)

<br/>

> *"Throughout Heaven and Earth, I alone am the honored one."*  
> โ€” **Satoru Gojo** | and also this model, kind of.

<br/>

**JujutsuKaiserver** is a Retrieval-Augmented Generation (RAG) model built for one purpose:  
to answer anything and everything about the **Jujutsu Kaisen** universe โ€” with canon-backed accuracy, zero hallucination tolerance, and the confidence of Unlimited Void.

</div>

---

## โšก What It Does

Ask it anything. Techniques. Domains. Arcs. Hidden lore. Character relationships. Cursed Energy mechanics. It retrieves the most relevant passages from a **200+ page wiki corpus**, feeds them into a fine-tuned **Qwen2.5-1.5B-Instruct** backbone, and gives you a clean, grounded answer โ€” not a guess.

| Ask This | Get This |
|----------|----------|
| *"What is Sukuna's Shrine?"* | Full technique breakdown with canon context |
| *"How does Mahito's Idle Transfiguration work?"* | Soul-level mechanics explained accurately |
| *"What happened in the Shibuya Incident?"* | Arc summary backed by wiki chunks |
| *"Who is the strongest Grade 1 sorcerer?"* | Ranked answer with sourced reasoning |

---

## ๐Ÿง  Architecture

```
User Query
    โ”‚
    โ–ผ
sentence-transformers (all-MiniLM-L6-v2)
    โ”‚  [embed query]
    โ–ผ
FAISS Index (jjk_index.faiss)
    โ”‚  [top-5 relevant wiki chunks]
    โ–ผ
Qwen2.5-1.5B-Instruct (4-bit)
    โ”‚  [context + question โ†’ chat template]
    โ–ผ
Canon-grounded Answer
```

### Model Composition

| Component | Details |
|-----------|---------|
| ๐Ÿค– **Base LLM** | `Qwen/Qwen2.5-1.5B-Instruct` (4-bit quantized) |
| ๐Ÿ”ข **Embeddings** | `sentence-transformers/all-MiniLM-L6-v2` |
| ๐Ÿ“ฆ **Vector Store** | FAISS โ€” `jjk_index.faiss` |
| ๐Ÿ“– **Knowledge Base** | 120+ cleaned JJK Fandom Wiki articles (`chunks.txt`) |
| ๐Ÿ”ง **Pipeline** | Custom `JujutsuKaiserver` class with Qwen chat template |

---

## ๐Ÿš€ Quick Start

```python
from huggingface_hub import snapshot_download

model_dir = snapshot_download("TinyModels/JujutsuKaiserver")

import sys
sys.path.insert(0, model_dir)
from pipeline import JujutsuKaiserver

bot = JujutsuKaiserver(model_dir=model_dir)

# Ask anything
print(bot.ask("What is Gojo's Domain Expansion called?"))
# โ†’ "Infinite Void (็„ก้‡็ฉบๅ‡ฆ). It..."
```

> โš ๏ธ **Requirements**: `bitsandbytes`, GPU with **โ‰ฅ6 GB VRAM**. CPU inference works but is slow.

### Install Dependencies

```bash
pip install transformers bitsandbytes faiss-cpu sentence-transformers huggingface_hub
```

---

## ๐Ÿ–ฅ๏ธ Gradio Demo (Optional)

Spin up a local chat UI in seconds:

```python
import gradio as gr
from pipeline import JujutsuKaiserver

bot = JujutsuKaiserver(model_dir="<path_to_downloaded_model>")

def chat(message, history):
    return bot.ask(message)

gr.ChatInterface(
    fn=chat,
    title="๐ŸŸฃ JujutsuKaiserver",
    description="Ask anything about the JJK universe."
).launch()
```

---

## โœจ Features

- ๐Ÿ” **Factual Q&A** โ€” Every answer is grounded in retrieved wiki content, not imagination
- ๐Ÿšซ **Hallucination Guard** โ€” Model is prompted to say *"I don't know"* when context is insufficient
- ๐Ÿ“š **Deep Coverage** โ€” 200+ wiki pages: characters, techniques, domains, arcs, lore
- โšก **T4-Friendly** โ€” 4-bit quantization means it runs on free Colab tiers
- ๐Ÿค– **Gradio Ready** โ€” One-script local demo included out of the box

---

## โš ๏ธ Known Limitations

- **Recent chapters** beyond the scraping date may not be indexed yet
- **Ambiguous context** can still occasionally produce imperfect answers โ€” being addressed via a feedback loop
- **Roleplay mode** is possible with a custom system prompt, but this version is optimized for factual retrieval

---

## ๐Ÿ”ฎ Roadmap

- [ ] **Live Feedback Flagging** โ€” ๐Ÿ‘/๐Ÿ‘Ž votes from the Gradio Space feed a correction dataset automatically
- [ ] **Self-Correcting Pipeline** โ€” Weekly DPO fine-tuning on flagged examples + FAISS index refresh
- [ ] **Expanded KB** โ€” Episode transcripts, manga panels text, community lore
- [ ] **Streaming Support** โ€” Token-by-token output for snappier UX

---

## ๐Ÿ“‚ Repo Structure

```
JujutsuKaiserver/
โ”œโ”€โ”€ pipeline.py          # Core RAG pipeline class
โ”œโ”€โ”€ jjk_index.faiss      # FAISS vector index
โ”œโ”€โ”€ chunks.txt           # Raw wiki knowledge base
โ”œโ”€โ”€ generation_config.json
โ””โ”€โ”€ README.md
```

---

<div align="center">

**Built with ๐Ÿฉธ and cursed energy for the JJK community.**

*Got a question the bot fumbled? Open a [Discussion](https://huggingface.co/TinyModels/JujutsuKaiserver/discussions) and help us fix it.*

`TinyModels` โ€ข `QuantaSparkLabs` โ€ข Apache 2.0

</div>