| --- |
| license: apache-2.0 |
| datasets: |
| - TinyModels/jjk-wiki-corpus |
| language: |
| - en |
| pipeline_tag: text-generation |
| tags: |
| - RAG |
| - Qwen2.5 |
| - Jujutsu-Kaisen |
| - Anime |
| - Knowledge-Bot |
| - Retrieval-Augmented-Generation |
| --- |
| |
| <div align="center"> |
|
|
| # ๐ฃ JujutsuKaiserver |
|
|
| ### *The Cursed Intelligence. The Canon Oracle.* |
|
|
| [](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) |
| [](https://huggingface.co/TinyModels/JujutsuKaiserver) |
| [](https://huggingface.co/TinyModels/JujutsuKaiserver) |
| [](LICENSE) |
| [](https://huggingface.co/datasets/TinyModels/jjk-wiki-corpus) |
|
|
| <br/> |
|
|
| > *"Throughout Heaven and Earth, I alone am the honored one."* |
| > โ **Satoru Gojo** | and also this model, kind of. |
|
|
| <br/> |
|
|
| **JujutsuKaiserver** is a Retrieval-Augmented Generation (RAG) model built for one purpose: |
| to answer anything and everything about the **Jujutsu Kaisen** universe โ with canon-backed accuracy, zero hallucination tolerance, and the confidence of Unlimited Void. |
|
|
| </div> |
|
|
| --- |
|
|
| ## โก What It Does |
|
|
| Ask it anything. Techniques. Domains. Arcs. Hidden lore. Character relationships. Cursed Energy mechanics. It retrieves the most relevant passages from a **200+ page wiki corpus**, feeds them into a fine-tuned **Qwen2.5-1.5B-Instruct** backbone, and gives you a clean, grounded answer โ not a guess. |
|
|
| | Ask This | Get This | |
| |----------|----------| |
| | *"What is Sukuna's Shrine?"* | Full technique breakdown with canon context | |
| | *"How does Mahito's Idle Transfiguration work?"* | Soul-level mechanics explained accurately | |
| | *"What happened in the Shibuya Incident?"* | Arc summary backed by wiki chunks | |
| | *"Who is the strongest Grade 1 sorcerer?"* | Ranked answer with sourced reasoning | |
|
|
| --- |
|
|
| ## ๐ง Architecture |
|
|
| ``` |
| User Query |
| โ |
| โผ |
| sentence-transformers (all-MiniLM-L6-v2) |
| โ [embed query] |
| โผ |
| FAISS Index (jjk_index.faiss) |
| โ [top-5 relevant wiki chunks] |
| โผ |
| Qwen2.5-1.5B-Instruct (4-bit) |
| โ [context + question โ chat template] |
| โผ |
| Canon-grounded Answer |
| ``` |
|
|
| ### Model Composition |
|
|
| | Component | Details | |
| |-----------|---------| |
| | ๐ค **Base LLM** | `Qwen/Qwen2.5-1.5B-Instruct` (4-bit quantized) | |
| | ๐ข **Embeddings** | `sentence-transformers/all-MiniLM-L6-v2` | |
| | ๐ฆ **Vector Store** | FAISS โ `jjk_index.faiss` | |
| | ๐ **Knowledge Base** | 120+ cleaned JJK Fandom Wiki articles (`chunks.txt`) | |
| | ๐ง **Pipeline** | Custom `JujutsuKaiserver` class with Qwen chat template | |
|
|
| --- |
|
|
| ## ๐ Quick Start |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
| |
| model_dir = snapshot_download("TinyModels/JujutsuKaiserver") |
| |
| import sys |
| sys.path.insert(0, model_dir) |
| from pipeline import JujutsuKaiserver |
| |
| bot = JujutsuKaiserver(model_dir=model_dir) |
| |
| # Ask anything |
| print(bot.ask("What is Gojo's Domain Expansion called?")) |
| # โ "Infinite Void (็ก้็ฉบๅฆ). It..." |
| ``` |
|
|
| > โ ๏ธ **Requirements**: `bitsandbytes`, GPU with **โฅ6 GB VRAM**. CPU inference works but is slow. |
|
|
| ### Install Dependencies |
|
|
| ```bash |
| pip install transformers bitsandbytes faiss-cpu sentence-transformers huggingface_hub |
| ``` |
|
|
| --- |
|
|
| ## ๐ฅ๏ธ Gradio Demo (Optional) |
|
|
| Spin up a local chat UI in seconds: |
|
|
| ```python |
| import gradio as gr |
| from pipeline import JujutsuKaiserver |
| |
| bot = JujutsuKaiserver(model_dir="<path_to_downloaded_model>") |
| |
| def chat(message, history): |
| return bot.ask(message) |
| |
| gr.ChatInterface( |
| fn=chat, |
| title="๐ฃ JujutsuKaiserver", |
| description="Ask anything about the JJK universe." |
| ).launch() |
| ``` |
|
|
| --- |
|
|
| ## โจ Features |
|
|
| - ๐ **Factual Q&A** โ Every answer is grounded in retrieved wiki content, not imagination |
| - ๐ซ **Hallucination Guard** โ Model is prompted to say *"I don't know"* when context is insufficient |
| - ๐ **Deep Coverage** โ 200+ wiki pages: characters, techniques, domains, arcs, lore |
| - โก **T4-Friendly** โ 4-bit quantization means it runs on free Colab tiers |
| - ๐ค **Gradio Ready** โ One-script local demo included out of the box |
|
|
| --- |
|
|
| ## โ ๏ธ Known Limitations |
|
|
| - **Recent chapters** beyond the scraping date may not be indexed yet |
| - **Ambiguous context** can still occasionally produce imperfect answers โ being addressed via a feedback loop |
| - **Roleplay mode** is possible with a custom system prompt, but this version is optimized for factual retrieval |
|
|
| --- |
|
|
| ## ๐ฎ Roadmap |
|
|
| - [ ] **Live Feedback Flagging** โ ๐/๐ votes from the Gradio Space feed a correction dataset automatically |
| - [ ] **Self-Correcting Pipeline** โ Weekly DPO fine-tuning on flagged examples + FAISS index refresh |
| - [ ] **Expanded KB** โ Episode transcripts, manga panels text, community lore |
| - [ ] **Streaming Support** โ Token-by-token output for snappier UX |
|
|
| --- |
|
|
| ## ๐ Repo Structure |
|
|
| ``` |
| JujutsuKaiserver/ |
| โโโ pipeline.py # Core RAG pipeline class |
| โโโ jjk_index.faiss # FAISS vector index |
| โโโ chunks.txt # Raw wiki knowledge base |
| โโโ generation_config.json |
| โโโ README.md |
| ``` |
|
|
| --- |
|
|
| <div align="center"> |
|
|
| **Built with ๐ฉธ and cursed energy for the JJK community.** |
|
|
| *Got a question the bot fumbled? Open a [Discussion](https://huggingface.co/TinyModels/JujutsuKaiserver/discussions) and help us fix it.* |
|
|
| `TinyModels` โข `QuantaSparkLabs` โข Apache 2.0 |
|
|
| </div> |