MukulRay committed · Commit ef5f450 · 0 Parent(s)

feat: Irminsul — Llama 3.1 8B QLoRA RAG serving stack with Pinecone, FastAPI, Azure

Files changed (15)
  1. .gitignore +7 -0
  2. Dockerfile +24 -0
  3. Readme.MD +171 -0
  4. docs/character_builds.md +390 -0
  5. docs/characters_lore.md +204 -0
  6. docs/demo.html +321 -0
  7. docs/elemental_mechanics.md +207 -0
  8. docs/world_lore.md +137 -0
  9. embedder.py +20 -0
  10. index.html +672 -0
  11. ingest.py +110 -0
  12. main.py +83 -0
  13. pyvenv.cfg +5 -0
  14. rag.py +116 -0
  15. requirements.txt +20 -0
.gitignore ADDED
@@ -0,0 +1,7 @@
+ .env
+ venv/
+ models/
+ __pycache__/
+ *.pyc
+ *.pyo
+ .DS_Store
Dockerfile ADDED
@@ -0,0 +1,24 @@
+ FROM python:3.12-slim
+
+ # System deps for bitsandbytes + torch
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     git \
+     curl \
+     build-essential \
+     && rm -rf /var/lib/apt/lists/*
+
+ WORKDIR /app
+
+ # Install Python deps first (layer cache)
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy app code
+ COPY main.py rag.py embedder.py ingest.py index.html ./
+
+ # Model is NOT baked in — mount via Azure Blob or provide MODEL_PATH env var
+ # See README for options
+
+ EXPOSE 8000
+
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Readme.MD ADDED
@@ -0,0 +1,171 @@
+ # llmops-serve
+
+ > Fine-tuned Llama 3.1 8B · QLoRA · Pinecone RAG · FastAPI · Azure Container Apps
+
+ A full end-to-end LLMOps serving stack — from a QLoRA fine-tuned model running in 4-bit NF4 on consumer hardware, through a retrieval-augmented generation pipeline, to a containerized API deployed on Azure. Built to be production-shaped, not just a demo.
+
+ ---
+
+ ## What this is
+
+ Most LLM projects stop at inference. This one goes further:
+
+ - **Fine-tuned model** — Llama 3.1 8B fine-tuned with QLoRA (rank 16, lr 2e-4) on a custom dataset, merged and served locally in 4-bit NF4 quantization on an RTX 3060 6GB
+ - **RAG pipeline** — Documents ingested, chunked, embedded with `sentence-transformers/all-MiniLM-L6-v2` (fully local, zero API cost), and stored in Pinecone. Retrieval is semantic, top-k configurable at query time
+ - **Serving layer** — FastAPI with async lifespan model loading, typed Pydantic request/response models, CORS, health check, and a clean browser UI served from the same process
+ - **Containerized** — Dockerfile built for slim Python 3.12, model loaded at runtime via env-configurable path (not baked in)
+ - **Cloud-ready** — One-shot Azure deployment via ACR + Container Apps, with Pinecone key injected as a secret
+
+ ---
+
+ ## Architecture
+
+ ```
+ User query
+     │
+     ▼
+ FastAPI /generate
+     │
+     ├── Embed query (sentence-transformers, local)
+     │       │
+     │       ▼
+     │   Pinecone — semantic search → top-k chunks
+     │       │
+     ▼       ▼
+ LangChain RetrievalQA
+     │
+     ▼
+ Llama 3.1 8B (QLoRA fine-tuned, 4-bit NF4)
+     │
+     ▼
+ Grounded answer + source attribution
+ ```
+
+ ---
+
+ ## Stack
+
+ | Layer | Technology |
+ |---|---|
+ | Base model | Llama 3.1 8B Instruct |
+ | Fine-tuning | QLoRA via PEFT (r=16, α=32, lr=2e-4) |
+ | Quantization | BitsAndBytes 4-bit NF4, bfloat16 compute |
+ | Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
+ | Vector DB | Pinecone (serverless, cosine similarity) |
+ | RAG chain | LangChain RetrievalQA |
+ | Serving | FastAPI + Uvicorn |
+ | Containerization | Docker (python:3.12-slim) |
+ | Cloud | Azure Container Apps + ACR |
+
+ ---
+
+ ## Quickstart
+
+ ```bash
+ # 1. Clone and set up environment
+ git clone https://github.com/MukulRay1603/Irminsul.git
+ cd Irminsul
+ python -m venv venv && source venv/bin/activate # Windows: venv\Scripts\activate
+ pip install -r requirements.txt
+
+ # 2. Configure environment
+ cp .env.example .env
+ # Fill in PINECONE_API_KEY in .env
+
+ # 3. Add your fine-tuned model
+ # Place merged model at: ./models/merged/exp2_lr2e-4_r16
+ # Or update MODEL_PATH in .env to point to your model
+
+ # 4. Ingest documents
+ python ingest.py --dir ./docs --chunk-size 300 --chunk-overlap 40
+
+ # 5. Start the server
+ uvicorn main:app --reload --port 8000
+
+ # UI available at http://localhost:8000
+ # API docs at http://localhost:8000/docs
+ ```
+
+ ---
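Step 4 of the Quickstart passes `--chunk-size 300 --chunk-overlap 40` to `ingest.py`, whose source does not appear in this part of the diff. The following is a minimal sketch of a sliding-window chunker under the assumption that both values are counted in whitespace-separated words (the README's "What's next" list describes the current chunker as a naive word chunker). The name `chunk_words` is hypothetical and may not match anything in `ingest.py`.

```python
def chunk_words(text: str, chunk_size: int = 300, chunk_overlap: int = 40) -> list[str]:
    """Split text into word windows of chunk_size that share chunk_overlap words.

    Assumption: sizes are measured in whitespace-separated words, not tokens
    or characters. This is an illustration, not the repository's code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text, so the tail
        # is not emitted again as a tiny duplicate chunk.
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the Quickstart values, each window advances by 260 words, so consecutive chunks share their last and first 40 words. A larger overlap reduces the chance that a fact is split across a chunk boundary, at the cost of more vectors to store.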
+
+ ## API
+
+ | Method | Endpoint | Description |
+ |---|---|---|
+ | `GET` | `/` | Browser UI |
+ | `GET` | `/health` | Model load status |
+ | `POST` | `/generate` | RAG query → grounded answer |
+ | `POST` | `/ingest` | Ingest docs from local directory |
+
+ **Example:**
+
+ ```bash
+ curl -X POST http://localhost:8000/generate \
+   -H "Content-Type: application/json" \
+   -d '{"query": "What weapons should Hu Tao use on a budget?", "top_k": 3}'
+ ```
+
+ ```json
+ {
+   "answer": "For Hu Tao on a budget, Dragon's Bane is the strongest F2P option — it scales with Elemental Mastery and deals significant bonus damage on vaporized hits. White Tassel is the best 3-star alternative for pure Normal Attack scaling.",
+   "sources": ["docs/character_builds.md"],
+   "latency_ms": 4821.3
+ }
+ ```
+
+ ---
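The `/generate` example above implies a request with `query` and `top_k` and a response with `answer`, `sources`, and `latency_ms`. The README says `main.py` uses Pydantic models for these, but their definitions are not visible here, so the sketch below mirrors the shapes with standard-library dataclasses instead. The class names, the `top_k` default of 3, and the validation rules are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class GenerateRequest:
    """Hypothetical mirror of the /generate request body."""
    query: str
    top_k: int = 3  # assumed default; the README only shows top_k=3 in an example


@dataclass
class GenerateResponse:
    """Hypothetical mirror of the /generate response body."""
    answer: str
    sources: list[str]
    latency_ms: float


def build_payload(req: GenerateRequest) -> bytes:
    """Validate a request and serialize it as a UTF-8 JSON body."""
    if not req.query.strip():
        raise ValueError("query must not be empty")
    if req.top_k < 1:
        raise ValueError("top_k must be at least 1")
    return json.dumps(asdict(req)).encode("utf-8")


def parse_response(body: str) -> GenerateResponse:
    """Parse a JSON response body into a typed object, failing on missing keys."""
    data = json.loads(body)
    return GenerateResponse(
        answer=str(data["answer"]),
        sources=[str(s) for s in data["sources"]],
        latency_ms=float(data["latency_ms"]),
    )
```

The bytes returned by `build_payload` can be sent with any HTTP client using the `Content-Type: application/json` header shown in the curl example.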
+
+ ## Memory profile (RTX 3060 6GB)
+
+ | Component | VRAM |
+ |---|---|
+ | Llama 3.1 8B @ 4-bit NF4 | ~4.5 GB |
+ | all-MiniLM-L6-v2 embedder | ~90 MB |
+ | Inference headroom | ~1.2 GB |
+
+ Running the embedder on CPU frees ~90MB if needed — set `device_map="cpu"` in `rag.py`.
+
+ ---
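The figures in the memory table can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a round 8.0e9 parameters and decimal gigabytes (1 GB = 1e9 bytes). It counts only raw weight storage, so the gap between the 4.0 GB it computes and the ~4.5 GB in the table would have to come from items it ignores, such as quantization constants and any layers kept in higher precision. That attribution is an assumption, not a measurement from this repository.

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 8.0e9  # assumed round figure for an "8B" model

nf4_weights = weight_gb(N_PARAMS, 4)    # 4-bit NF4 weights only
bf16_weights = weight_gb(N_PARAMS, 16)  # 16-bit merged checkpoint

# Values copied from the README table, in GB.
table_total = 4.5 + 0.09 + 1.2

print(f"NF4 weights:  {nf4_weights:.1f} GB")
print(f"bf16 weights: {bf16_weights:.1f} GB")
print(f"Table total:  {table_total:.2f} GB of 6 GB")
```

The 16-bit figure matches the ~16GB merged model size that the README quotes in its Azure section, and the table rows sum to just under the 6 GB card limit.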
+
+ ## Deploy to Azure
+
+ ```bash
+ export PINECONE_API_KEY=your_key
+ chmod +x deploy_azure.sh
+ ./deploy_azure.sh
+ ```
+
+ The script provisions a resource group, builds and pushes the image via ACR Tasks (no local Docker build needed), creates a Container Apps environment, and deploys with the Pinecone key injected as a secret. It prints the live HTTPS endpoint on completion.
+
+ **Model in Azure:** The merged model (~16GB) isn't baked into the image. Recommended approach: mount from Azure Blob Storage as a volume for cheapest cold start on student credits.
+
+ ---
+
+ ## Project structure
+
+ ```
+ llmops-serve/
+ ├── main.py            # FastAPI app — endpoints, lifespan, CORS
+ ├── rag.py             # Model loading, 4-bit config, LangChain RAG chain
+ ├── embedder.py        # sentence-transformers singleton wrapper
+ ├── ingest.py          # Doc loader → chunker → Pinecone upsert
+ ├── index.html         # Browser UI (dark theme, query history, source display)
+ ├── Dockerfile
+ ├── deploy_azure.sh    # One-shot Azure Container Apps deploy
+ ├── requirements.txt
+ └── .env.example
+ ```
+
+ ---
+
+ ## What's next
+
+ - [ ] Swap naive word chunker for `MarkdownHeaderTextSplitter` for better retrieval precision
+ - [ ] Add metadata filtering to Pinecone queries (filter by character, content type)
+ - [ ] Streaming response via SSE for lower perceived latency
+ - [ ] Expand corpus — per-character deep dives with stat thresholds and rotation guides
+ - [ ] CI/CD pipeline — GitHub Actions → ACR build → Container Apps deploy on push
+
+ ---
+
+ Built while learning the full MLOps lifecycle — fine-tuning, quantization, retrieval, serving, and cloud deployment — on consumer hardware. Every component chosen deliberately, not for hype.
docs/character_builds.md ADDED
@@ -0,0 +1,390 @@
1
+ # Genshin Impact — Character Builds, Weapons & Team Compositions
2
+
3
+ ---
4
+
5
+ ## Build Guide Philosophy
6
+
7
+ Every character build has three tiers: Best-in-Slot (BiS) using 5-star weapons and ideal artifact sets, Strong F2P using 4-star weapons available from the game without spending, and Budget using easily farmable or craftable options. Team compositions are listed as core (non-negotiable role) plus flex (adaptable slot).
8
+
9
+ ---
10
+
11
+ ## Pyro Characters
12
+
13
+ ### Hu Tao — 5-star Pyro Polearm
14
+ **Role:** Main DPS, vaporize/melt enabler
15
+ **Playstyle:** Hu Tao's E (Guide to Afterlife) consumes her HP to enter a powered state that converts her normal attacks to Pyro and buffs her damage based on Max HP. Optimal play is low HP (ideally 50% or below) to maximize her A4 passive. She hits extremely hard in short burst windows.
16
+
17
+ **Best-in-Slot:**
18
+ - Weapon: Staff of Homa (5-star) — massive HP bonus, massive ATK bonus when below 50% HP, synergizes perfectly
19
+ - Artifacts: 4pc Crimson Witch of Flames — reaction bonus and Pyro DMG
20
+ - Substats priority: CRIT Rate/DMG, HP%, Elemental Mastery
21
+
22
+ **Strong F2P:**
23
+ - Weapon: Dragon's Bane (4-star, gacha) — EM scaling, massive damage on vaporized hits
24
+ - Weapon: Deathmatch (4-star, BP) — reliable CRIT Rate
25
+ - Weapon: White Tassel (3-star) — best 3-star option for pure Normal ATK scaling
26
+
27
+ **Budget:**
28
+ - Weapon: Halberd (3-star, random world drop) — ATK% and normal attack bonus
29
+ - Artifacts: 2pc CW + 2pc Wanderer's Troupe for EM, or 4pc Lavawalker if mono-Pyro
30
+
31
+ **Core Teams:**
32
+ 1. Vaporize: Hu Tao / Yelan / Xingqiu / Zhongli — Yelan + Xingqiu provide constant off-field Hydro application for Vaporize reactions. Zhongli provides shielding so Hu Tao can stay at low HP safely.
33
+ 2. Melt: Hu Tao / Rosaria / Layla / Bennett — Cryo application from Rosaria and Layla enables Melt. Bennett provides healing and ATK buff.
34
+ 3. Budget: Hu Tao / Xingqiu / Sucrose / Noelle — Sucrose provides EM buff and grouping; Noelle provides shields and healing.
35
+
36
+ **Notes:** Hu Tao is one of the highest individual DPS characters in the game. Her weakness is survivability — she requires either shield support or precise dodging. She does NOT want Bennett in most teams because his healing pushes her above the HP threshold for her passive.
37
+
38
+ ---
39
+
40
+ ### Yoimiya — 5-star Pyro Bow
41
+ **Role:** Main DPS, single-target Pyro applicator
42
+ **Playstyle:** Yoimiya's E turns her Normal Attacks into Pyro for 10 seconds. Her kit is entirely Normal Attack focused, making her the best bow Normal ATK DPS. She applies Pyro rapidly to a single target, enabling consistent Vaporize.
43
+
44
+ **Best-in-Slot:**
45
+ - Weapon: Thundering Pulse (5-star) — Normal ATK stacking bonus, ideal fit
46
+ - Weapon: Aqua Simulacra (5-star) — pure DMG% bonus works well
47
+ - Artifacts: 4pc Shimenawa's Reminiscence — Normal ATK DMG bonus, though the ER cost is real
48
+
49
+ **Strong F2P:**
50
+ - Weapon: Rust (4-star) — Normal ATK DMG% at the cost of charged attack, which Yoimiya never uses
51
+ - Weapon: Hamayumi (4-star, craftable) — craftable, Normal ATK focused
52
+
53
+ **Core Teams:**
54
+ 1. Vaporize: Yoimiya / Yelan / Fischl / Zhongli — Yelan for Hydro application, Fischl for off-field Electro (enabling Overloaded/Aggravate variety), Zhongli for shield
55
+ 2. Mono Pyro: Yoimiya / Bennett / Xiangling / Kazuha — Bennett for ATK buff, Xiangling for off-field Pyro, Kazuha for elemental DMG% bonus and grouping
56
+
57
+ **Notes:** Yoimiya's weakness is AoE — she is single-target focused. Her burst applies Pyro to enemies that hit her allies, which is a creative mechanic for spreading damage.
58
+
59
+ ---
60
+
61
+ ### Xiangling — 4-star Pyro Polearm (Free from Spiral Abyss Floor 3-3)
62
+ **Role:** Off-field DPS, Pyro applicator, one of the best 4-stars in the game
63
+ **Playstyle:** Xiangling is unique — despite being a 4-star, she is a top-tier off-field DPS through her Burst (Pyronado), which spins around the active character continuously for 10 seconds (14 seconds at C4), applying Pyro. Combined with Hydro supports, she generates constant Vaporize reactions from off-field.
64
+
65
+ **Best-in-Slot:**
66
+ - Weapon: Staff of Homa (5-star, also works off-field HP scaling)
67
+ - Weapon: The Catch (4-star, free from fishing) — ER and Burst DMG bonus, extremely cost-effective
68
+ - Artifacts: 4pc Emblem of Severed Fate — ER to Burst DMG conversion, essentially made for her
69
+
70
+ **Strong F2P:**
71
+ - Weapon: Kitain Cross Spear (4-star, craftable) — ER passive, reduces Burst cost in effect
72
+ - Weapon: Favonius Lance (4-star) — pure ER for more consistent Burst uptime
73
+
74
+ **Core Teams:**
75
+ 1. National Team: Xiangling / Bennett / Xingqiu / Sucrose (or Kazuha) — one of the most consistent teams in the game, works on nearly all content
76
+ 2. Childe-Xiangling: Childe / Xiangling / Bennett / Kazuha — Childe applies Hydro rapidly, Xiangling's Pyronado Vaporizes off Childe's application
77
+ 3. Raiden National: Xiangling / Bennett / Xingqiu / Raiden Shogun — Raiden charges energy for the whole team
78
+
79
+ **Notes:** Xiangling requires significant ER investment (typically 180%+) to maintain Burst uptime. She benefits enormously from Bennett's ATK buff because Pyronado snapshots buffs at the time it's cast.
80
+
81
+ ---
82
+
83
+ ### Bennett — 4-star Pyro Sword (One of the best supports in the game)
84
+ **Role:** Healer, ATK buffer, Pyro applicator
85
+ **Playstyle:** Bennett's Burst creates a field that continuously heals characters inside it and provides a massive ATK bonus (based on Bennett's Base ATK) to characters above 70% HP. This ATK buff is flat and applies to all damage types, making him useful in virtually every team.
86
+
87
+ **Best-in-Slot:**
88
+ - Weapon: Aquila Favonia (5-star) — highest Base ATK of any sword, maximizes his ATK buff
89
+ - Weapon: Mistsplitter Reforged (5-star) — Base ATK + Elemental DMG buffs both work here
90
+ - Artifacts: 4pc Noblesse Oblige — team ATK% bonus on Burst use, perfect support set
91
+
92
+ **Strong F2P:**
93
+ - Weapon: Festering Desire (4-star, limited event) — ER and Skill DMG
94
+ - Weapon: Favonius Sword (4-star) — ER for consistent Burst
95
+ - Weapon: Black Sword (4-star, BP) — if using Bennett as a pseudo-DPS
96
+
97
+ **Notes:** CRITICAL — Bennett's C6 converts all sword/claymore/polearm characters in his field to Pyro infusion. This is extremely powerful for some teams and completely destructive for others (it ruins physical carry teams and some specific reaction teams). Most experienced players recommend NOT using Bennett C6 unless you know exactly what you're doing.
98
+
99
+ ---
100
+
101
+ ### Dehya — 5-star Pyro Claymore
102
+ **Role:** Damage mitigation tank, off-field Pyro, sub-DPS
103
+ **Playstyle:** Dehya places a field that redirects damage from allies to herself and provides periodic Pyro application. She is one of the most unique tanks in the game — she absorbs hits instead of blocking them with a shield. Her Burst allows rapid punching in a brief window.
104
+
105
+ **Best-in-Slot:**
106
+ - Weapon: Beacon of the Reed Sea (5-star, her signature) — HP and ATK bonus when taking damage
107
+ - Artifacts: 4pc Tenacity of the Millelith — HP and shield/team ATK bonus
108
+
109
+ **Core Teams:**
110
+ 1. Dehya / Furina / Kazuha / Bennett — Furina rewards characters who take damage; Dehya's damage absorption synergizes perfectly
111
+ 2. Mono Pyro: Dehya / Bennett / Xiangling / Kazuha
112
+
113
+ ---
114
+
115
+ ## Hydro Characters
116
+
117
+ ### Yelan — 5-star Hydro Bow
118
+ **Role:** Off-field DPS, Hydro applicator, damage amplifier
119
+ **Playstyle:** Yelan's Burst summons dice that follow the active character and periodically deal Hydro damage, similar to Xingqiu but with higher individual hit damage and a unique buff — the longer her Burst is active, the more damage bonus the active character gets (ramping from 1% to 50% DMG bonus over 15 seconds).
120
+
121
+ **Best-in-Slot:**
122
+ - Weapon: Aqua Simulacra (5-star, her signature) — HP% and DMG% bonus when enemies are nearby
123
+ - Artifacts: 4pc Emblem of Severed Fate — ER to Burst DMG, enables consistent uptime
124
+
125
+ **Strong F2P:**
126
+ - Weapon: Fading Twilight (4-star, event) — cycling DMG% bonus
127
+ - Weapon: Favonius Warbow (4-star) — ER generation for team
128
+
129
+ **Core Teams:**
130
+ 1. Vaporize support: Any Pyro main DPS / Yelan / Xingqiu / flex — double Hydro application is extremely consistent
131
+ 2. Furina / Neuvillette teams: Yelan works beautifully in Fontaine-focused compositions
132
+
133
+ **Notes:** Yelan is largely considered a direct upgrade to Xingqiu for teams that can use both, but Xingqiu has the unique advantage of providing damage reduction (making him better for fragile DPS like Hu Tao).
134
+
135
+ ---
136
+
137
+ ### Xingqiu — 4-star Hydro Sword (Free from in-game events periodically)
138
+ **Role:** Off-field DPS, Hydro applicator, damage reduction provider
139
+ **Playstyle:** Xingqiu's Burst summons Rain Swords that follow the active character, dealing Hydro damage on Normal Attacks and providing damage reduction. He is one of the most versatile supports in the game and appears in more top-level teams than almost any other character.
140
+
141
+ **Best-in-Slot:**
142
+ - Weapon: Haran Geppaku Futsu (5-star) or Primordial Jade Cutter (5-star)
143
+ - Weapon: Sacrificial Sword (4-star) — most F2P BiS, allows double E cast for energy generation
144
+ - Artifacts: 4pc Emblem of Severed Fate
145
+
146
+ **Core Teams:** National Team, any Pyro vaporize team, Hu Tao team, Kamisato Ayato teams
147
+
148
+ ---
149
+
150
+ ### Neuvillette — 5-star Hydro Catalyst
151
+ **Role:** Main DPS, Hydro applicator, self-sufficient damage dealer
152
+ **Playstyle:** Neuvillette is unique — his power comes from a Charged Attack that he charges by collecting Water Droplets generated through reactions and his own skills. His fully charged attack deals massive AoE Hydro damage. He is one of the strongest DPS characters currently, largely self-sufficient, and does not require specific reaction partners (though he benefits from them).
153
+
154
+ **Best-in-Slot:**
155
+ - Weapon: Tome of the Eternal Flow (5-star, signature) — HP and Charged Attack DMG
156
+ - Artifacts: 4pc Marechaussee Hunter — CRIT Rate increase when HP changes, pairs with his HP fluctuation passive
157
+
158
+ **Strong F2P:**
159
+ - Weapon: Sacrificial Jade (4-star) — HP% bonus
160
+ - Weapon: Prototype Amber (4-star, craftable) — ER and HP healing
161
+
162
+ **Core Teams:**
163
+ 1. Neuvillette / Furina / Kazuha / Jean — Furina amplifies damage when characters lose HP; Kazuha groups enemies; Jean heals
164
+ 2. Neuvillette / Furina / Baizhu / Kazuha — Baizhu provides shields that let Furina work without risk
165
+ 3. Budget: Neuvillette / Barbara / Kazuha / Zhongli
166
+
167
+ ---
168
+
169
+ ### Furina — 5-star Hydro Catalyst
170
+ **Role:** Support DPS, team damage amplifier, HP fluctuation enabler
171
+ **Playstyle:** Furina summons a Salon that includes characters who drain HP from active characters and deal Hydro damage. Simultaneously, she has a mechanic where she accumulates a stack (Fanfare) based on how much HP team members gain or lose — higher Fanfare gives her Burst a larger damage bonus to the whole team. She rewards teams where HP fluctuates frequently.
172
+
173
+ **Best-in-Slot:**
174
+ - Weapon: Splendor of Tranquil Waters (5-star, signature)
175
+ - Artifacts: 4pc Golden Troupe — off-field Skill DMG bonus
176
+
177
+ **Strong F2P:**
178
+ - Weapon: Favonius Codex (4-star) — ER for consistent Burst
179
+
180
+ **Core Teams:**
181
+ 1. Furina / Neuvillette / Kazuha / Charlotte — premier Neuvillette team
182
+ 2. Furina / Wriothesley / Shenhe / Kazuha — Cryo DPS with Furina support
183
+ 3. Furina / any HP-fluctuating DPS / healer / buffer — she works with almost any DPS
184
+
185
+ **Notes:** Furina requires a dedicated healer on the team because her Salon deals constant HP drain. She is not compatible with shield-only protection.
186
+
187
+ ---
188
+
189
+ ## Electro Characters
190
+
191
+ ### Raiden Shogun — 5-star Electro Polearm
192
+ **Role:** Main DPS / ER battery, Electro applicator, Burst DPS
193
+ **Playstyle:** Raiden generates Resolve stacks based on team members' Burst damage. Her own Burst consumes all Resolve for a massively powerful Musou Isshin state where she deals high Electro damage with normal attacks and restores energy for the whole team. She is simultaneously an ER support and a burst DPS.
194
+
195
+ **Best-in-Slot:**
196
+ - Weapon: Engulfing Lightning (5-star, signature) — ER to ATK conversion, ER secondary stat
197
+ - Artifacts: 4pc Emblem of Severed Fate — ER to Burst DMG
198
+
199
+ **Strong F2P:**
200
+ - Weapon: The Catch (4-star, free fishing) — ER, Burst DMG and CRIT, exceptional value
201
+ - Weapon: Favonius Lance (4-star) — ER focused
202
+
203
+ **Core Teams:**
204
+ 1. Raiden National: Raiden / Bennett / Xiangling / Xingqiu — among the most consistent teams ever
205
+ 2. Raiden Hypercarry: Raiden / Sara / Kazuha / Bennett — all buffs focused on Raiden's Burst
206
+ 3. Raiden Furina: Raiden / Furina / Nahida / flex — Nahida enables Quicken/Aggravate, Furina buffs Burst
207
+
208
+ ---
209
+
210
+ ### Cyno — 5-star Electro Polearm
211
+ **Role:** Main DPS, extended Burst DPS, Aggravate/Quickbloom enabler
212
+ **Playstyle:** Cyno's Burst gives him a powerful extended state (unlike most Bursts, his lasts 18 seconds and he uses the Burst as his combat mode). He deals high Electro damage through Normal Attacks during this state, and his A4 passive extends the state further when he uses his Skill. He works best in Dendro reaction teams.
213
+
214
+ **Best-in-Slot:**
215
+ - Weapon: Staff of the Scarlet Sands (5-star, signature) — EM to ATK conversion
216
+ - Artifacts: 4pc Thundering Fury — Electro DMG and reaction-triggered CD reduction
217
+
218
+ **Core Teams:**
219
+ 1. Aggravate: Cyno / Nahida / Fischl / Baizhu — Fischl for off-field Electro and energy, Nahida for Dendro application, Baizhu for shields and healing
220
+ 2. Quickbloom: Cyno / Nahida / Xingqiu / Fischl
221
+
222
+ ---
223
+
224
+ ### Fischl — 4-star Electro Bow
225
+ **Role:** Off-field Electro DPS, energy generator
226
+ **Playstyle:** Fischl's Skill summons Oz, a raven who continuously deals Electro damage to enemies. At C6, whenever Fischl or Oz triggers an Electro reaction, Fischl deals additional joint ATK damage. She is the premier off-field Electro applicator and appears in a huge range of teams.
227
+
228
+ **Best-in-Slot:**
229
+ - Weapon: The Stringless (4-star) — Elemental Mastery, Skill and Burst DMG bonus
230
+ - Artifacts: 4pc Thundering Fury (for C6) or 4pc Golden Troupe (for Oz uptime)
231
+
232
+ **Core Teams:** Aggravate teams, Quickbloom teams, Superconduct physical teams, National Team variants
233
+
234
+ ---
235
+
236
+ ## Cryo Characters
237
+
238
+ ### Ganyu — 5-star Cryo Bow
239
+ **Role:** Main DPS (Charged Attack), Sub-DPS (off-field Burst), Cryo applicator
240
+ **Playstyle:** Ganyu's Charged Attack has two levels. Her Level 2 Charged Attack (Frostflake Arrow) fires a homing arrow that explodes in AoE Cryo on contact. This is her primary damage source. She is one of the highest-ceiling DPS characters in the game for AoE content.
241
+
242
+ **Best-in-Slot:**
243
+ - Weapon: Amos' Bow (5-star) — Charged ATK DMG bonus, arrow travel time bonus
244
+ - Artifacts: 4pc Blizzard Strayer — CRIT Rate bonus against Cryo-affected enemies (effectively gives free CRIT)
245
+
246
+ **Strong F2P:**
247
+ - Weapon: Prototype Crescent (4-star, craftable) — ATK% on weak point hit, Ganyu almost always hits weak points
248
+
249
+ **Core Teams:**
250
+ 1. Freeze: Ganyu / Kokomi / Kazuha / Shenhe — Kokomi provides Hydro for Freeze, Kazuha groups and buffs, Shenhe buffs Cryo damage
251
+ 2. Melt: Ganyu / Bennett / Xiangling / Zhongli — Xiangling applies Pyro off-field; Ganyu's arrows Melt for 1.5x damage
252
+ 3. Mono Cryo: Ganyu / Shenhe / Kazuha / Layla
253
+
254
+ ---
255
+
256
+ ### Ayaka — 5-star Cryo Sword
257
+ **Role:** Main DPS, Freeze team anchor
258
+ **Playstyle:** Ayaka's Dash applies Cryo on exit, her Normal Attacks are elegantly chained, and her Burst summons a snowstorm that deals extremely high Cryo damage over 5 seconds. She is the centerpiece of Freeze teams and one of the most visually impressive DPS characters.
259
+
260
+ **Best-in-Slot:**
261
+ - Weapon: Mistsplitter Reforged (5-star) — Elemental DMG and stacking bonus
262
+ - Artifacts: 4pc Blizzard Strayer — like Ganyu, Ayaka benefits enormously from the free CRIT Rate
263
+
264
+ **Strong F2P:**
265
+ - Weapon: Amenoma Kageuchi (4-star, craftable in Inazuma) — ER passive, helps with Burst uptime
266
+
267
+ **Core Teams:**
268
+ 1. Freeze: Ayaka / Kokomi / Kazuha / Shenhe — the premier Ayaka team
269
+ 2. Ayaka / Mona / Venti / Diona — older freeze composition, still very effective
270
+
271
+ ---
272
+
273
+ ### Wriothesley — 5-star Cryo Catalyst
274
+ **Role:** Main DPS, Charged Attack focused
275
+ **Playstyle:** Wriothesley is a melee catalyst DPS whose combat revolves around consuming his own HP through a special punch-based Charged Attack. He recovers HP through his passive when he hits certain thresholds, creating a HP management rhythm. He deals extremely high damage in short windows.
276
+
277
+ **Best-in-Slot:**
278
+ - Weapon: Cashflow Supervision (5-star, signature) — ATK and Charged ATK bonus when HP changes
279
+ - Artifacts: 4pc Marechaussee Hunter — CRIT Rate when HP fluctuates, perfect synergy
280
+
281
+ **Core Teams:**
282
+ 1. Wriothesley / Furina / Shenhe / Kazuha
283
+ 2. Wriothesley / Furina / Nahida / Charlotte — Quickbloom variant
284
+
285
+ ---
286
+
287
+ ## Anemo Characters
288
+
289
+ ### Kazuha — 5-star Anemo Sword
290
+ **Role:** Crowd control, elemental DMG% buffer, Swirl DPS
291
+ **Playstyle:** Kazuha's Skill and Burst Swirl elements he absorbs, and his A4 passive converts Elemental Mastery into elemental DMG bonus for the entire team (based on the element he Swirls). The conversion is 0.04% elemental DMG per point of EM, so 200 EM gives 8% and 1000 EM gives 40% elemental DMG to the Swirled element for all allies. He is the best Anemo support for reaction teams.
292
+
293
+ **Best-in-Slot:**
294
+ - Weapon: Freedom-Sworn (5-star) — EM primary stat, team buff on Swirl reactions
295
+ - Artifacts: 4pc Viridescent Venerer — reduces enemy Elemental RES by 40%, standard for all Anemo supports
296
+
297
+ **Strong F2P:**
298
+ - Weapon: Iron Sting (4-star, craftable) — EM stat, solid budget option
299
+ - Weapon: Xiphos' Moonlight (4-star) — EM primary, team ER generation
300
+
301
+ **Notes:** Kazuha is arguably the single best support in the game for elemental teams. He appears in more high-investment team compositions than any other character. His F2P weapon options are genuinely competitive.
302
+
303
+ ---
304
+
305
+ ### Venti — 5-star Anemo Bow
306
+ **Role:** Crowd control, ER battery, grouping support
307
+ **Playstyle:** Venti's Burst pulls all non-boss enemies into a vortex for 8 seconds, dealing Anemo damage and absorbing another element to deal additional damage. He is the best grouping character in the game. His Skill also launches enemies. He restores 15 energy to all teammates of the absorbed element type after his Burst ends.
308
+
309
+ **Best-in-Slot:**
310
+ - Weapon: Elegy for the End (5-star) — EM and team ATK/EM buff, excellent for support
311
+ - Artifacts: 4pc Viridescent Venerer
312
+
313
+ **Strong F2P:**
314
+ - Weapon: Favonius Warbow (4-star) — ER generation
315
+ - Weapon: Stringless (4-star) — EM and Skill/Burst DMG
316
+
317
+ **Notes:** Venti loses significant value against large enemies and bosses that cannot be sucked into his Burst. He is exceptional in Spiral Abyss against small-medium enemies.
318
+
319
+ ---
320
+
321
+ ## Geo Characters
322
+
323
+ ### Zhongli — 5-star Geo Polearm
324
+ **Role:** Shield support, shred support, off-field sub-DPS
325
+ **Playstyle:** Zhongli creates the strongest shield in the game (based on his Max HP), and while that Jade Shield is active it reduces the Elemental and Physical RES of nearby enemies by 20%. His Burst petrifies enemies. He is the gold standard for defensive support.
326
+
327
+ **Best-in-Slot:**
328
+ - Weapon: Staff of Homa (5-star) — HP and ATK bonus, high shield value
329
+ - Weapon: Black Tassel (3-star) — actually one of his best weapons, pure HP% focus for maximum shield
330
+ - Artifacts: 4pc Tenacity of the Millelith — HP and shield strength, team ATK bonus on Skill hit
331
+
332
+ **Strong F2P:**
333
+ - Weapon: Favonius Lance (4-star) — ER for more consistent Burst
334
+ - Weapon: Prototype Starglitter (4-star, craftable) — ER and Normal ATK bonus
335
+
336
+ **Notes:** Zhongli with Black Tassel is a legitimate budget option that outperforms many 5-star weapons on him because his shield scales purely from HP. He is one of the most universally useful characters in the game — every team benefits from his shield and RES shred.
337
+
338
+ ---
339
+
340
+ ## Dendro Characters
341
+
342
+ ### Nahida — 5-star Dendro Catalyst
343
+ **Role:** Off-field Dendro DPS, Elemental Mastery buffer, reaction catalyst
344
+ **Playstyle:** Nahida marks enemies with Seed of Skandha through her Skill (she can mark multiple enemies), and every elemental reaction that triggers on marked enemies deals bonus Dendro damage. Her Burst buffs the team based on the elements present (EM bonus from Pyro characters, Crit Rate from Electro, duration from Hydro).
345
+
346
+ **Best-in-Slot:**
347
+ - Weapon: A Thousand Floating Dreams (5-star, signature) — EM and team EM buff
348
+ - Artifacts: 4pc Deepwood Memories — Dendro RES shred, best support set; or 4pc Gilded Dreams for personal DPS
349
+
350
+ **Strong F2P:**
351
+ - Weapon: Sacrificial Fragments (4-star) — EM primary, double Skill cast
352
+ - Weapon: Magic Guide (3-star) — actually competitive at R5, EM secondary and DMG bonus against wet/electro enemies
353
+
354
+ **Core Teams:**
355
+ 1. Quickbloom: Nahida / Raiden / Xingqiu / Baizhu — Dendro + Electro + Hydro for Quicken, Bloom, and Hyperbloom reactions
356
+ 2. Aggravate: Nahida / Fischl / Keqing / Kazuha
357
+ 3. Burgeon: Nahida / Thoma / Xingqiu / flex — Thoma triggers Burgeon explosions
358
+
359
+ ---
360
+
361
+ ## General Team Building Principles
362
+
363
+ ### Resonance Bonuses Worth Building Around
364
+ - **Pyro Resonance:** +25% ATK — reason Bennett appears in so many teams
365
+ - **Cryo Resonance:** +15% CRIT Rate against Frozen or Cryo-affected enemies — reason Freeze teams can drop CRIT Rate investment
366
+ - **Geo Resonance:** +15% shield strength, +15% DMG while shielded — makes double-Geo teams inherently strong
367
+ - **Dendro Resonance:** +50 EM and additional EM scaling — makes Dendro-heavy reaction teams hit harder
368
+
369
+ ### The Four Roles in a Team
370
+ 1. **Main DPS:** Takes most of the field time and deals the team's primary damage
371
+ 2. **Sub-DPS:** Off-field damage through Skills or Bursts
372
+ 3. **Support:** Buffs, healing, shields, crowd control
373
+ 4. **Battery:** Generates energy particles for the team (often doubled with another role)
374
+
375
+ ### ER (Energy Recharge) Thresholds — Common Benchmarks
376
+ - Xiangling: 180-200% ER
377
+ - Fischl: 120% ER (minimal, Oz generates well)
378
+ - Raiden: 250-300% ER (her passive converts ER above 100% into Electro DMG Bonus)
379
+ - Bennett: 170-190% ER
380
+ - Xingqiu: 180-200% ER
381
+ - Nahida: 120-130% ER (low cost Burst)
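
A minimal sketch of how the benchmarks above might be turned into a quick check, assuming the listed ranges are taken at face value. The table name `ER_BENCHMARKS` and the helper `er_shortfall` are hypothetical and do not appear elsewhere in this repo.

```python
# Hypothetical lookup table built from the benchmark list above.
# Each entry is (low, high) in percent; Fischl has a single value, so low == high.
ER_BENCHMARKS = {
    "Xiangling": (180, 200),
    "Fischl": (120, 120),
    "Raiden": (250, 300),
    "Bennett": (170, 190),
    "Xingqiu": (180, 200),
    "Nahida": (120, 130),
}


def er_shortfall(character: str, current_er: float) -> float:
    """Return the ER percentage points still missing to reach the low end
    of the benchmark range, or 0.0 if the low end is already met."""
    low, _high = ER_BENCHMARKS[character]
    return max(0.0, low - current_er)


print(er_shortfall("Xingqiu", 165.0))  # 15.0
print(er_shortfall("Nahida", 140.0))   # 0.0
```

Measuring against the low end of each range is one possible choice; a stricter check could compare against the high end instead.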
382
+
383
+ ### Artifact Investment Priority
384
+ Invest in characters in this order for efficient resource use:
385
+ 1. Main DPS — highest priority for good substats
386
+ 2. Key supports (Bennett, Kazuha, Xingqiu) — 4pc set bonus matters more than substats
387
+ 3. Flex support — functional artifacts sufficient
388
+ 4. Shielder/healer — set bonus matters, substats less critical
389
+
390
+ ---
docs/characters_lore.md ADDED
@@ -0,0 +1,204 @@
1
+ # Genshin Impact — Character Lore & Backstories
2
+
3
+ ---
4
+
5
+ ## The Seven Archons
6
+
7
+ ### Venti (Barbatos) — Anemo Archon of Mondstadt
8
+ Venti is the mortal guise of Barbatos, the Anemo Archon and one of the Seven. Barbatos is considered the weakest of the Seven in terms of direct divine power, and he governs Mondstadt through freedom rather than control. He does not rule through a council or hierarchy; he simply lets his people govern themselves, which is why Mondstadt has no Archon actively ruling the city.
9
+
10
+ In the ancient past, Barbatos was a minor wind spirit who befriended a nameless bard during the Archon War — a human who dreamed of liberating Mondstadt from the tyrant Decarabian, the God of Storms. When that bard died in the uprising, Barbatos took on his appearance as a tribute. The form Venti wears today is a memory of his closest friend.
11
+
12
+ Barbatos fought in the Archon War, which ended roughly 2000 years ago, and secured Mondstadt, then largely disappeared. He is frequently absent from Mondstadt and spends centuries wandering. This laissez-faire approach is intentional: he believes true freedom means not needing a god to watch over you. He is deeply connected to music and wind, and his gnosis (the divine object representing his status among the Seven) was stolen by Signora of the Fatui, severely weakening him. He has not recovered it.
13
+
14
+ Venti has a complicated relationship with memory and loss. The deaths of his friends — the nameless bard, Dvalin's suffering, the erosion of the old Mondstadt — weigh on him. He is playful and cheerful on the surface but carries deep grief beneath it.
15
+
16
+ ---
17
+
18
+ ### Zhongli (Morax) — Geo Archon of Liyue
19
+ Zhongli is the consultant of Wangsheng Funeral Parlor and the current mortal vessel of Morax, the Geo Archon, also known as the God of Contracts. He is one of the oldest and most powerful of the Seven, having fought in the Archon War longer than almost anyone and survived to the modern era.
20
+
21
+ Morax built Liyue through contracts — literal divine contracts binding the adepti, the gods, and later the humans of Liyue to obligations of service and protection. His philosophy is that a contract freely entered is the foundation of civilization. Unlike Barbatos, Morax was heavily involved in governance for thousands of years, working alongside the adepti (divine beings who served him) to protect Liyue from monsters and rival gods.
22
+
23
+ At the start of the Liyue chapter, Zhongli stages his own death during the Rite of Descension to test whether Liyue's people can govern themselves without a god watching over them. When the ancient god Osial is later unleashed on the harbor, it is the Qixing and the adepti, not their Archon, who drive him back. Zhongli then hands his gnosis over to the Tsaritsa of the Fatui as part of a contract, which confuses everyone watching because it seems like a betrayal. In truth, the contract terms are his own, and he judges Liyue ready to stand without him.
24
+
25
+ Zhongli has known every major adeptus in Liyue, has lived through the Archon War, the defeat of Osial, and the rise of the Liyue Qixing merchant council. His mortal form ages and he chooses to keep living as a human, maintaining contracts and relationships with the living world. He is close to Hu Tao, the director of Wangsheng Funeral Parlor, and mentors Cloud Retainer's apprentice Shenhe.
26
+
27
+ His personality is formal, deliberate, and knowledgeable to an almost absurd degree. He once spent an entire afternoon explaining the history of every ingredient in a meal. He also perpetually runs out of mora, because he has lived so long that he genuinely does not grasp modern currency management.
28
+
29
+ ---
30
+
31
+ ### Ei (Makoto / Raiden Shogun) — Electro Archon of Inazuma
32
+ The Raiden Shogun is actually two entities sharing one body: Ei, the true Electro Archon who currently rules Inazuma, and the Shogun, a puppet Ei created to govern while Ei herself meditates inside the Plane of Euthymia within her own mind.
33
+
34
+ The original Electro Archon was Makoto, Ei's twin sister. Makoto died during the cataclysm that also destroyed Khaenri'ah, leaving Ei alone. Ei also lost her closest companions over the centuries, among them the Kitsune Saiguu, the oni Chiyo, and the tengu general Sasayuri. Overwhelmed by grief and the fear of losing more, Ei adopted the philosophy of Eternity: she would preserve Inazuma exactly as it was, forever, sacrificing change and progress to prevent loss.
35
+
36
+ To achieve this, she withdrew her consciousness into the Plane of Euthymia and left the Shogun puppet to rule in her place. The Shogun enforced the Vision Hunt Decree, a policy of confiscating Visions (divine gifts from the gods) from citizens, on the belief that human ambition and individual destiny were what led to loss and change. The early Inazuma arc feels oppressive because it is: it is Ei's grief turned into policy.
37
+
38
+ When the Traveler reaches Ei in the Plane of Euthymia and duels her there, Ei begins to confront the reality that Eternity maintained through stasis is not the same as true preservation. She ends the Vision Hunt Decree and begins to open Inazuma to the outside world. Her character arc through the Archon quest and her story quest is one of the most emotional in the game: the portrait of a god who has been alone with her grief for five hundred years.
39
+
40
+ Ei is formal, stoic, and often confused by modern customs (she has been meditating for centuries). She is close to Yae Miko, the Guuji of the Grand Narukami Shrine, who has been Ei's most consistent companion through the centuries and acts as her emotional anchor and occasional teasing voice of reason.
41
+
42
+ ---
43
+
44
+ ### Nahida (Lesser Lord Kusanali) — Dendro Archon of Sumeru
45
+ Nahida is the current Dendro Archon of Sumeru and the God of Wisdom, though she spent most of her existence imprisoned inside the Sanctuary of Surasthana by the Sages of Sumeru's Akademiya, who found a child-like god of wisdom inconvenient to their political control of knowledge.
46
+
47
+ The previous Dendro Archon died during the cataclysm five hundred years ago. Nahida was born as a replacement but was immediately confined. She has almost no experience of the physical world, having spent her existence within the Irminsul — the divine tree that records all of Teyvat's memories and history — and within people's dreams, which she could access from her prison.
48
+
49
+ Her confinement is a deliberate tragedy: the God of Wisdom locked away from wisdom, the guardian of Sumeru stripped of agency by the very institution that claims to serve knowledge. When the Traveler arrives in Sumeru, the Akademiya is already engaged in a scheme involving Scaramouche (the future Wanderer), a Gnosis, and a plan to create a god of their own.
50
+
51
+ Nahida is extraordinary in her emotional intelligence despite her isolation. She has watched humanity through dreams for five hundred years and understands people deeply, even if she doesn't always understand the physical world. She is gentle, curious, and precise. Her relationship with the Traveler is unique — she asks questions genuinely, without pretension, and is moved by small gestures of connection.
52
+
53
+ Her power over the Irminsul means she can read and even edit the records of history, which becomes plot-critical when she erases Greater Lord Rukkhadevata from those records, at Rukkhadevata's own request, to purge forbidden knowledge from the tree. As a result, Teyvat remembers Nahida as the only Dendro Archon there has ever been.
54
+
55
+ ---
56
+
57
+ ### Furina (Focalors) — Hydro Archon of Fontaine
58
+ Furina is one of the most tragic characters in Genshin Impact's story. On the surface she appears to be the theatrical, dramatic Hydro Archon who presides over the Court of Fontaine (a judicial system where all disputes are settled through trials). The truth is far more complicated.
59
+
60
+ Five hundred years ago, a prophecy stated that Fontaine would be destroyed — that the people of Fontaine, who all carry a divine curse in their bloodline, would one day dissolve into the Primordial Sea and cease to exist. The true Hydro Archon Focalors devised a plan to break this prophecy: she would split herself into two — a human vessel named Furina who would contain no divine power and no memories of being a god, and a divine consciousness that would wait. Furina would live as a mortal pretending to be an Archon for five hundred years.
61
+
62
+ For five hundred years, Furina performed being a god. She presided over trials with theatrical flair, maintained the illusion of divine authority, and never let anyone know she was terrified and alone and entirely human. She couldn't tell anyone the truth. She had no divine power to fall back on. Every moment was a performance sustained by sheer willpower and the fear that breaking the act would destroy the plan that could save her people.
63
+
64
+ When the Traveler arrives and the truth finally comes out, Furina's breakdown is one of the most emotionally raw moments in the game. She is not a goddess. She is a human woman who has been carrying the weight of an entire civilization's fate alone for five centuries, and she is exhausted. The resolution of her arc — including the sacrifice of Focalors herself — and Furina's choice to keep living as an ordinary human afterward, is genuinely moving.
65
+
66
+ ---
67
+
68
+ ### Nahida's predecessor (Greater Lord Rukkhadevata)
69
+ The original Dendro Archon before Nahida was known as Greater Lord Rukkhadevata, who was closely tied to the Irminsul and to King Deshret's ancient civilization. She sacrificed herself during the cataclysm five hundred years ago to contain a spreading corruption. Her legacy shapes all of Sumeru's history and the Akademiya's founding mythology.
70
+
71
+ ---
72
+
73
+ ## Major Non-Archon Characters
74
+
75
+ ### Hu Tao — 77th Director of Wangsheng Funeral Parlor
76
+ Hu Tao is the director of Liyue's premier funeral services company and one of Liyue's most recognizable figures. She is eccentric, enthusiastic about death (in a philosophical rather than morbid way), and deeply caring beneath her provocative exterior. She took over the directorship of Wangsheng at a very young age and has transformed it into a thriving institution.
77
+
78
+ Her relationship with Zhongli is warm — he is her consultant and she respects his ancient wisdom, while she drags him into the modern world. She writes terrible poetry that she is very proud of. She has a genuine spiritual sensitivity and can perceive the dead and the boundary between life and death.
79
+
80
+ In terms of her worldview, Hu Tao sees death as a necessary and even beautiful part of life's cycle — not something to fear, but to understand and honor. This makes her approach to the funeral business genuinely compassionate rather than mercenary.
81
+
82
+ ---
83
+
84
+ ### Xiao — Yaksha of Liyue
85
+ Xiao is one of the five Yakshas, divine warriors who served Morax and were tasked with purging demons and malevolent spirits across Liyue. He is the last surviving Yaksha — the others fell to madness, corruption, or death from consuming too much karmic debt (the spiritual residue left by demons they killed).
86
+
87
+ Xiao carries an immense burden of guilt and pain. Centuries of killing demons have left him perpetually poisoned by their karma. He suffers from nightmares and can never fully rest. The demon he is most associated with, and the one he calls his true enemy, is his own past self, when he was enslaved by a god and forced to kill.
88
+
89
+ His connection to the Traveler is one of the few things that seems to genuinely ease his suffering. He is curt and antisocial, but fiercely protective of those he cares about. His relationship with Ganyu, who also served Morax, is one of cautious mutual respect.
90
+
91
+ ---
92
+
93
+ ### Ganyu — Secretary of the Liyue Qixing
94
+ Ganyu is half-human, half-adeptus, born from the union of a human and a qilin (a divine creature). She has served Morax and the Liyue Qixing for nearly three thousand years as a secretary and emissary. She is overworked, self-deprecating about her half-human heritage (which she sees as a flaw), and deeply loyal to Liyue.
95
+
96
+ Her arc involves coming to terms with her identity — she spent millennia feeling like she belonged fully to neither the human world nor the adepti world. The Traveler's acceptance and her own growth through the Lantern Rite events help her find peace with herself.
97
+
98
+ ---
99
+
100
+ ### Keqing — Yuheng of the Liyue Qixing
101
+ Keqing is one of the seven members of the Liyue Qixing, Liyue's governing merchant council. She is hardworking, pragmatic, and notably skeptical of divine authority — she believes humans should not rely on gods to solve human problems. This puts her in interesting ideological contrast with the adepti-worshipping parts of Liyue culture.
102
+
103
+ When Zhongli stages his death, Keqing is one of the people who springs into action to ensure Liyue can function without an Archon — which is exactly what Zhongli wanted to see happen.
104
+
105
+ ---
106
+
107
+ ### Tartaglia (Childe, Ajax) — 11th Harbinger of the Fatui
108
+ Childe is one of the Eleven Harbingers, elite agents of the Tsaritsa of Snezhnaya. Unlike most Harbingers who are cold and calculating, Childe is genuinely enthusiastic about battle — almost boyishly so. He loves a fair fight more than anything.
109
+
110
+ His backstory: as a teenager, Ajax fell into an Abyss portal and survived by learning combat from Skirk, a mysterious master who lives at the bottom of the Abyss. He emerged fundamentally changed — capable of activating a Foul Legacy transformation that taps into Abyssal power at the cost of his own sanity and body.
111
+
112
+ Childe is genuinely fond of his family (his younger siblings in Snezhnaya) and has a complicated relationship with his role as a Harbinger. He serves the Tsaritsa but is not ideologically committed to the cause in the way other Harbingers are. His battle with the Traveler during the Liyue chapter — when he summons Osial — is a sincere test of strength rather than a calculated political move.
113
+
114
+ ---
115
+
116
+ ### Albedo — Chief Alchemist of the Knights of Favonius
117
+ Albedo is a synthetic human created by the alchemist Rhinedottir, also known as Gold, a figure from Khaenri'ah. He was created through alchemy and is technically a homunculus, a constructed life form. His existence raises questions he actively investigates: what defines life, what distinguishes constructed existence from natural existence, and whether his consciousness is meaningfully different from a human's.
118
+
119
+ He is brilliant, methodical, and somewhat removed from human emotion without being cold. His relationship with his apprentice Sucrose is warm — she is one of the few people he is openly kind to. His creation by a Khaenri'ah-connected figure and his synthetic nature put him in a complicated position relative to the game's larger mysteries.
120
+
121
+ ---
122
+
123
+ ### Diluc — Owner of Dawn Winery
124
+ Diluc is one of the most powerful humans in Mondstadt, former Knight of Favonius captain, and current vigilante who fights monsters without official affiliation. His father died after using a Delusion (a fake Vision created by the Fatui) which destroyed his body, and Diluc blamed the Knights of Favonius for covering it up. He resigned and spent years traveling the world investigating the Fatui.
125
+
126
+ He returned to Mondstadt before the events of the game and now operates Dawn Winery during the day and patrols the city at night. His relationship with Kaeya — his adopted brother — is deeply complicated. Kaeya is secretly of Khaenri'ah origin, sent to Mondstadt as a child by his father for unknown reasons. When Diluc discovered this during a confrontation after his father's death, their relationship shattered, and it has not been fully repaired.
127
+
128
+ ---
129
+
130
+ ### Kaeya — Cavalry Captain of the Knights of Favonius
131
+ Kaeya is charming, manipulative, and deeply secretive. He was sent to Mondstadt as a child by his Khaenri'ah father as a secret asset — an insurance policy, as Kaeya himself describes it. He has genuinely come to care about Mondstadt over the years, but his ultimate loyalties and goals remain ambiguous.
132
+
133
+ His relationship with Diluc is the most emotionally loaded relationship between two playable characters — broken brotherhood, mutual grief over their father, and years of estrangement during which neither fully let go.
134
+
135
+ ---
136
+
137
+ ### Fischl — Investigator of the Adventurers' Guild
138
+ Fischl is a young woman who has committed fully to a fantasy alter-ego as "Fischl von Luftschloss Narfidort," a traveler from another world. She speaks in elaborate archaic language and is accompanied by Oz, a raven who is actually a manifestation of her Vision.
139
+
140
+ Beneath the performance, Fischl is a deeply lonely person who created her persona as a coping mechanism. She communicates more authentically through Oz than through her own speech. Her friendship with Mona, who sees through the persona but respects what it means to her, is one of the warmer background relationships in Mondstadt.
141
+
142
+ ---
143
+
144
+ ### Cyno — General Mahamatra of Sumeru
145
+ Cyno is the chief enforcer of Sumeru's Akademiya, responsible for investigating and punishing researchers who violate academic ethics — particularly those who pursue forbidden knowledge or abuse Akasha terminals. He is intensely serious and dedicated to justice, but is also notorious for telling terrible jokes that he genuinely believes are funny.
146
+
147
+ He is an avid player of the card game Genius Invokation TCG and a close friend of Tighnari. He carries significant guilt over events in his past involving a researcher whose case he pursued. His story quest is one of the more nuanced explorations of what justice means versus what law requires.
148
+
149
+ ---
150
+
151
+ ### Tighnari — Forest Watcher of Avidya Forest
152
+ Tighnari is a forest ranger in Sumeru's Avidya Forest, a scholar trained by the Akademiya who chose to leave academic life to do practical conservation work. He is calm, precise, slightly condescending when people don't listen to expert advice, and genuinely passionate about the forest's ecosystem. He was one of the early resistors to the Akademiya's worst abuses.
153
+
154
+ ---
155
+
156
+ ### Wanderer (Scaramouche / Kunikuzushi / Kabukimono)
157
+ Scaramouche has one of the most complex backstories in the game. He was created by Ei as a prototype vessel for her soul — an artificial being who was discarded when Ei decided the puppet she made later was better suited. He was abandoned without purpose or identity.
158
+
159
+ He was subsequently found by Niwa, a kind craftsman who treated him warmly, but Scaramouche left before the relationship could fully develop, afraid of attachment. He was later manipulated by the Fatui and became the 6th Harbinger, using cruelty as armor against the fundamental wound of his creation: he was made to be used and then thrown away.
160
+
161
+ During the Sumeru arc, he attempts to become a god by merging with the Shouki no Kami construct and using the Gnosis to ascend. After his defeat, he enters the Irminsul and erases his own existence from its records in an attempt to undo the harm in his past. The world forgets Scaramouche, and he reappears without his memories as the Wanderer; even after Nahida helps him recover them, he chooses to build a new identity. His story quest as the Wanderer is one of the best character quests in the game, a meditation on whether identity requires continuity of memory.
162
+
163
+ ---
164
+
165
+ ### Arlecchino (Knave) — 4th Harbinger of the Fatui
166
+ Arlecchino runs the House of the Hearth, the Fatui's orphanage that recruits and trains children as future agents. She is terrifying, elegant, and operates on her own terms even within the Fatui structure. Her relationship with her "children" in the House of the Hearth is genuinely complex — she is not warm, but she is not cruel without purpose, and she provides real protection and resources to children who had nowhere else to go.
167
+
168
+ Her backstory involves the previous leader of the House, whose methods she found unacceptable. She killed her predecessor and took over, transforming the House into something closer to her own vision. Her loyalty to the Tsaritsa is genuine but conditional.
169
+
170
+ ---
171
+
172
+ ### Wriothesley — Duke of Meropide Fortress
173
+ Wriothesley runs Fontaine's underwater prison, the Fortress of Meropide, which operates as a largely self-governing society of inmates. He won leadership through combat and manages the place with fair but firm authority. His own past includes crimes committed out of survival during a brutal childhood, and he has made peace with serving his time through governance.
174
+
175
+ He is one of the most emotionally stable characters in the game — he knows who he is, is not tortured by his past, and focuses on practical good. His relationship with Furina is one of the few where Furina is treated as a person rather than a symbol.
176
+
177
+ ---
178
+
179
+ ### Navia — President of Spina di Rosula
180
+ Navia leads the Spina di Rosula, a Fontaine organization that has historical tension with the government. Her father was falsely accused of a crime and died as a result of the legal fallout. She pursues justice for him and the truth about the conspiracy that destroyed her family while maintaining the organization her father built. She is warm, generous, and carries grief quietly.
181
+
182
+ ---
183
+
184
+ ### Clorinde — Champion Duelist of Fontaine
185
+ Clorinde is Fontaine's official Champion Duelist, a legally recognized role for settling disputes through combat. She has a bond with Navia that dates back to childhood and is one of the most loyal characters in Fontaine's story. Her backstory involves a pact with a demon that she carries in her arm — a consequence of a desperate decision made years earlier.
186
+
187
+ ---
188
+
189
+ ### Lyney and Lynette — Magicians of the House of the Hearth
190
+ Lyney and Lynette are siblings raised in the House of the Hearth under Arlecchino. Lyney is the performer, extroverted and theatrical; Lynette is quieter, more observant, and deeply bonded with her brother. They work together as stage magicians and as Fatui operatives. Their loyalty is primarily to each other and to Arlecchino rather than to the Fatui as an institution.
191
+
192
+ ---
193
+
194
+ ### Neuvillette — Iudex (Chief Justice) of Fontaine
195
+ Neuvillette is the Chief Justice of Fontaine's court system and one of the most powerful beings in the game. He is revealed to be the Dragon of Water, an ancient sovereign creature predating the current divine order of Teyvat. The Hydro Archon's power was derived from him, not the other way around.
196
+
197
+ He is quiet, precise, and deeply invested in the concept of justice — not law, but actual justice. He can make it rain when he is grieving, which happens at story-relevant moments in ways that hit hard in context. His relationship with Furina evolves from professional distance to genuine care for her wellbeing once the truth of her situation is revealed.
198
+
199
+ ---
200
+
201
+ ### Shenhe — Adeptus Disciple of Cloud Retainer
202
+ Shenhe was a human child taken in by the adeptus Cloud Retainer after a traumatic childhood. She was trained in adeptal arts but the training required her to bind her own emotions, making her seem cold and detached. She is not — she feels deeply, but has been taught that her emotions are dangerous. Her connection to Liyue's characters and her slow process of learning to trust her own feelings make her a compelling figure.
203
+
204
+ ---
docs/demo.html ADDED
@@ -0,0 +1,321 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>LLMOps RAG — Live Demo</title>
7
+ <link rel="preconnect" href="https://fonts.googleapis.com">
8
+ <link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@300;400;500&family=Syne:wght@400;600;700&display=swap" rel="stylesheet">
9
+ <style>
10
+ :root {
11
+ --bg: #0a0a0b;
12
+ --surface: #111113;
13
+ --surface2: #18181c;
14
+ --border: #2a2a30;
15
+ --border2: #3a3a42;
16
+ --accent: #7c6af7;
17
+ --accent2: #a89cf7;
18
+ --green: #3dd68c;
19
+ --red: #f76a6a;
20
+ --amber: #f7c26a;
21
+ --text: #e8e8f0;
22
+ --text2: #8888a0;
23
+ --text3: #555568;
24
+ --mono: 'JetBrains Mono', monospace;
25
+ --sans: 'Syne', sans-serif;
26
+ }
27
+ * { box-sizing: border-box; margin: 0; padding: 0; }
28
+ body { background: var(--bg); color: var(--text); font-family: var(--sans); min-height: 100vh; display: flex; flex-direction: column; }
29
+ body::before { content: ''; position: fixed; inset: 0; background-image: linear-gradient(rgba(124,106,247,0.03) 1px, transparent 1px), linear-gradient(90deg, rgba(124,106,247,0.03) 1px, transparent 1px); background-size: 40px 40px; pointer-events: none; z-index: 0; }
30
+
31
+ header { position: relative; z-index: 1; padding: 24px 40px; border-bottom: 1px solid var(--border); display: flex; align-items: center; justify-content: space-between; flex-wrap: wrap; gap: 12px; }
32
+ .logo { display: flex; align-items: center; gap: 14px; }
33
+ .logo-mark { width: 36px; height: 36px; border-radius: 8px; background: var(--accent); display: flex; align-items: center; justify-content: center; font-family: var(--mono); font-size: 14px; font-weight: 500; color: #fff; }
34
+ .logo-text { font-size: 16px; font-weight: 700; letter-spacing: -0.3px; }
35
+ .logo-sub { font-size: 11px; color: var(--text3); font-family: var(--mono); margin-top: 1px; }
36
+ .demo-pill { display: flex; align-items: center; gap: 7px; padding: 6px 14px; border-radius: 20px; border: 1px solid var(--accent); background: rgba(124,106,247,0.08); font-family: var(--mono); font-size: 11px; color: var(--accent2); }
37
+ .demo-dot { width: 6px; height: 6px; border-radius: 50%; background: var(--accent); animation: pulse 2s ease-in-out infinite; }
38
+ @keyframes pulse { 0%,100%{opacity:1} 50%{opacity:0.4} }
39
+
40
+ .notice { position: relative; z-index: 1; margin: 20px 40px 0; padding: 12px 16px; border-radius: 8px; border: 1px solid rgba(247,194,106,0.3); background: rgba(247,194,106,0.06); font-family: var(--mono); font-size: 12px; color: var(--amber); line-height: 1.6; }
41
+ .notice a { color: var(--accent2); text-decoration: none; border-bottom: 1px solid var(--accent2); }
42
+
43
+ main { position: relative; z-index: 1; flex: 1; display: flex; flex-direction: column; max-width: 860px; width: 100%; margin: 0 auto; padding: 32px 40px 0; }
44
+
45
+ .hero { margin-bottom: 32px; }
46
+ .hero h1 { font-size: 28px; font-weight: 700; letter-spacing: -0.8px; line-height: 1.2; margin-bottom: 8px; }
47
+ .hero h1 span { color: var(--accent2); }
48
+ .hero p { font-size: 13px; color: var(--text2); font-family: var(--mono); line-height: 1.6; }
49
+ .model-tag { display: inline-flex; align-items: center; gap: 6px; margin-top: 10px; padding: 4px 10px; border-radius: 4px; border: 1px solid var(--border2); background: var(--surface2); font-family: var(--mono); font-size: 11px; color: var(--accent2); }
50
+
51
+ .suggested { margin-bottom: 16px; }
52
+ .suggested-label { font-family: var(--mono); font-size: 11px; color: var(--text3); text-transform: uppercase; letter-spacing: 0.05em; margin-bottom: 8px; }
53
+ .suggested-chips { display: flex; flex-wrap: wrap; gap: 8px; }
54
+ .chip { padding: 6px 12px; border-radius: 6px; border: 1px solid var(--border2); background: var(--surface); font-family: var(--mono); font-size: 11px; color: var(--text2); cursor: pointer; transition: border-color 0.15s, color 0.15s; }
55
+ .chip:hover { border-color: var(--accent); color: var(--accent2); }
56
+
57
+ .query-box { background: var(--surface); border: 1px solid var(--border); border-radius: 12px; overflow: hidden; transition: border-color 0.2s; }
58
+ .query-box:focus-within { border-color: var(--accent); }
59
+ .query-label { padding: 12px 16px 0; font-family: var(--mono); font-size: 11px; color: var(--text3); letter-spacing: 0.05em; text-transform: uppercase; }
60
+ textarea { width: 100%; background: transparent; border: none; outline: none; resize: none; padding: 10px 16px 14px; font-family: var(--mono); font-size: 14px; color: var(--text); line-height: 1.6; min-height: 80px; caret-color: var(--accent); }
61
+ textarea::placeholder { color: var(--text3); }
62
+ .query-footer { display: flex; align-items: center; justify-content: space-between; padding: 10px 14px; border-top: 1px solid var(--border); background: var(--surface2); }
63
+ .top-k-wrap { display: flex; align-items: center; gap: 8px; font-family: var(--mono); font-size: 12px; color: var(--text2); }
64
+ .top-k-wrap select { background: var(--surface); border: 1px solid var(--border2); border-radius: 4px; color: var(--text); font-family: var(--mono); font-size: 12px; padding: 3px 8px; cursor: pointer; outline: none; }
65
+ .send-btn { display: flex; align-items: center; gap: 8px; padding: 8px 18px; background: var(--accent); border: none; border-radius: 6px; color: #fff; font-family: var(--sans); font-size: 13px; font-weight: 600; cursor: pointer; transition: background 0.15s, transform 0.1s; }
66
+ .send-btn:hover { background: var(--accent2); }
67
+ .send-btn:active { transform: scale(0.97); }
68
+ .send-btn:disabled { background: var(--border2); color: var(--text3); cursor: not-allowed; transform: none; }
69
+
70
+ #response-area { margin-top: 24px; display: none; }
71
+ #response-area.visible { display: block; }
72
+ .response-card { background: var(--surface); border: 1px solid var(--border); border-radius: 12px; overflow: hidden; }
73
+ .response-header { display: flex; align-items: center; justify-content: space-between; padding: 12px 16px; border-bottom: 1px solid var(--border); background: var(--surface2); }
74
+ .response-label { font-family: var(--mono); font-size: 11px; color: var(--text3); text-transform: uppercase; letter-spacing: 0.05em; display: flex; align-items: center; gap: 7px; }
75
+ .response-label .dot { width: 6px; height: 6px; border-radius: 50%; background: var(--green); }
76
+ .latency-tag { font-family: var(--mono); font-size: 11px; color: var(--text3); padding: 2px 8px; border-radius: 3px; border: 1px solid var(--border); }
77
+ .response-body { padding: 20px; font-size: 14px; line-height: 1.8; color: var(--text); white-space: pre-wrap; word-break: break-word; }
78
+ .thinking { display: flex; align-items: center; gap: 10px; padding: 20px; font-family: var(--mono); font-size: 13px; color: var(--text2); }
79
+ .thinking-dots span { display: inline-block; width: 5px; height: 5px; border-radius: 50%; background: var(--accent); margin: 0 2px; animation: bounce 1.2s ease-in-out infinite; }
80
+ .thinking-dots span:nth-child(2) { animation-delay: 0.2s; }
81
+ .thinking-dots span:nth-child(3) { animation-delay: 0.4s; }
82
+ @keyframes bounce { 0%,80%,100%{transform:translateY(0)} 40%{transform:translateY(-6px)} }
83
+ .sources-section { border-top: 1px solid var(--border); padding: 12px 20px; background: var(--surface2); }
84
+ .sources-label { font-family: var(--mono); font-size: 11px; color: var(--text3); text-transform: uppercase; letter-spacing: 0.05em; margin-bottom: 8px; }
85
+ .source-chip { display: inline-flex; align-items: center; gap: 5px; padding: 3px 9px; border-radius: 4px; border: 1px solid var(--border2); background: var(--surface); font-family: var(--mono); font-size: 11px; color: var(--accent2); margin: 3px 4px 3px 0; }
86
+
87
+ .history-section { margin-top: 28px; padding-bottom: 40px; }
88
+ .history-label { font-family: var(--mono); font-size: 11px; color: var(--text3); text-transform: uppercase; letter-spacing: 0.05em; margin-bottom: 10px; }
89
+ .history-item { background: var(--surface); border: 1px solid var(--border); border-radius: 8px; padding: 10px 14px; margin-bottom: 6px; cursor: pointer; transition: border-color 0.15s; }
90
+ .history-item:hover { border-color: var(--border2); }
91
+ .history-q { font-size: 13px; color: var(--text2); font-family: var(--mono); white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
92
+ .history-meta { font-size: 11px; color: var(--text3); font-family: var(--mono); margin-top: 3px; }
93
+
94
+ footer { position: relative; z-index: 1; text-align: center; padding: 20px; font-family: var(--mono); font-size: 11px; color: var(--text3); border-top: 1px solid var(--border); margin-top: auto; }
95
+ footer a { color: var(--accent2); text-decoration: none; }
96
+ </style>
97
+ </head>
98
+ <body>
99
+
100
+ <header>
101
+ <div class="logo">
102
+ <div class="logo-mark">λ</div>
103
+ <div>
104
+ <div class="logo-text">LLMOps RAG</div>
105
+ <div class="logo-sub">Llama 3.1 · QLoRA · Pinecone</div>
106
+ </div>
107
+ </div>
108
+ <div class="demo-pill">
109
+ <div class="demo-dot"></div>
110
+ static demo — simulated responses
111
+ </div>
112
+ </header>
113
+
114
+ <div class="notice">
115
+ ⚡ This is a static demo running in-browser. Responses are simulated from the actual model outputs.
116
+ The live API (FastAPI + Llama 3.1 8B) runs locally or on Azure — see the
117
+ <a href="https://github.com/MukulRay1603/Irminsul" target="_blank">GitHub repo</a> to run it yourself.
118
+ </div>
119
+
120
+ <main>
121
+ <div class="hero">
122
+ <h1>Ask your <span>fine-tuned</span> model</h1>
123
+ <p>Llama 3.1 8B · QLoRA exp2_lr2e-4_r16 · RTX 3060 · 4-bit NF4 · Pinecone RAG</p>
124
+ <div class="model-tag"><span>●</span> Retrieval-augmented · semantic search over Genshin corpus</div>
125
+ </div>
126
+
127
+ <div class="suggested">
128
+ <div class="suggested-label">try these</div>
129
+ <div class="suggested-chips">
130
+ <div class="chip" onclick="setQuery(this)">What weapons should Hu Tao use on a budget?</div>
131
+ <div class="chip" onclick="setQuery(this)">How did Venti become the Anemo Archon?</div>
132
+ <div class="chip" onclick="setQuery(this)">Compare Zhongli and Ei's ruling philosophies</div>
133
+ <div class="chip" onclick="setQuery(this)">How does the Vaporize reaction work?</div>
134
+ <div class="chip" onclick="setQuery(this)">What is Scaramouche's backstory?</div>
135
+ <div class="chip" onclick="setQuery(this)">Best team comp for Ayaka?</div>
136
+ </div>
137
+ </div>
138
+
139
+ <div class="query-box">
140
+ <div class="query-label">query</div>
141
+ <textarea id="query-input" placeholder="Ask anything about Genshin lore, builds, or mechanics..." rows="3"></textarea>
142
+ <div class="query-footer">
143
+ <div class="top-k-wrap">
144
+ <span>top_k</span>
145
+ <select id="top-k">
146
+ <option value="1">1</option>
147
+ <option value="2">2</option>
148
+ <option value="3" selected>3</option>
149
+ <option value="5">5</option>
150
+ </select>
151
+ <span style="color:var(--text3)">retrieved chunks</span>
152
+ </div>
153
+ <button class="send-btn" id="send-btn" onclick="submitQuery()">
154
+ Run query →
155
+ </button>
156
+ </div>
157
+ </div>
158
+
159
+ <div id="response-area">
160
+ <div class="response-card">
161
+ <div class="response-header">
162
+ <div class="response-label"><div class="dot"></div>response</div>
163
+ <div class="latency-tag" id="latency-tag">—</div>
164
+ </div>
165
+ <div id="response-body" class="response-body"></div>
166
+ <div class="sources-section">
167
+ <div class="sources-label">sources</div>
168
+ <div id="sources-list"></div>
169
+ </div>
170
+ </div>
171
+ </div>
172
+
173
+ <div class="history-section" id="history-section" style="display:none">
174
+ <div class="history-label">recent queries</div>
175
+ <div id="history-list"></div>
176
+ </div>
177
+ </main>
178
+
179
+ <footer>
180
+ static demo · <a href="https://github.com/MukulRay1603/Irminsul" target="_blank">github</a> · llmops-serve · Llama 3.1 8B QLoRA
181
+ </footer>
182
+
183
+ <script>
184
+ const MOCK_RESPONSES = {
185
+ "hu tao": {
186
+ answer: `For Hu Tao on a budget, Dragon's Bane is the strongest F2P option — it scales with Elemental Mastery and deals significant bonus damage on vaporized hits. White Tassel is the best 3-star alternative for pure Normal Attack scaling.
187
+
188
+ For artifacts, 4pc Crimson Witch of Flames is the goal. Prioritize CRIT Rate, CRIT DMG, HP%, and Elemental Mastery as substats.
189
+
190
+ Budget team: Hu Tao / Xingqiu / Sucrose / Noelle. Sucrose provides EM sharing and grouping; Noelle covers shields and healing so Hu Tao can stay at low HP for her passive bonus.`,
191
+ sources: ["docs/character_builds.md"],
192
+ latency_ms: 4821
193
+ },
194
+ "venti": {
195
+ answer: `Venti is the current vessel of Barbatos, the Anemo Archon. Barbatos was originally a minor wind spirit who befriended a nameless bard during the Archon War — a human who dreamed of liberating Mondstadt from the tyrant Decarabian, the God of Storms.
196
+
197
+ When that bard died in the uprising, Barbatos took on his appearance as a permanent tribute. The form Venti wears today is a memory of his closest friend. He fought in the Archon War roughly 2000 years ago and secured Mondstadt, governing through freedom rather than direct rule — he lets his people govern themselves.`,
198
+ sources: ["docs/characters_lore.md"],
199
+ latency_ms: 5103
200
+ },
201
+ "zhongli": {
202
+ answer: `Zhongli (Morax) built Liyue through contracts — literal divine agreements binding the adepti and humans to obligations of service and protection. His philosophy: a contract freely entered is the foundation of civilization. He governed actively for thousands of years alongside the adepti.
203
+
204
+ Ei (Raiden Shogun) adopted the philosophy of Eternity — preserving Inazuma exactly as it was to prevent further loss after watching her sister Makoto and countless friends die. She sealed herself in meditation and created the Vision Hunt Decree to freeze change entirely.
205
+
206
+ One governs through trust and agreement; the other through enforced stasis. Both are responses to loss, but opposite in method.`,
207
+ sources: ["docs/characters_lore.md", "docs/world_lore.md"],
208
+ latency_ms: 6240
209
+ },
210
+ "vaporize": {
211
+ answer: `Vaporize is one of the two amplifying reactions in Genshin — it multiplies the triggering hit directly rather than dealing separate damage.
212
+
213
+ Pyro hitting a Hydro-affected enemy = 1.5x Pyro damage multiplier.
214
+ Hydro hitting a Pyro-affected enemy = 2.0x Hydro damage multiplier.
215
+
216
+ Most Vaporize teams are built around the 1.5x Pyro trigger — characters like Hu Tao or Yoimiya repeatedly applying Pyro against a Hydro-maintained aura from Xingqiu or Yelan. Elemental Mastery further increases the multiplier. The 2.0x version requires Hydro to be the trigger, which is harder to sustain consistently.`,
217
+ sources: ["docs/elemental_mechanics.md"],
218
+ latency_ms: 3987
219
+ },
220
+ "scaramouche": {
221
+ answer: `Scaramouche was created by Ei as a prototype vessel for her soul — an artificial being discarded when she decided her later puppet was better suited. He was abandoned without purpose or identity.
222
+
223
+ He was found by a kind craftsman named Niwa who treated him warmly, but Scaramouche left before the relationship could develop, afraid of attachment. He was later manipulated by the Fatui and became the 6th Harbinger, using cruelty as armor against his core wound: he was made to be used and thrown away.
224
+
225
+ After his defeat in Sumeru, Nahida erased his existence from the Irminsul records to prevent paradox. He was reborn as the Wanderer — no memories, a new identity built from scratch.`,
226
+ sources: ["docs/characters_lore.md"],
227
+ latency_ms: 5562
228
+ },
229
+ "ayaka": {
230
+ answer: `Ayaka's premier team is Freeze: Ayaka / Kokomi / Kazuha / Shenhe.
231
+
232
+ Kokomi provides consistent Hydro application to maintain Freeze. Kazuha groups enemies and provides Cryo DMG% bonus via his A4 passive. Shenhe buffs Cryo damage with her Icy Quill stacks.
233
+
234
+ For weapons, Mistsplitter Reforged is best-in-slot. F2P players should use Amenoma Kageuchi — craftable in Inazuma, with an ER passive that helps maintain Burst uptime.
235
+
236
+ Artifacts: 4pc Blizzard Strayer is ideal. It provides +20% CRIT Rate against Cryo-affected enemies and a further +20% against Frozen ones (+40% total), effectively giving free CRIT Rate so you can invest more into CRIT DMG instead.`,
237
+ sources: ["docs/character_builds.md"],
238
+ latency_ms: 4654
239
+ }
240
+ };
241
+
242
+ const history = [];
243
+
244
+ function findMockResponse(query) {
245
+ const q = query.toLowerCase();
246
+ if (q.includes("hu tao") || q.includes("hutao")) return MOCK_RESPONSES["hu tao"];
247
+ if (q.includes("venti") || q.includes("barbatos") || q.includes("anemo archon")) return MOCK_RESPONSES["venti"];
248
+ if (q.includes("zhongli") || /\bei\b/.test(q) || q.includes("philosoph") || q.includes("ruling")) return MOCK_RESPONSES["zhongli"];
249
+ if (q.includes("vaporize") || q.includes("vapori")) return MOCK_RESPONSES["vaporize"];
250
+ if (q.includes("scaramouche") || q.includes("wanderer") || q.includes("backstory")) return MOCK_RESPONSES["scaramouche"];
251
+ if (q.includes("ayaka") || q.includes("kamisato")) return MOCK_RESPONSES["ayaka"];
252
+ return {
253
+ answer: `This demo runs simulated responses for a curated set of example queries. The live model (Llama 3.1 8B QLoRA fine-tuned) would retrieve relevant chunks from the Pinecone index and generate a grounded answer for any query.\n\nTry one of the suggested queries above, or clone the repo to run the full RAG pipeline locally.`,
254
+ sources: ["docs/character_builds.md", "docs/characters_lore.md"],
255
+ latency_ms: Math.floor(Math.random() * 2000) + 3000
256
+ };
257
+ }
258
+
259
+ function setQuery(el) {
260
+ document.getElementById('query-input').value = el.textContent;
261
+ document.getElementById('query-input').focus();
262
+ }
263
+
264
+ async function submitQuery() {
265
+ const query = document.getElementById('query-input').value.trim();
266
+ if (!query) return;
267
+
268
+ const btn = document.getElementById('send-btn');
269
+ const responseArea = document.getElementById('response-area');
270
+ const responseBody = document.getElementById('response-body');
271
+ const latencyTag = document.getElementById('latency-tag');
272
+ const sourcesList = document.getElementById('sources-list');
273
+
274
+ responseArea.className = 'visible';
275
+ responseArea.style.display = 'block';
276
+ btn.disabled = true;
277
+ latencyTag.textContent = '—';
278
+ sourcesList.innerHTML = '';
279
+
280
+ responseBody.innerHTML = `<div class="thinking"><div class="thinking-dots"><span></span><span></span><span></span></div>generating response...</div>`;
281
+
282
+ const mock = findMockResponse(query);
283
+ const delay = mock.latency_ms;
284
+
285
+ await new Promise(r => setTimeout(r, Math.min(delay * 0.3, 2200)));
286
+
287
+ responseBody.textContent = mock.answer;
288
+ const ms = mock.latency_ms;
289
+ latencyTag.textContent = ms >= 1000 ? `${(ms/1000).toFixed(1)}s` : `${ms}ms`;
290
+ sourcesList.innerHTML = mock.sources.map(s => {
291
+ const name = s.split(/[\\/]/).pop();
292
+ return `<span class="source-chip">📄 ${name}</span>`;
293
+ }).join('');
294
+
295
+ history.unshift({ query, latency_ms: ms, time: new Date().toLocaleTimeString() });
296
+ if (history.length > 5) history.pop();
297
+
298
+ const section = document.getElementById('history-section');
299
+ const list = document.getElementById('history-list');
300
+ section.style.display = 'block';
301
+ list.innerHTML = history.map((h, i) => `
302
+ <div class="history-item" onclick="rerun(${i})">
303
+ <div class="history-q">${h.query.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')}</div>
304
+ <div class="history-meta">${h.time} · ${(h.latency_ms/1000).toFixed(1)}s</div>
305
+ </div>
306
+ `).join('');
307
+
308
+ btn.disabled = false;
309
+ }
310
+
311
+ function rerun(i) {
312
+ document.getElementById('query-input').value = history[i].query;
313
+ submitQuery();
314
+ }
315
+
316
+ document.getElementById('query-input').addEventListener('keydown', e => {
317
+ if (e.key === 'Enter' && (e.ctrlKey || e.metaKey)) { e.preventDefault(); submitQuery(); }
318
+ });
319
+ </script>
320
+ </body>
321
+ </html>
docs/elemental_mechanics.md ADDED
@@ -0,0 +1,207 @@
1
+ # Genshin Impact — Elemental Mechanics, Reactions & Damage System
2
+
3
+ ---
4
+
5
+ ## The Seven Elements
6
+
7
+ Every character and enemy in Genshin deals one or more of seven elemental types: Pyro, Hydro, Cryo, Electro, Anemo, Geo, and Dendro. Elements interact with each other through reactions, which are the fundamental engine of team building and damage optimization.
8
+
9
+ ---
10
+
11
+ ## Elemental Reactions
12
+
13
+ ### Vaporize (Pyro + Hydro)
14
+ One of the two "amplifying" reactions — it multiplies the triggering hit directly.
15
+ - Pyro hitting Hydro-affected enemy = 1.5x Pyro damage multiplier
16
+ - Hydro hitting Pyro-affected enemy = 2.0x Hydro damage multiplier
17
+
18
+ The 2.0x version (Hydro triggering on Pyro aura) is generally harder to achieve consistently. Most Vaporize teams are built around the 1.5x Pyro trigger — Pyro DPS characters like Hu Tao or Yoimiya repeatedly applying Pyro against a Hydro-maintained aura. Elemental Mastery increases the Vaporize multiplier further.
19
+
20
+ **Key teams that abuse Vaporize:** Hu Tao/Yelan/Xingqiu, Yoimiya/Yelan, Childe/Xiangling, National Team
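As a rough illustration of how Elemental Mastery feeds into the amplifying multipliers above, the Python sketch below uses the EM bonus formula commonly cited in community damage guides, `2.78 * EM / (EM + 1400)`. That formula and the function name `amplifying_multiplier` are assumptions brought in for illustration, not something stated in this document, so treat the numbers as approximate.

```python
def amplifying_multiplier(base_multiplier: float,
                          elemental_mastery: float,
                          reaction_bonus: float = 0.0) -> float:
    """Estimate the Vaporize (or Melt) multiplier for a given EM.

    base_multiplier: 1.5 or 2.0, depending on which element triggers.
    elemental_mastery: the triggering character's EM.
    reaction_bonus: extra reaction DMG as a fraction (assumed parameter,
                    e.g. 0.15 for a 15% set bonus).
    """
    # Community-cited EM bonus for amplifying reactions (assumption).
    em_bonus = 2.78 * elemental_mastery / (elemental_mastery + 1400)
    return base_multiplier * (1 + em_bonus + reaction_bonus)


if __name__ == "__main__":
    for em in (0, 100, 300):
        print(em, round(amplifying_multiplier(1.5, em), 3))
```

Under this formula the EM bonus has diminishing returns: each additional point of EM adds less to the multiplier than the previous one.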
21
+
22
+ ---
23
+
24
+ ### Melt (Pyro + Cryo)
25
+ The other amplifying reaction.
26
+ - Pyro hitting Cryo-affected enemy = 2.0x Pyro damage
27
+ - Cryo hitting Pyro-affected enemy = 1.5x Cryo damage
28
+
29
+ The 1.5x version is what Ganyu Melt relies on: Ganyu's Charged Attacks are all Cryo, so off-field Pyro application from Xiangling maintains the Pyro aura and Ganyu's Cryo attacks trigger 1.5x Melt consistently. Hu Tao Melt works the other way around: Cryo maintains the aura and her Pyro hits trigger the 2.0x version.
30
+
31
+ ---
32
+
33
+ ### Freeze (Hydro + Cryo)
34
+ Freezes the target, preventing movement and most actions for a duration. Frozen enemies can be Shattered by Heavy attacks (dealing Physical damage). Frozen enemies are also easy to hit.
35
+
36
+ Critically, Frozen enemies maintain both the Hydro and Cryo aura simultaneously. This means:
37
+ - Blizzard Strayer gives +20% CRIT Rate against Cryo-affected enemies and a further +20% against Frozen enemies (+40% total)
38
+ - Freeze teams effectively have permanent CRIT Rate bonus, allowing investment to shift to CRIT DMG
39
+ - Freeze is a soft CC that completely trivializes mobile enemies
40
+
41
+ Freeze duration scales with the strength of the aura applied. Strong Hydro/Cryo application from multiple sources keeps enemies perma-frozen.
42
+
43
+ ---
44
+
45
+ ### Superconduct (Electro + Cryo)
46
+ Deals minor AoE Cryo damage and reduces affected enemies' Physical RES by 40% for 12 seconds. Its damage is negligible; its value is the Physical resistance shred.
47
+
48
+ Used exclusively in Physical DPS teams (Eula, Razor, physical Fischl) to amplify Physical damage. The CRIT Rate bonus for Cryo resonance and Superconduct together are why Cryo batteries (Rosaria, Kaeya) appear in physical DPS teams.
49
+
50
+ ---
51
+
52
+ ### Overloaded (Pyro + Electro)
53
+ Explodes, dealing AoE Pyro-based damage and knocking enemies back. The knockback is significant — it pushes lighter enemies away, which can be disruptive to melee DPS characters who need to stay close.
54
+
55
+ Overloaded damage scales with Elemental Mastery of the triggering character. High EM builds can make Overloaded deal meaningful damage, but the knockback problem limits its use in sustained DPS compositions. Thoma-triggered Burgeon teams and some Fischl/Pyro teams utilize it.
56
+
57
+ ---
58
+
59
+ ### Electrocharged (Electro + Hydro)
60
+ Deals periodic Electro damage to all Electro-affected enemies connected through Hydro. The unique property: Electrocharged does not consume either aura — both Electro and Hydro coexist on the target simultaneously, allowing further reactions from other elements.
61
+
62
+ This makes Electrocharged a unique reaction that can enable other reactions simultaneously. It also bounces damage between connected Hydro-affected enemies — in groups of wet enemies, Electrocharged chains across all of them.
63
+
64
+ ---
65
+
66
+ ### Swirl (Anemo + Pyro/Hydro/Cryo/Electro)
67
+ Anemo cannot react with Geo or Dendro. Against all other elements, Anemo creates Swirl, which spreads the absorbed element to nearby enemies and deals bonus elemental damage. Swirl damage scales only with Elemental Mastery and character level, not with ATK or weapon stats.
68
+
69
+ The critical secondary effect: Viridescent Venerer (VV) 4pc set — the standard Anemo support artifact set — reduces enemy RES to the Swirled element by 40% for 10 seconds. This is why Kazuha, Sucrose, and Venti always run 4pc VV in support roles. The 40% RES shred is worth more damage amplification than most direct damage buffs.
70
+
71
+ Sucrose's passives add to this: triggering a Swirl grants 50 EM to party members of the Swirled element, and hitting an enemy with her Skill or Burst shares 20% of her own EM with the rest of the party.
72
+
73
+ ---
74
+
75
+ ### Crystallize (Geo + Pyro/Hydro/Cryo/Electro)
76
+ Creates a shard when Geo contacts another element (except Anemo or Dendro). Picking up the shard gives a shield that absorbs the crystallized element. The shield strength scales with the character's Elemental Mastery and level.
77
+
78
+ Crystallize is primarily a defensive reaction. Geo characters do not directly amplify elemental reactions. Geo's contribution is the universal shield plus RES shred: Zhongli's shield lowers the RES of nearby enemies, and Geo resonance (double Geo) gives +15% shield strength, +15% DMG while shielded, and a Geo RES reduction on enemies hit while shielded.
79
+
80
+ ---
81
+
82
+ ### Quicken / Aggravate / Spread (Dendro + Electro)
83
+ The Dendro-Electro interaction introduced in Sumeru is fundamentally different from other reactions.
84
+
85
+ Quicken is the base reaction — Dendro + Electro creates a Quicken status on the enemy. From Quicken, two additional reactions branch:
86
+ - **Aggravate:** Electro hitting a Quickened enemy — deals bonus Electro damage (fixed bonus scaling with EM and level, added on top of the hit)
87
+ - **Spread:** Dendro hitting a Quickened enemy — deals bonus Dendro damage
88
+
89
+ Neither Aggravate nor Spread consumes the Quicken status. Both can proc repeatedly as long as the enemy remains Quickened, which makes Dendro-Electro teams sustain high damage over time without the reaction-application timing management that Vaporize/Melt require.
90
+
91
+ Key characters: Nahida (Spread trigger), Cyno (Aggravate main DPS), Fischl (Aggravate off-field), Raiden (Aggravate burst), Keqing (Aggravate main DPS)
92
+
93
+ ---
94
+
95
+ ### Bloom / Hyperbloom / Burgeon (Dendro + Hydro + Electro/Pyro)
96
+ Bloom is the Dendro + Hydro reaction, which creates Dendro Cores — explosive seeds that detonate after a short time, dealing AoE Dendro damage. Bloom damage scales with EM and level.
97
+
98
+ Dendro Cores can be further reacted:
99
+ - **Hyperbloom:** Electro hitting a Dendro Core — the core homes in on the nearest enemy and deals much higher Dendro damage. This is one of the highest DPS reactions in the game. Scales with the Electro character's EM. Raiden Shogun at high EM is a premier Hyperbloom trigger.
100
+ - **Burgeon:** Pyro hitting a Dendro Core — the core explodes in AoE Pyro-tinged Dendro damage. Scales with Pyro character's EM. The AoE can hit your own team, which requires careful HP management.
101
+
102
+ Pure Bloom teams let cores explode naturally — effective but less optimized than Hyperbloom. Hyperbloom with Raiden or Fischl as trigger, Nahida for Dendro, and Xingqiu/Kokomi for Hydro is one of the strongest reaction-based team compositions.
103
+
104
+ ---
105
+
106
+ ## Internal Cooldown (ICD)
107
+
108
+ ICD is one of the most important and least-explained mechanics in the game. When a character applies an element, there is a hidden cooldown before the same source can apply that element again. This prevents infinite reaction triggering.
109
+
110
+ The standard ICD rule: a given source applies its element on its first hit, then again only once 2.5 seconds have passed or on every third hit after that (hits 1, 4, 7, ...), whichever comes first.
111
+
112
+ **Why this matters for team building:**
113
+ - Xingqiu's Rain Swords have ICD between slashes — not every slash applies Hydro. This means rapid Normal ATK DPS like Hu Tao can't Vaporize every hit; approximately every 3rd hit triggers a reaction.
114
+ - Characters with no ICD on certain abilities (some AoE Bursts, specific skills) can apply elements much more freely.
115
+ - Kazuha Swirling an element absorbs the element and applies it to all nearby enemies — this has its own ICD distinct from the source character.
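The standard ICD rule described above can be sketched as a small simulation. This is a simplified model under stated assumptions: one hit source, a 2.5 second window that starts on the first hit and restarts on the first hit after it expires, and a hit counter that resets with the window. The function name `applies_element` and its parameters are hypothetical, and real abilities can use different ICD values.

```python
def applies_element(hit_times, window=2.5, hits_per_cycle=3):
    """Return, for each hit time (seconds, ascending), whether that hit
    applies its element under a simplified standard-ICD model."""
    results = []
    window_start = None  # time at which the current ICD window began
    counter = 0          # hits landed since the window began

    for t in hit_times:
        # Timer rule: once the window has expired, the next hit
        # starts a fresh window and resets the hit counter.
        if window_start is None or t - window_start >= window:
            window_start = t
            counter = 0
        # Hit-count rule: hits 1, 4, 7, ... of a window apply the element.
        results.append(counter % hits_per_cycle == 0)
        counter += 1
    return results


if __name__ == "__main__":
    # Six hits: five quick ones, then one after the window has expired.
    print(applies_element([0.0, 0.5, 1.0, 1.5, 2.0, 3.0]))
```

In this model a fast attacker benefits mostly from the hit-count rule, while a slow attacker (more than 2.5 seconds between hits) applies the element on every hit.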
116
+
117
+ ---
118
+
119
+ ## Damage Formula Overview
120
+
121
+ Genshin's damage calculation follows this general structure:
122
+
123
+ **Base Damage = Scaling Stat × Ability Scaling %**
124
+
125
+ Scaling stat is typically ATK, but some abilities scale with HP (Hu Tao, Zhongli shield, Itto, Yelan) or DEF (Noelle, Itto burst, Albedo).
126
+
127
+ **Total Damage = Base Damage × (1 + DMG Bonus) × CRIT multiplier × Enemy DEF multiplier × Enemy RES multiplier × Reaction multiplier**
128
+
129
+ Breaking this down:
130
+ - **DMG Bonus:** Sum of all elemental/physical DMG% bonuses from artifacts, weapons, and passives
131
+ - **CRIT Multiplier:** (1 + CRIT DMG%) on a CRIT hit, 1 on non-crit; averaged to (1 + CRIT Rate × CRIT DMG%) for theoretical average
132
+ - **Enemy DEF Multiplier:** Reduces based on attacker level vs. enemy level and DEF shred
133
+ - **Enemy RES Multiplier:** Reduces based on enemy elemental resistance; VV and Zhongli reduce this
134
+ - **Reaction Multiplier:** Vaporize/Melt multiplier, or additive bonus damage for Aggravate/Spread
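The general structure above can be written out as a short Python sketch. It is only an illustration of the multiplication order: the `Hit` class, its field names, and the idea of passing the DEF and RES multipliers in as precomputed fractions are assumptions made for the example, and the exact in-game DEF and RES formulas are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    scaling_stat: float          # ATK, HP, or DEF, depending on the ability
    ability_scaling: float       # talent multiplier as a fraction (1.5 = 150%)
    dmg_bonus: float             # summed DMG% bonuses as a fraction
    crit_rate: float             # 0.0 to 1.0
    crit_dmg: float              # CRIT DMG% as a fraction (1.0 = 100%)
    def_multiplier: float        # assumed precomputed, between 0 and 1
    res_multiplier: float        # assumed precomputed
    reaction_multiplier: float = 1.0  # e.g. 1.5 or 2.0 for Vaporize/Melt


def average_damage(hit: Hit) -> float:
    """Average damage per hit using the averaged CRIT multiplier."""
    base = hit.scaling_stat * hit.ability_scaling
    # CRIT Rate above 100% gives no further benefit, so clamp it.
    rate = min(max(hit.crit_rate, 0.0), 1.0)
    crit = 1.0 + rate * hit.crit_dmg
    return (base * (1.0 + hit.dmg_bonus) * crit
            * hit.def_multiplier * hit.res_multiplier
            * hit.reaction_multiplier)


if __name__ == "__main__":
    example = Hit(2000, 1.0, 0.5, 0.5, 1.0, 0.5, 0.9, 1.5)
    print(average_damage(example))
```

Because every term is multiplied, a buff is worth more when it lands in a category the character has little of, which is why stacking a single type of bonus gives diminishing relative returns.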
135
+
136
+ ### The CRIT Ratio — the 1:2 Rule
137
+ Optimal CRIT investment follows approximately a 1:2 ratio of CRIT Rate to CRIT DMG. 50% Rate / 100% DMG, 70% Rate / 140% DMG, etc. This maximizes average damage output. Deviating heavily in either direction (100% Rate / 50% DMG, for instance) is suboptimal.
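A quick numerical check shows where the 1:2 ratio comes from. The sketch assumes a fixed "budget" in which one unit buys either 1% CRIT Rate or 2% CRIT DMG, mirroring the fact that CRIT DMG rolls on artifacts are about twice the size of CRIT Rate rolls. The budget model and the function names are assumptions for illustration only.

```python
def average_crit_multiplier(crit_rate: float, crit_dmg: float) -> float:
    """Averaged CRIT multiplier: 1 + CRIT Rate x CRIT DMG (rate capped at 100%)."""
    return 1.0 + min(crit_rate, 1.0) * crit_dmg


def best_split(budget: float, steps: int = 1000):
    """Search for the CRIT Rate / CRIT DMG split that maximizes the
    average multiplier, given 2 * crit_rate + crit_dmg == budget."""
    best = (0.0, budget, 1.0)
    for i in range(steps + 1):
        # Spending the whole budget on rate yields budget / 2 CRIT Rate.
        crit_rate = (budget / 2) * i / steps
        crit_dmg = budget - 2 * crit_rate
        value = average_crit_multiplier(crit_rate, crit_dmg)
        if value > best[2]:
            best = (crit_rate, crit_dmg, value)
    return best


if __name__ == "__main__":
    rate, dmg, value = best_split(2.0)
    print(f"rate={rate:.2f} dmg={dmg:.2f} multiplier={value:.3f}")
```

With a budget of 2.0 the search lands on 50% CRIT Rate and 100% CRIT DMG, the first example pair given above. A guaranteed-CRIT source breaks the assumption behind the model, since extra CRIT Rate then adds nothing.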
138
+
139
+ Characters with guaranteed CRIT sources (Ganyu's Charged Attack Level 2 always CRITs, Hu Tao's Blood Blossom CRITs under certain conditions) can shift investment toward CRIT DMG.
140
+
141
+ ---
142
+
143
+ ## Elemental Application Strength (Gauge Theory Basics)
144
+
145
+ Elements in Genshin have invisible "gauge" values — stronger applications leave a bigger elemental aura that takes more of the opposing element to remove. Reactions consume specific amounts of gauge.
146
+
147
+ **Practical implications:**
148
+ - Hydro application from Xingqiu is moderate per slash. Pyro hits from Hu Tao consume the Hydro aura partially per hit, which is why even with ICD, Vaporize procs regularly.
149
+ - Bennett's Burst applies Pyro AoE repeatedly — strong Pyro aura, enough to enable off-field Melt reactions.
150
+ - Freeze duration depends on the combined strength of the Hydro and Cryo applications — strong applicators like Kokomi (Hydro) + Ayaka (Cryo Dash) maintain long Freeze durations.
151
+
152
+ ---
153
+
154
+ ## Character Stats — What Actually Matters
155
+
156
+ ### ATK vs HP vs DEF Scaling
157
+ Most characters scale with ATK. HP scalers (Hu Tao, Zhongli, Yelan, Furina) specifically convert their HP stat into damage or utility, making HP% a primary stat for them instead of ATK%. DEF scalers (Noelle, Itto, Albedo's Flower) make DEF% valuable for damage.
158
+
159
+ ### The Artifact Main Stat Priority
160
+ For DPS characters:
161
+ - Sands: ATK% (usually) or ER% if energy-hungry, or EM for reaction-focused characters
162
+ - Goblet: Elemental DMG Bonus (almost always, unless Physical DPS)
163
+ - Circlet: CRIT Rate or CRIT DMG (match to where you're deficient)
164
+
165
+ For supports:
166
+ - Sands: ER% (for Burst uptime) or HP%/ATK% depending on scaling
167
+ - Goblet: HP%, DEF%, or ATK% for shields/healing
168
+ - Circlet: Healing Bonus% for healers, or HP%/ATK% for shields
169
+
170
+ ---
171
+
172
+ ## Constellation System
173
+
174
+ Constellations are unlocked with duplicate copies of a character: obtaining the character again through wishes unlocks constellation levels C1 through C6. For some characters, this is where significant power spikes hide.
175
+
176
+ ### Notable Constellations
177
+ - **Bennett C1:** Removes the HP threshold from his Burst ATK buff (normally the buff applies only while the character is above 70% HP), a significant quality-of-life gain
178
+ - **Bennett C6:** Converts sword/claymore/polearm characters in field to Pyro — potentially destructive, not universally desired
179
+ - **Xingqiu C2:** Extends Rain Sword duration by 3 seconds and reduces the Hydro RES of enemies hit by sword rain by 15%, a major DPS increase
180
+ - **Kazuha C2:** Grants 200 EM after using Burst — massive personal DPS increase, makes him competitive as a main DPS
181
+ - **Fischl C6:** Oz joins the active character's Normal Attacks with an extra Electro hit, a large increase to her off-field output
182
+ - **Xiangling C4:** Extends Pyronado duration by 40% — dramatically improves her consistency
183
+ - **Ganyu C1:** Charged Attack hits reduce the enemy's Cryo RES by 15% and restore 2 Energy, a notable damage and comfort improvement
184
+ - **Raiden C2:** Resistance to interruption during Burst and significantly higher Burst DMG — major improvement
185
+ - **Nahida C1:** When her Burst field is deployed, the party counts as having one more Pyro, Electro, and Hydro member, strengthening the Burst's effects
186
+ - **Hu Tao C1:** Allows Charged Attack use without stamina cost during E, removing a significant stamina limitation
187
+
188
+ ---
189
+
190
+ ## Spiral Abyss — Endgame Content
191
+
192
+ The Spiral Abyss is the primary endgame challenge, consisting of 12 floors (floors 9-12 are the rotating difficult content). Each floor has three chambers, each with an enemy lineup and a time limit. Clearing within the time limit yields stars (up to 3 per chamber, 9 per floor). Full completion (36 stars across floors 9-12) requires two separate teams, since floors 9-12 have a "split" mechanic: you must use two different teams for the two halves of each chamber.
193
+
194
+ This is why virtually all endgame discussion involves two teams. Building two strong teams that don't rely on the same key supports (you cannot use Bennett on both sides simultaneously, for instance) is the core strategic challenge of progression in Genshin.
195
+
196
+ Floors 11-12 refresh every two weeks with new enemy lineups and modifiers. The current abyss cycle often features:
197
+ - Buff cards that enhance certain reaction types or elements
198
+ - Enemy weaknesses that reward specific element applications
199
+ - Boss variants with high HP that favor sustained DPS teams
200
+
201
+ ### Spiral Abyss Team Archetypes
202
+ - **Hypercarry:** One DPS supported by three dedicated supports — all buffs point at one character
203
+ - **Dual DPS:** Two DPS characters alternating field time, supported by shared utilities
204
+ - **Reaction team:** Built around consistent reaction triggering (Vaporize, Quickbloom, Freeze) rather than a single DPS carry
205
+ - **Mono-element:** Single element team using resonance bonuses and elemental DMG buffs (Mono Pyro with Bennett/Xiangling/Kazuha)
206
+
207
+ ---
docs/world_lore.md ADDED
@@ -0,0 +1,137 @@
1
+ # Genshin Impact — World Lore, Nations & the Overarching Mystery
2
+
3
+ ---
4
+
5
+ ## The World of Teyvat
6
+
7
+ Teyvat is a world governed by seven elements — Anemo, Geo, Electro, Dendro, Hydro, Pyro, and Cryo — each overseen by one of the Seven Archons, divine beings who answer to Celestia, the heavenly realm visible in the sky above Teyvat. Celestia is not merely symbolic — it is a physical location in the sky, with structures visible at all times, and the source of divine authority that the current world order flows from.
8
+
9
+ The world operates on a system that is increasingly revealed, as the story progresses, to be a kind of controlled environment. The Archons receive their power from Celestia. The Visions (divine gifts that give humans elemental power) are granted by unknown divine will. The history of Teyvat has been shaped by catastrophes that the current inhabitants either do not know about or have been made to forget.
10
+
11
+ ---
12
+
13
+ ## The Seven Nations
14
+
15
+ ### Mondstadt — Nation of Freedom (Anemo)
16
+ Mondstadt is a city-state in a temperate region, governed by the Knights of Favonius. The old nobility (historically led by the Lawrence Clan) was overthrown centuries ago when the Aristocracy collapsed. The Anemo Archon Barbatos is mostly absent, trusting humans to govern themselves.
17
+
18
+ Mondstadt's culture is built around wine, music, and the ideal of freedom. The city itself is on an island in the middle of a lake, designed to be defensible. The surrounding region (Stormterror's Lair, Dragonspine, Wolvendom) holds many of the game's early exploration areas. Dragonspine is a permanently frozen mountain containing remnants of an ancient civilization that predates Mondstadt's current culture.
19
+
20
+ The Knights of Favonius are the official military and law enforcement body, though they are notoriously inefficient by reputation (this is a recurring joke among Mondstadt characters). Diluc and other independent actors end up handling many actual threats.
21
+
22
+ ---
23
+
24
+ ### Liyue — Nation of Contracts (Geo)
25
+ Liyue Harbor is the largest trading port in Teyvat and the economic center of the world. The Liyue Qixing (seven stars) are the seven merchant lords who govern the city's trade and administrative functions. Historically, the Geo Archon Morax and the adepti (divine creatures) worked alongside humans to protect Liyue from gods and monsters.
26
+
27
+ The adepti are unique to Liyue — divine beings who bound themselves to Morax through contracts and have protected Liyue for thousands of years. They live in the mountains (particularly Jueyun Karst) and rarely interact with humans directly. Major adepti include Xiao, Cloud Retainer (who mentors Ganyu and Shenhe), Mountain Shaper, and Moon Carver.
28
+
29
+ Liyue's aesthetic is heavily inspired by traditional Chinese architecture and culture. The Harbor is a massive city; beyond it lies an enormous continent of varied terrain including mountains, rivers, and the underground crystal mines.
30
+
31
+ The Rex Lapis mythology is central to Liyue culture — annual celebrations like the Lantern Rite honor the Geo Archon's protection. When Zhongli stages his own death, it sends the entire city into mourning and near-crisis.
32
+
33
+ ---
34
+
35
+ ### Inazuma — Nation of Eternity (Electro)
36
+ Inazuma is an archipelago ruled by the Raiden Shogun, whose Vision Hunt Decree is already in effect when the Traveler arrives. The nation is under a sakoku-style isolation policy: no one can enter or leave. The Traveler must infiltrate by surviving a sea storm.
37
+
38
+ Inazuma's society is in conflict between the Shogunate (supporters of the Vision Hunt Decree and eternal stasis) and the Resistance (fighters on Watatsumi Island who oppose the Shogunate). Watatsumi Island worships Orobashi, an ancient serpent god who was killed by the Raiden Shogun, and holds resentment toward the Shogunate as a result.
39
+
40
+ The culture is heavily inspired by Edo-period Japan. Key factions include the Tenryou Commission (military and security affairs, led by the Kujou Clan), the Yashiro Commission (ceremonial and cultural affairs, led by the Kamisato Clan), and the Kanjou Commission (finance and foreign trade, led by the Hiiragi Clan). The Grand Narukami Shrine is headed separately by Yae Miko.
41
+
42
+ The resolution of the Vision Hunt Decree and Inazuma's opening to the world create significant cultural and political changes, which play out in the nation's story quests and world quests.
43
+
44
+ ---
45
+
46
+ ### Sumeru — Nation of Wisdom (Dendro)
47
+ Sumeru is a nation divided between a massive rainforest and a vast desert, unified under the Akademiya, the world's premier institution of knowledge and learning. The Akademiya controls the Akasha System — a divine network that gives citizens access to recorded knowledge through earpiece devices called Akasha Terminals.
48
+
49
+ Sumeru's conflict is fundamentally about the relationship between knowledge, power, and wisdom. The Akademiya controls what knowledge is accessible, suppresses research that challenges its worldview, and has confined the Dendro Archon for being inconvenient. The Sages who run the Akademiya are academics who have become administrators and enforcers.
50
+
51
+ The desert region of Sumeru contains remnants of an ancient civilization — King Deshret's empire — that predates the current world order. The lore of Deshret, the Red King who sought forbidden knowledge, and his relationship with the original Dendro Archon Rukkhadevata is the foundation of Sumeru's deeper mysteries. The Irminsul, the divine tree that records all of Teyvat's memories, is rooted in Sumeru.
52
+
53
+ Sumeru's culture is inspired by South Asian and Middle Eastern aesthetics, with the rainforest drawing from South Asian jungle architecture and the desert from ancient Egyptian/Persian motifs.
54
+
55
+ ---
56
+
57
+ ### Fontaine — Nation of Justice (Hydro)
58
+ Fontaine is a steampunk-inspired nation built around waterways, advanced technology, and an elaborate court system. Every dispute in Fontaine, from minor disagreements to accusations of murder, is settled through public trials in the Court of Fontaine. The Hydro Archon presides as a theatrical judge figure.
59
+
60
+ The key lore element of Fontaine is the ancient prophecy: all people of Fontaine carry a divine curse in their blood that will one day cause them to dissolve into the Primordial Sea. This prophecy has shaped the Hydro Archon's five-hundred-year plan (detailed in the character lore section under Furina).
61
+
62
+ Fontaine is also notable for having the most technologically advanced infrastructure in Teyvat — hydroelectric power, automatons, submarines, and the Fortress of Meropide underwater prison. The Spina di Rosula and the Hydro Dragon mythology form important sub-plots.
63
+
64
+ The resolution of the Fontaine Archon quest involves breaking the ancient prophecy, the sacrifice of the divine consciousness Focalors, and the transformation of Fontaine's people through Neuvillette's tears flooding and cleansing the land.
65
+
66
+ ---
67
+
68
+ ### Natlan — Nation of War (Pyro)
69
+ Natlan is the most recent major nation added to the game. It is a nation built around constant warfare: tribes compete in ritual combat, and strength is the primary cultural value. The Pyro Archon, Mavuika, is a warrior who leads from the front rather than governing from a throne.
70
+
71
+ Natlan's unique mechanic is the Phlogiston system: Natlan characters draw on a special energy resource for movement (dashing, gliding speed) alongside their normal combat abilities. The nation has a strong Mesoamerican aesthetic influence.
72
+
73
+ The major conflict involves the Abyss directly — the shadows attacking Natlan are a direct incursion by Abyssal forces, and the Pyro Archon must rally the tribes to defend. The Traveler's sibling is heavily involved in the Natlan storyline.
74
+
75
+ ---
76
+
77
+ ### Snezhnaya — Nation of the Tsaritsa (Cryo)
78
+ Snezhnaya is the home nation of the Fatui and the Tsaritsa, the Cryo Archon. It has not yet been fully explored in the game as of current content. The Tsaritsa is described as a god who has lost her love for her people and now pursues a revolution against Celestia — she is collecting the other Archons' Gnoses as part of a long-term plan.
79
+
80
+ The Fatui are Snezhnaya's diplomatic and military arm — officially an organization of diplomats with diplomatic immunity, in practice a global network of intelligence, coercion, and military capability. The Eleven Harbingers are the elite.
81
+
82
+ The Tsaritsa's motives are presented as sympathetic in context — she is acting against the celestial order, which the game increasingly frames as not benevolent. Whether her methods are justified is one of the open questions of the overarching plot.
83
+
84
+ ---
85
+
86
+ ## The Overarching Mystery — Celestia, the Abyss & Khaenri'ah
87
+
88
+ ### Khaenri'ah — The Nation Without a God
89
+ Five hundred years before the events of the game, a cataclysm destroyed Khaenri'ah, a powerful underground nation that had no Archon and no divine protection. Unlike the seven nations of Teyvat, Khaenri'ah developed independently of the gods, achieving remarkable alchemical and technological advancement.
90
+
91
+ The cataclysm was caused — or at least enabled — by a being named Gold (Rhinedottir), an alchemist who created both the Abyss Order (through corrupting Khaenri'ah's king, Irmin, into the Eclipse King) and various synthetic life forms including Albedo.
92
+
93
+ After the cataclysm, Celestia cursed the survivors of Khaenri'ah: nobility were turned into monsters (the ruin golems' creators, the Khaenri'ah bloodline), and commoners were turned into the Hilichurls — the common enemies of the game, who are not mindless monsters but cursed humans endlessly wandering, unable to die and unable to remember who they were.
94
+
95
+ The Abyss Order, encountered throughout the game as antagonists, is the remnant of Khaenri'ah's people and ideology fighting against the current world order. The Traveler's sibling leads the Abyss Order, which is the central unresolved mystery of the main story.
96
+
97
+ ---
98
+
99
+ ### Celestia and the True Nature of Teyvat
100
+ Celestia is presented with increasing unease as the story progresses. Several pieces of lore suggest it is not a benevolent divine realm:
101
+
102
+ The Archons were empowered by Celestia and answer to it. However, the nature of the divine authority is coercive — Visions are distributed by Celestia's will, and the Gnoses are Celestia's means of controlling the Archons.
103
+
104
+ The world of Teyvat itself may be a fabricated reality. Several characters and lore items suggest the sky is a painted ceiling, that the world was created or shaped by beings called the Primordial One and the Second Who Came, and that the gods the Archons answer to may themselves be secondary to even higher divine forces.
105
+
106
+ The Sustainer of Heavenly Principles, the being who defeated the Traveler at the start of the game and separated them from their sibling, appears to be enforcing a status quo — keeping the world in its current form and preventing change that might reveal or challenge what Teyvat actually is.
107
+
108
+ The recurring theme of the main story is that the history of Teyvat has been erased, modified, or controlled. The Irminsul can be edited. Characters who die can be written out of history. The gods of Teyvat may themselves be pawns of something older.
109
+
110
+ ---
111
+
112
+ ### The Abyss and the Traveler's Sibling
113
+ The Traveler arrives on Teyvat searching for their sibling, who left to investigate the world and ended up leading the Abyss Order. The sibling's stated goal is to destroy the current divine order that destroyed Khaenri'ah and cursed its people. Their methods involve working with Abyssal corruption, which puts them at odds with the Traveler even if their goals are understandable.
114
+
115
+ The Abyss itself is a realm beneath Teyvat — a zone of corruption and monsters. Its connection to Khaenri'ah's transformation after the cataclysm, the origin of the monsters that the Yakshas spent centuries fighting, and the deeper nature of Abyssal corruption are all threads that have been slowly unraveled through the story quests and exploration content.
116
+
117
+ ---
118
+
119
+ ### The Visions — Divine Gifts or Divine Control?
120
+ Visions are elemental gems that grant humans elemental power. They are awarded by the gods (Celestia) based on strong human ambition and determined will. However, several characters question whether this is a gift or a tether. Raiden Shogun's Vision Hunt Decree was based partly on the idea that human ambition — and the Visions that amplify it — lead to change and death.
121
+
122
+ The Gnoses, by contrast, are the Archons' divine power sources — physical objects that represent their authority from Celestia. The Tsaritsa is collecting them, which would theoretically strip the Archons of their formal status in Celestia's hierarchy — which may be exactly what she wants.
123
+
124
+ ---
125
+
126
+ ## Historical Events
127
+
128
+ ### The Archon War (~2000 years ago)
129
+ Before the seven nations took their current form, Teyvat was ruled by countless gods competing for territory and worshippers. The Archon War was the period of conflict during which the current Seven came to power by defeating rival gods. Morax and Barbatos both participated. The nation of Mondstadt emerged from the liberation of Decarabian's storm-city. Liyue's current borders were established by Morax's victory over rival gods and monsters.
130
+
131
+ ### The Cataclysm (~500 years ago)
132
+ The single most significant event in recent Teyvat history. Khaenri'ah was destroyed. The Abyss Order was created. Multiple Archons were weakened or changed. The Dendro Archon died. Dainsleif (the Bough Keeper, a Khaenri'ah noble) lost his nation and was cursed with immortality to watch it all. The Traveler's sibling was transformed by what they witnessed into an agent of opposition against Celestia.
133
+
134
+ ### The Era of the Yakshas
135
+ Following the Archon War, Liyue was plagued by demonic creatures whose karma could corrupt anyone who fought them. The five Yakshas (including Xiao) spent centuries purging these beings. The psychological and spiritual cost was enormous: four of the five eventually fell to madness, corruption, or death. Xiao survived, barely, and carries the accumulated karma of centuries of demon-slaying.
136
+
137
+ ---
embedder.py ADDED
@@ -0,0 +1,20 @@
+ import os
+
+ from sentence_transformers import SentenceTransformer
+
+ EMBED_MODEL = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
+
+ # Lazily initialised module-level singleton so the model is loaded only once.
+ _model = None
+
+
+ def get_embedder() -> SentenceTransformer:
+     global _model
+     if _model is None:
+         _model = SentenceTransformer(EMBED_MODEL)
+     return _model
+
+
+ def embed_texts(texts: list[str]) -> list[list[float]]:
+     """Return a list of embedding vectors for the given texts."""
+     model = get_embedder()
+     embeddings = model.encode(texts, show_progress_bar=True, batch_size=32)
+     return embeddings.tolist()
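The vectors returned by `embed_texts` are plain Python lists, so they can be compared directly. The sketch below is illustrative and not part of the commit: it ranks a few toy vectors by cosine similarity, which is roughly the comparison a vector index performs at query time if it is configured with a cosine metric (an assumption here). The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the 3-dimensional vectors stand in for the 384-dimensional output of all-MiniLM-L6-v2.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: list[float], docs: list[list[float]], top_k: int = 3) -> list[int]:
    """Return the indices of the top_k document vectors most similar to the query."""
    ranked = sorted(
        range(len(docs)),
        key=lambda idx: cosine_similarity(query, docs[idx]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy vectors standing in for real embeddings (hypothetical values).
toy_docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_query = [1.0, 0.1, 0.0]
top = rank_by_similarity(toy_query, toy_docs, top_k=2)
```

In a real run the toy lists would be replaced by `embed_texts([...])` output, and the ranking would be delegated to the vector store rather than done in Python.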
index.html ADDED
@@ -0,0 +1,672 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>LLMOps — Llama 3.1 RAG</title>
7
+ <link rel="preconnect" href="https://fonts.googleapis.com">
8
+ <link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@300;400;500&family=Syne:wght@400;600;700&display=swap" rel="stylesheet">
9
+ <style>
10
+ :root {
11
+ --bg: #0a0a0b;
12
+ --surface: #111113;
13
+ --surface2: #18181c;
14
+ --border: #2a2a30;
15
+ --border2: #3a3a42;
16
+ --accent: #7c6af7;
17
+ --accent2: #a89cf7;
18
+ --green: #3dd68c;
19
+ --red: #f76a6a;
20
+ --amber: #f7c26a;
21
+ --text: #e8e8f0;
22
+ --text2: #8888a0;
23
+ --text3: #555568;
24
+ --mono: 'JetBrains Mono', monospace;
25
+ --sans: 'Syne', sans-serif;
26
+ }
27
+
28
+ * { box-sizing: border-box; margin: 0; padding: 0; }
29
+
30
+ body {
31
+ background: var(--bg);
32
+ color: var(--text);
33
+ font-family: var(--sans);
34
+ min-height: 100vh;
35
+ display: flex;
36
+ flex-direction: column;
37
+ }
38
+
39
+ /* subtle grid background */
40
+ body::before {
41
+ content: '';
42
+ position: fixed;
43
+ inset: 0;
44
+ background-image:
45
+ linear-gradient(rgba(124,106,247,0.03) 1px, transparent 1px),
46
+ linear-gradient(90deg, rgba(124,106,247,0.03) 1px, transparent 1px);
47
+ background-size: 40px 40px;
48
+ pointer-events: none;
49
+ z-index: 0;
50
+ }
51
+
52
+ header {
53
+ position: relative;
54
+ z-index: 1;
55
+ padding: 28px 40px 24px;
56
+ border-bottom: 1px solid var(--border);
57
+ display: flex;
58
+ align-items: center;
59
+ justify-content: space-between;
60
+ }
61
+
62
+ .logo {
63
+ display: flex;
64
+ align-items: center;
65
+ gap: 14px;
66
+ }
67
+
68
+ .logo-mark {
69
+ width: 36px;
70
+ height: 36px;
71
+ border-radius: 8px;
72
+ background: var(--accent);
73
+ display: flex;
74
+ align-items: center;
75
+ justify-content: center;
76
+ font-family: var(--mono);
77
+ font-size: 14px;
78
+ font-weight: 500;
79
+ color: #fff;
80
+ letter-spacing: -0.5px;
81
+ }
82
+
83
+ .logo-text {
84
+ font-size: 16px;
85
+ font-weight: 700;
86
+ color: var(--text);
87
+ letter-spacing: -0.3px;
88
+ }
89
+
90
+ .logo-sub {
91
+ font-size: 11px;
92
+ color: var(--text3);
93
+ font-family: var(--mono);
94
+ margin-top: 1px;
95
+ }
96
+
97
+ .status-pill {
98
+ display: flex;
99
+ align-items: center;
100
+ gap: 7px;
101
+ padding: 6px 12px;
102
+ border-radius: 20px;
103
+ border: 1px solid var(--border);
104
+ background: var(--surface);
105
+ font-family: var(--mono);
106
+ font-size: 11px;
107
+ color: var(--text2);
108
+ }
109
+
110
+ .status-dot {
111
+ width: 7px;
112
+ height: 7px;
113
+ border-radius: 50%;
114
+ background: var(--text3);
115
+ transition: background 0.3s;
116
+ }
117
+ .status-dot.online { background: var(--green); box-shadow: 0 0 6px var(--green); }
118
+ .status-dot.error { background: var(--red); }
119
+ .status-dot.loading { background: var(--amber); animation: pulse 1s ease-in-out infinite; }
120
+
121
+ @keyframes pulse { 0%,100%{opacity:1} 50%{opacity:0.4} }
122
+
123
+ main {
124
+ position: relative;
125
+ z-index: 1;
126
+ flex: 1;
127
+ display: flex;
128
+ flex-direction: column;
129
+ max-width: 860px;
130
+ width: 100%;
131
+ margin: 0 auto;
132
+ padding: 40px 40px 0;
133
+ }
134
+
135
+ .hero {
136
+ margin-bottom: 40px;
137
+ animation: fadeUp 0.5s ease both;
138
+ }
139
+
140
+ @keyframes fadeUp {
141
+ from { opacity:0; transform: translateY(12px); }
142
+ to { opacity:1; transform: translateY(0); }
143
+ }
144
+
145
+ .hero h1 {
146
+ font-size: 32px;
147
+ font-weight: 700;
148
+ color: var(--text);
149
+ letter-spacing: -0.8px;
150
+ line-height: 1.2;
151
+ margin-bottom: 10px;
152
+ }
153
+
154
+ .hero h1 span { color: var(--accent2); }
155
+
156
+ .hero p {
157
+ font-size: 14px;
158
+ color: var(--text2);
159
+ font-family: var(--mono);
160
+ line-height: 1.6;
161
+ }
162
+
163
+ .model-tag {
164
+ display: inline-flex;
165
+ align-items: center;
166
+ gap: 6px;
167
+ margin-top: 12px;
168
+ padding: 4px 10px;
169
+ border-radius: 4px;
170
+ border: 1px solid var(--border2);
171
+ background: var(--surface2);
172
+ font-family: var(--mono);
173
+ font-size: 11px;
174
+ color: var(--accent2);
175
+ }
176
+
177
+ /* Query input area */
178
+ .query-box {
179
+ background: var(--surface);
180
+ border: 1px solid var(--border);
181
+ border-radius: 12px;
182
+ padding: 0;
183
+ overflow: hidden;
184
+ transition: border-color 0.2s;
185
+ animation: fadeUp 0.5s 0.1s ease both;
186
+ }
187
+
188
+ .query-box:focus-within {
189
+ border-color: var(--accent);
190
+ }
191
+
192
+ .query-label {
193
+ padding: 12px 16px 0;
194
+ font-family: var(--mono);
195
+ font-size: 11px;
196
+ color: var(--text3);
197
+ letter-spacing: 0.05em;
198
+ text-transform: uppercase;
199
+ }
200
+
201
+ textarea {
202
+ width: 100%;
203
+ background: transparent;
204
+ border: none;
205
+ outline: none;
206
+ resize: none;
207
+ padding: 10px 16px 14px;
208
+ font-family: var(--mono);
209
+ font-size: 14px;
210
+ color: var(--text);
211
+ line-height: 1.6;
212
+ min-height: 90px;
213
+ caret-color: var(--accent);
214
+ }
215
+
216
+ textarea::placeholder { color: var(--text3); }
217
+
218
+ .query-footer {
219
+ display: flex;
220
+ align-items: center;
221
+ justify-content: space-between;
222
+ padding: 10px 14px;
223
+ border-top: 1px solid var(--border);
224
+ background: var(--surface2);
225
+ }
226
+
227
+ .top-k-wrap {
228
+ display: flex;
229
+ align-items: center;
230
+ gap: 8px;
231
+ font-family: var(--mono);
232
+ font-size: 12px;
233
+ color: var(--text2);
234
+ }
235
+
236
+ .top-k-wrap select {
237
+ background: var(--surface);
238
+ border: 1px solid var(--border2);
239
+ border-radius: 4px;
240
+ color: var(--text);
241
+ font-family: var(--mono);
242
+ font-size: 12px;
243
+ padding: 3px 8px;
244
+ cursor: pointer;
245
+ outline: none;
246
+ }
247
+
248
+ .send-btn {
249
+ display: flex;
250
+ align-items: center;
251
+ gap: 8px;
252
+ padding: 8px 18px;
253
+ background: var(--accent);
254
+ border: none;
255
+ border-radius: 6px;
256
+ color: #fff;
257
+ font-family: var(--sans);
258
+ font-size: 13px;
259
+ font-weight: 600;
260
+ cursor: pointer;
261
+ transition: background 0.15s, transform 0.1s;
262
+ letter-spacing: 0.2px;
263
+ }
264
+
265
+ .send-btn:hover { background: var(--accent2); }
266
+ .send-btn:active { transform: scale(0.97); }
267
+ .send-btn:disabled { background: var(--border2); color: var(--text3); cursor: not-allowed; transform: none; }
268
+
269
+ .send-btn .arrow { font-size: 14px; transition: transform 0.15s; }
270
+ .send-btn:not(:disabled):hover .arrow { transform: translateX(3px); }
271
+
272
+ /* Response area */
273
+ #response-area {
274
+ margin-top: 28px;
275
+ animation: fadeUp 0.4s ease both;
276
+ display: none;
277
+ }
278
+
279
+ #response-area.visible { display: block; }
280
+
281
+ .response-card {
282
+ background: var(--surface);
283
+ border: 1px solid var(--border);
284
+ border-radius: 12px;
285
+ overflow: hidden;
286
+ }
287
+
288
+ .response-header {
289
+ display: flex;
290
+ align-items: center;
291
+ justify-content: space-between;
292
+ padding: 12px 16px;
293
+ border-bottom: 1px solid var(--border);
294
+ background: var(--surface2);
295
+ }
296
+
297
+ .response-label {
298
+ font-family: var(--mono);
299
+ font-size: 11px;
300
+ color: var(--text3);
301
+ text-transform: uppercase;
302
+ letter-spacing: 0.05em;
303
+ display: flex;
304
+ align-items: center;
305
+ gap: 7px;
306
+ }
307
+
308
+ .response-label .dot {
309
+ width: 6px; height: 6px; border-radius: 50%;
310
+ background: var(--green);
311
+ }
312
+
313
+ .latency-tag {
314
+ font-family: var(--mono);
315
+ font-size: 11px;
316
+ color: var(--text3);
317
+ padding: 2px 8px;
318
+ border-radius: 3px;
319
+ border: 1px solid var(--border);
320
+ }
321
+
322
+ .response-body {
323
+ padding: 20px;
324
+ font-size: 15px;
325
+ line-height: 1.75;
326
+ color: var(--text);
327
+ min-height: 60px;
328
+ white-space: pre-wrap;
329
+ word-break: break-word;
330
+ }
331
+
332
+ /* loading state */
333
+ .thinking {
334
+ display: flex;
335
+ align-items: center;
336
+ gap: 10px;
337
+ padding: 20px;
338
+ font-family: var(--mono);
339
+ font-size: 13px;
340
+ color: var(--text2);
341
+ }
342
+
343
+ .thinking-dots span {
344
+ display: inline-block;
345
+ width: 5px; height: 5px;
346
+ border-radius: 50%;
347
+ background: var(--accent);
348
+ margin: 0 2px;
349
+ animation: bounce 1.2s ease-in-out infinite;
350
+ }
351
+ .thinking-dots span:nth-child(2) { animation-delay: 0.2s; }
352
+ .thinking-dots span:nth-child(3) { animation-delay: 0.4s; }
353
+ @keyframes bounce { 0%,80%,100%{transform:translateY(0)} 40%{transform:translateY(-6px)} }
354
+
355
+ .sources-section {
356
+ border-top: 1px solid var(--border);
357
+ padding: 12px 20px;
358
+ background: var(--surface2);
359
+ }
360
+
361
+ .sources-label {
362
+ font-family: var(--mono);
363
+ font-size: 11px;
364
+ color: var(--text3);
365
+ text-transform: uppercase;
366
+ letter-spacing: 0.05em;
367
+ margin-bottom: 8px;
368
+ }
369
+
370
+ .source-chip {
371
+ display: inline-flex;
372
+ align-items: center;
373
+ gap: 5px;
374
+ padding: 3px 9px;
375
+ border-radius: 4px;
376
+ border: 1px solid var(--border2);
377
+ background: var(--surface);
378
+ font-family: var(--mono);
379
+ font-size: 11px;
380
+ color: var(--accent2);
381
+ margin: 3px 4px 3px 0;
382
+ }
383
+
384
+ .no-sources {
385
+ font-family: var(--mono);
386
+ font-size: 12px;
387
+ color: var(--text3);
388
+ font-style: italic;
389
+ }
390
+
391
+ .error-card {
392
+ background: rgba(247,106,106,0.06);
393
+ border: 1px solid rgba(247,106,106,0.25);
394
+ border-radius: 8px;
395
+ padding: 14px 16px;
396
+ font-family: var(--mono);
397
+ font-size: 13px;
398
+ color: var(--red);
399
+ margin-top: 16px;
400
+ display: none;
401
+ }
402
+ .error-card.visible { display: block; }
403
+
404
+ /* history */
405
+ .history-section {
406
+ margin-top: 32px;
407
+ padding-bottom: 40px;
408
+ animation: fadeUp 0.4s 0.15s ease both;
409
+ }
410
+
411
+ .history-label {
412
+ font-family: var(--mono);
413
+ font-size: 11px;
414
+ color: var(--text3);
415
+ text-transform: uppercase;
416
+ letter-spacing: 0.05em;
417
+ margin-bottom: 12px;
418
+ }
419
+
420
+ .history-item {
421
+ background: var(--surface);
422
+ border: 1px solid var(--border);
423
+ border-radius: 8px;
424
+ padding: 12px 16px;
425
+ margin-bottom: 8px;
426
+ cursor: pointer;
427
+ transition: border-color 0.15s;
428
+ }
429
+
430
+ .history-item:hover { border-color: var(--border2); }
431
+
432
+ .history-q {
433
+ font-size: 13px;
434
+ color: var(--text2);
435
+ font-family: var(--mono);
436
+ white-space: nowrap;
437
+ overflow: hidden;
438
+ text-overflow: ellipsis;
439
+ }
440
+
441
+ .history-meta {
442
+ font-size: 11px;
443
+ color: var(--text3);
444
+ font-family: var(--mono);
445
+ margin-top: 4px;
446
+ }
447
+
448
+ footer {
449
+ position: relative;
450
+ z-index: 1;
451
+ text-align: center;
452
+ padding: 20px;
453
+ font-family: var(--mono);
454
+ font-size: 11px;
455
+ color: var(--text3);
456
+ border-top: 1px solid var(--border);
457
+ margin-top: auto;
458
+ }
459
+ </style>
460
+ </head>
461
+ <body>
462
+
463
+ <header>
464
+ <div class="logo">
465
+ <div class="logo-mark">λ</div>
466
+ <div>
467
+ <div class="logo-text">LLMOps RAG</div>
468
+ <div class="logo-sub">Llama 3.1 · QLoRA · Pinecone</div>
469
+ </div>
470
+ </div>
471
+ <div class="status-pill">
472
+ <div class="status-dot" id="status-dot"></div>
473
+ <span id="status-text">checking...</span>
474
+ </div>
475
+ </header>
476
+
477
+ <main>
478
+ <div class="hero">
479
+ <h1>Ask your <span>fine-tuned</span> model</h1>
480
+ <p>Llama 3.1 8B · QLoRA exp2_lr2e-4_r16 · RTX 3060 · 4-bit NF4</p>
481
+ <div class="model-tag">
482
+ <span>●</span> Retrieval-augmented · semantic search over ingested docs
483
+ </div>
484
+ </div>
485
+
486
+ <div class="query-box">
487
+ <div class="query-label">query</div>
488
+ <textarea
489
+ id="query-input"
490
+ placeholder="What is QLoRA and how does it differ from full fine-tuning?"
491
+ rows="3"
492
+ ></textarea>
493
+ <div class="query-footer">
494
+ <div class="top-k-wrap">
495
+ <span>top_k</span>
496
+ <select id="top-k">
497
+ <option value="1">1</option>
498
+ <option value="2">2</option>
499
+ <option value="3" selected>3</option>
500
+ <option value="5">5</option>
501
+ </select>
502
+ <span style="color:var(--text3)">retrieved chunks</span>
503
+ </div>
504
+ <button class="send-btn" id="send-btn" onclick="submitQuery()">
505
+ Run query <span class="arrow">→</span>
506
+ </button>
507
+ </div>
508
+ </div>
509
+
510
+ <div class="error-card" id="error-card"></div>
511
+
512
+ <div id="response-area">
513
+ <div class="response-card">
514
+ <div class="response-header">
515
+ <div class="response-label">
516
+ <div class="dot"></div>
517
+ response
518
+ </div>
519
+ <div class="latency-tag" id="latency-tag">—</div>
520
+ </div>
521
+ <div id="response-body" class="response-body"></div>
522
+ <div class="sources-section">
523
+ <div class="sources-label">sources</div>
524
+ <div id="sources-list"><span class="no-sources">no documents ingested yet</span></div>
525
+ </div>
526
+ </div>
527
+ </div>
528
+
529
+ <div class="history-section" id="history-section" style="display:none">
530
+ <div class="history-label">recent queries</div>
531
+ <div id="history-list"></div>
532
+ </div>
533
+ </main>
534
+
535
+ <footer>
536
+ running locally · http://localhost:8000 · <span id="footer-model">exp2_lr2e-4_r16</span>
537
+ </footer>
538
+
539
+ <script>
540
+ const API = 'http://localhost:8000';
541
+ const history = [];
542
+
543
+ async function checkHealth() {
544
+ const dot = document.getElementById('status-dot');
545
+ const txt = document.getElementById('status-text');
546
+ dot.className = 'status-dot loading';
547
+ txt.textContent = 'connecting...';
548
+ try {
549
+ const r = await fetch(`${API}/health`);
550
+ const d = await r.json();
551
+ if (d.model_loaded) {
552
+ dot.className = 'status-dot online';
553
+ txt.textContent = 'model ready';
554
+ } else {
555
+ dot.className = 'status-dot loading';
556
+ txt.textContent = 'loading model...';
557
+ setTimeout(checkHealth, 3000);
558
+ }
559
+ } catch {
560
+ dot.className = 'status-dot error';
561
+ txt.textContent = 'server offline';
562
+ setTimeout(checkHealth, 5000);
563
+ }
564
+ }
565
+
566
+ async function submitQuery() {
567
+ const query = document.getElementById('query-input').value.trim();
568
+ if (!query) return;
569
+
570
+ const top_k = parseInt(document.getElementById('top-k').value);
571
+ const btn = document.getElementById('send-btn');
572
+ const responseArea = document.getElementById('response-area');
573
+ const responseBody = document.getElementById('response-body');
574
+ const errorCard = document.getElementById('error-card');
575
+ const latencyTag = document.getElementById('latency-tag');
576
+ const sourcesList = document.getElementById('sources-list');
577
+
578
+ // reset
579
+ errorCard.className = 'error-card';
580
+ responseArea.className = 'response-area visible';
581
+ responseArea.style.display = 'block';
582
+ latencyTag.textContent = '—';
583
+ sourcesList.innerHTML = '';
584
+ btn.disabled = true;
+
+   // show thinking
+   responseBody.innerHTML = `
+     <div class="thinking">
+       <div class="thinking-dots">
+         <span></span><span></span><span></span>
+       </div>
+       generating response...
+     </div>`;
+
+   try {
+     const res = await fetch(`${API}/generate`, {
+       method: 'POST',
+       headers: { 'Content-Type': 'application/json' },
+       body: JSON.stringify({ query, top_k })
+     });
+
+     if (!res.ok) {
+       const err = await res.json();
+       throw new Error(JSON.stringify(err.detail || err));
+     }
+
+     const data = await res.json();
+
+     // render answer
+     responseBody.textContent = data.answer;
+
+     // latency
+     const ms = Math.round(data.latency_ms);
+     latencyTag.textContent = ms >= 1000 ? `${(ms/1000).toFixed(1)}s` : `${ms}ms`;
+
+     // sources (file names are escaped before they are placed in innerHTML)
+     if (data.sources && data.sources.length > 0) {
+       sourcesList.innerHTML = data.sources.map(s => {
+         const name = escapeHtml(s.split(/[\\/]/).pop());
+         return `<span class="source-chip">📄 ${name}</span>`;
+       }).join('');
+     } else {
+       sourcesList.innerHTML = '<span class="no-sources">no docs ingested; run ingest.py to add documents</span>';
+     }
+
+     // add to history
+     addHistory(query, ms);
+
+   } catch (err) {
+     responseBody.innerHTML = '';
+     responseArea.style.display = 'none';
+     errorCard.className = 'error-card visible';
+     errorCard.textContent = `Error: ${err.message}`;
+   }
+
+   btn.disabled = false;
+ }
+
+ // Escape text before interpolating it into an innerHTML template.
+ function escapeHtml(text) {
+   const div = document.createElement('div');
+   div.textContent = text;
+   return div.innerHTML;
+ }
+
+ function addHistory(query, latency_ms) {
+   history.unshift({ query, latency_ms, time: new Date().toLocaleTimeString() });
+   if (history.length > 5) history.pop();
+
+   const section = document.getElementById('history-section');
+   const list = document.getElementById('history-list');
+
+   section.style.display = 'block';
+   // Divide before formatting: rounding to whole seconds first would always print ".0".
+   list.innerHTML = history.map((h, i) => `
+     <div class="history-item" onclick="rerun(${i})">
+       <div class="history-q">${escapeHtml(h.query)}</div>
+       <div class="history-meta">${h.time} · ${(h.latency_ms/1000).toFixed(1)}s</div>
+     </div>
+   `).join('');
+ }
+
+ function rerun(i) {
+   document.getElementById('query-input').value = history[i].query;
+   submitQuery();
+ }
+
+ // Ctrl+Enter to submit
+ document.getElementById('query-input').addEventListener('keydown', e => {
+   if (e.key === 'Enter' && (e.ctrlKey || e.metaKey)) {
+     e.preventDefault();
+     submitQuery();
+   }
+ });
+
+ // check health on load
+ checkHealth();
+ </script>
+ </body>
+ </html>
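The latency tag in the page switches from milliseconds to seconds at the 1000 ms mark. As a rough illustration of that rule outside the browser, here is a minimal Python sketch; `format_latency` is a hypothetical helper and is not part of the repository.

```python
def format_latency(ms: float) -> str:
    """Mirror the UI rule: one-decimal seconds at or above 1000 ms, else whole ms."""
    # Round half up, which matches Math.round for non-negative values.
    whole_ms = int(ms + 0.5)
    if whole_ms >= 1000:
        return f"{whole_ms / 1000:.1f}s"
    return f"{whole_ms}ms"


if __name__ == "__main__":
    print(format_latency(820.4))
    print(format_latency(1540.0))
```

The division happens on the millisecond value, so the decimal place survives; rounding to whole seconds before formatting would discard it.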
ingest.py ADDED
@@ -0,0 +1,110 @@
+ """
+ ingest.py: Load documents from a directory, chunk them, embed them, push to Pinecone.
+
+ Usage:
+     python ingest.py --dir ./docs
+     python ingest.py --dir ./docs --chunk-size 400 --chunk-overlap 50
+ """
+
+ import os
+ import uuid
+ import argparse
+ import logging
+ from pathlib import Path
+ from dotenv import load_dotenv
+ load_dotenv()
+
+ from pinecone import Pinecone, ServerlessSpec
+ from embedder import embed_texts
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
+ PINECONE_INDEX = os.getenv("PINECONE_INDEX", "llmops-rag")
+ EMBED_DIM = 384  # all-MiniLM-L6-v2 output dim
+
+
+ def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
+     """Naive word-level chunker (sizes are in words). Replace with a sentence splitter if needed."""
+     if chunk_size <= 0 or not 0 <= overlap < chunk_size:
+         # A non-positive step would make the loop below run forever.
+         raise ValueError("chunk_size must be positive and overlap must be in [0, chunk_size)")
+     words = text.split()
+     chunks, i = [], 0
+     while i < len(words):
+         chunk = " ".join(words[i : i + chunk_size])
+         chunks.append(chunk)
+         i += chunk_size - overlap
+     return chunks
+
+
+ def load_documents(directory: str) -> list[dict]:
+     """Load .txt and .md files recursively. Returns list of {source, text}."""
+     docs = []
+     for path in Path(directory).rglob("*"):
+         if path.suffix in {".txt", ".md"}:
+             text = path.read_text(encoding="utf-8", errors="ignore").strip()
+             if text:
+                 docs.append({"source": str(path), "text": text})
+     logger.info(f"Loaded {len(docs)} documents from {directory}")
+     return docs
+
+
+ def ensure_index(pc: Pinecone):
+     """Create the Pinecone index if it does not exist yet."""
+     existing = [idx.name for idx in pc.list_indexes()]
+     if PINECONE_INDEX not in existing:
+         logger.info(f"Creating index '{PINECONE_INDEX}'...")
+         pc.create_index(
+             name=PINECONE_INDEX,
+             dimension=EMBED_DIM,
+             metric="cosine",
+             spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+         )
+         logger.info("Index created.")
+     else:
+         logger.info(f"Index '{PINECONE_INDEX}' already exists.")
+
+
+ def ingest_documents(directory: str, chunk_size: int = 400, chunk_overlap: int = 50) -> int:
+     if not PINECONE_API_KEY:
+         raise EnvironmentError("PINECONE_API_KEY not set")
+
+     pc = Pinecone(api_key=PINECONE_API_KEY)
+     ensure_index(pc)
+     index = pc.Index(PINECONE_INDEX)
+
+     docs = load_documents(directory)
+     if not docs:
+         logger.warning("No documents found. Nothing ingested.")
+         return 0
+
+     all_chunks, all_meta = [], []
+     for doc in docs:
+         for chunk in chunk_text(doc["text"], chunk_size, chunk_overlap):
+             all_chunks.append(chunk)
+             all_meta.append({"source": doc["source"], "text": chunk})
+
+     logger.info(f"Embedding {len(all_chunks)} chunks...")
+     vectors = embed_texts(all_chunks)
+
+     # Upsert in batches of 100
+     BATCH = 100
+     total = 0
+     for i in range(0, len(all_chunks), BATCH):
+         batch_vectors = [
+             (str(uuid.uuid4()), vectors[j], all_meta[j])
+             for j in range(i, min(i + BATCH, len(all_chunks)))
+         ]
+         index.upsert(vectors=batch_vectors)
+         total += len(batch_vectors)
+         logger.info(f"  Upserted {total}/{len(all_chunks)}")
+
+     logger.info(f"Done. {total} vectors in Pinecone index '{PINECONE_INDEX}'.")
+     return total
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--dir", default="./docs", help="Directory containing .txt/.md files")
+     parser.add_argument("--chunk-size", type=int, default=400)
+     parser.add_argument("--chunk-overlap", type=int, default=50)
+     args = parser.parse_args()
+     ingest_documents(args.dir, args.chunk_size, args.chunk_overlap)
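The chunker in ingest.py slides a window of `chunk_size` words forward by `chunk_size - overlap` words each step. The sketch below shows that arithmetic on a tiny input; `chunk_words` is a hypothetical standalone name, and the parameter check is an assumption added so the step is always positive.

```python
def chunk_words(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of whitespace-separated words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    return [" ".join(words[i : i + chunk_size]) for i in range(0, len(words), step)]


if __name__ == "__main__":
    # Window of 3 words, advancing 2 words at a time.
    for chunk in chunk_words("a b c d e f g", chunk_size=3, overlap=1):
        print(chunk)
```

With seven words, a window of three and an overlap of one, each chunk repeats the last word of the previous one, and the final chunk is a short tail.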
main.py ADDED
@@ -0,0 +1,83 @@
+ from fastapi import FastAPI, HTTPException
+ from fastapi.middleware.cors import CORSMiddleware
+ from pydantic import BaseModel
+ from contextlib import asynccontextmanager
+ import logging
+ import time
+ from fastapi.responses import FileResponse
+ from rag import RAGChain
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Populated by the lifespan handler at startup.
+ rag_chain: RAGChain | None = None
+
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     global rag_chain
+     logger.info("Loading RAG chain...")
+     rag_chain = RAGChain()
+     rag_chain.load()
+     logger.info("RAG chain ready.")
+     yield
+     logger.info("Shutting down.")
+
+
+ app = FastAPI(
+     title="LLMOps RAG API",
+     description="Llama 3.1 8B QLoRA fine-tuned + Pinecone RAG",
+     version="1.0.0",
+     lifespan=lifespan,
+ )
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+
+ class GenerateRequest(BaseModel):
+     query: str
+     top_k: int = 3
+     max_new_tokens: int = 512
+
+
+ class GenerateResponse(BaseModel):
+     answer: str
+     sources: list[str]
+     latency_ms: float
+
+
+ @app.get("/health")
+ def health():
+     return {"status": "ok", "model_loaded": rag_chain is not None and rag_chain.ready}
+
+
+ @app.get("/")
+ def ui():
+     return FileResponse("index.html")
+
+
+ @app.post("/generate", response_model=GenerateResponse)
+ def generate(req: GenerateRequest):
+     if not rag_chain or not rag_chain.ready:
+         raise HTTPException(status_code=503, detail="Model not loaded yet")
+     if not req.query.strip():
+         raise HTTPException(status_code=400, detail="Query cannot be empty")
+
+     start = time.time()
+     answer, sources = rag_chain.query(req.query, top_k=req.top_k, max_new_tokens=req.max_new_tokens)
+     latency_ms = (time.time() - start) * 1000
+
+     return GenerateResponse(answer=answer, sources=sources, latency_ms=round(latency_ms, 1))
+
+
+ @app.post("/ingest")
+ def ingest(directory: str = "./docs"):
+     """Ingest documents from a local directory into Pinecone."""
+     from ingest import ingest_documents
+     count = ingest_documents(directory)
+     return {"ingested": count, "directory": directory}
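The `/generate` handler measures latency by taking the wall clock before and after the chain call. A possible variation, sketched here under the assumption that a monotonic clock is preferred for intervals, wraps the call in a small helper; `timed_ms` is a hypothetical name and does not appear in main.py.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed_ms(fn: Callable[[], T]) -> tuple[T, float]:
    """Run fn and return (result, elapsed milliseconds rounded to 0.1 ms)."""
    start = time.perf_counter()  # monotonic, unaffected by system clock changes
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, round(elapsed_ms, 1)


if __name__ == "__main__":
    total, ms = timed_ms(lambda: sum(range(1000)))
    print(total, ms)
```

In the handler this would look like `(answer, sources), latency_ms = timed_ms(lambda: rag_chain.query(...))`, which keeps the timing logic in one place.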
pyvenv.cfg ADDED
@@ -0,0 +1,5 @@
+ home = C:\Users\mukul\AppData\Local\Programs\Python\Python312
+ include-system-site-packages = false
+ version = 3.12.9
+ executable = C:\Users\mukul\AppData\Local\Programs\Python\Python312\python.exe
+ command = C:\Users\mukul\AppData\Local\Programs\Python\Python312\python.exe -m venv E:\Projects\llmops-serve\venv
rag.py ADDED
@@ -0,0 +1,116 @@
+ import os
+ import logging
+ import torch
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+ from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig
+ from langchain_community.vectorstores import Pinecone as LangchainPinecone
+ from langchain_community.embeddings import HuggingFaceEmbeddings
+ from langchain.chains import RetrievalQA
+ from langchain_community.llms import HuggingFacePipeline
+ from langchain.prompts import PromptTemplate
+ from pinecone import Pinecone
+
+ logger = logging.getLogger(__name__)
+
+ LOCAL_MODEL = os.getenv("MODEL_PATH", "./models/merged/exp2_lr2e-4_r16")
+ EMBED_MODEL = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
+ PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
+ PINECONE_INDEX = os.getenv("PINECONE_INDEX", "llmops-rag")
+
+ PROMPT_TEMPLATE = """You are a precise Genshin Impact assistant. Answer ONLY using the context below.
+ If specific details like weapon names or artifact sets are not in the context, say so; do not invent them.
+
+ Context:
+ {context}
+
+ Question: {question}
+
+ Answer (use only information from the context above):"""
+
+
+ class RAGChain:
+     def __init__(self):
+         self.ready = False
+         self.chain = None
+         self.vectorstore = None
+
+     def load(self):
+         logger.info(f"Loading model from {LOCAL_MODEL}")
+
+         # ---- 4-bit quant for 6GB VRAM ----
+         bnb_config = BitsAndBytesConfig(
+             load_in_4bit=True,
+             bnb_4bit_quant_type="nf4",
+             bnb_4bit_compute_dtype=torch.bfloat16,
+             bnb_4bit_use_double_quant=True,
+         )
+
+         tokenizer = AutoTokenizer.from_pretrained(LOCAL_MODEL)
+         model = AutoModelForCausalLM.from_pretrained(
+             LOCAL_MODEL,
+             quantization_config=bnb_config,
+             device_map="auto",
+             torch_dtype=torch.bfloat16,
+             max_memory={0: "5.5GiB", "cpu": "24GiB"},
+         )
+         model.eval()
+
+         tokenizer.pad_token = tokenizer.eos_token
+
+         # Greedy decoding: sampling is off, so temperature and top_p are unset.
+         hf_pipe = pipeline(
+             "text-generation",
+             model=model,
+             tokenizer=tokenizer,
+             max_new_tokens=256,
+             do_sample=False,
+             temperature=None,
+             top_p=None,
+             repetition_penalty=1.3,
+             return_full_text=False,
+             eos_token_id=tokenizer.eos_token_id,
+             pad_token_id=tokenizer.eos_token_id,
+         )
+         llm = HuggingFacePipeline(pipeline=hf_pipe)
+         logger.info("Model loaded.")
+
+         # ---- Embeddings + Pinecone ----
+         logger.info("Connecting to Pinecone...")
+         embeddings = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
+         pc = Pinecone(api_key=PINECONE_API_KEY)
+         index = pc.Index(PINECONE_INDEX)
+         self.vectorstore = LangchainPinecone(index, embeddings, "text")
+         logger.info("Pinecone connected.")
+
+         # ---- RetrievalQA chain ----
+         prompt = PromptTemplate(
+             template=PROMPT_TEMPLATE,
+             input_variables=["context", "question"],
+         )
+         self.chain = RetrievalQA.from_chain_type(
+             llm=llm,
+             chain_type="stuff",
+             retriever=self.vectorstore.as_retriever(search_kwargs={"k": 3}),
+             return_source_documents=True,
+             chain_type_kwargs={"prompt": prompt},
+         )
+         self.ready = True
+         logger.info("RAG chain ready.")
+
+     def query(self, question: str, top_k: int = 3, max_new_tokens: int = 512) -> tuple[str, list[str]]:
+         if not self.ready:
+             raise RuntimeError("Chain not loaded")
+
+         # Override retriever k at query time.
+         # NOTE: max_new_tokens is accepted but not forwarded to the pipeline,
+         # so the max_new_tokens=256 set in load() still applies.
+         self.chain.retriever.search_kwargs["k"] = top_k
+
+         result = self.chain.invoke({"query": question})
+         answer = result["result"].strip().replace("</s>", "").strip()
+         sources = [
+             doc.metadata.get("source", "unknown")
+             for doc in result.get("source_documents", [])
+         ]
+         return answer, list(dict.fromkeys(sources))  # deduplicated, order preserved
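`query` deduplicates source paths with `dict.fromkeys`, which keeps the first occurrence of each path in retrieval order, and the web UI then shows only the file name. A minimal sketch of both steps follows; `unique_sources` and `display_name` are hypothetical helper names used only for illustration.

```python
import re


def unique_sources(paths: list[str]) -> list[str]:
    """Drop repeated paths while keeping first-seen order (dict keys are ordered)."""
    return list(dict.fromkeys(paths))


def display_name(path: str) -> str:
    """Return the last path component, splitting on both / and \\ separators."""
    return re.split(r"[\\/]", path)[-1]


if __name__ == "__main__":
    retrieved = ["docs/world_lore.md", "docs\\character_builds.md", "docs/world_lore.md"]
    for source in unique_sources(retrieved):
        print(display_name(source))
```

Splitting on both separators matters here because paths are stored at ingest time with `str(path)`, so an index built on Windows holds backslash paths even when the API is served from Linux.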
requirements.txt ADDED
@@ -0,0 +1,20 @@
+ # Core serving
+ fastapi==0.115.0
+ uvicorn[standard]==0.30.6
+ pydantic==2.7.4
+
+ # LLM + fine-tuned model
+ torch==2.3.1 # CPU or CUDA; Docker will handle the CUDA variant
+ transformers==4.51.3
+ peft==0.15.2
+ bitsandbytes==0.43.3 # CPU-compatible for Docker; swap 0.49.2 if CUDA 12.8
+ accelerate==1.6.0
+
+ # RAG
+ langchain==0.3.25
+ langchain-community==0.3.23
+ pinecone-client==5.0.1
+ sentence-transformers==4.1.0
+
+ # Utilities
+ python-dotenv==1.0.1