---
title: VORTEXRAG
emoji: "๐"
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: "5.29.0"
app_file: app.py
pinned: true
license: mit
short_description: "7-Layer RAG: +13.6 EM, 0.94 Faithfulness"
tags:
  - retrieval-augmented-generation
  - RAG
  - NLP
  - question-answering
  - causal-reasoning
  - hallucination-reduction
  - LLM
  - machine-learning
---
# VORTEXRAG

**Vector Orthogonal Resonance-Tuned EXtraction Retrieval-Augmented Generation**

> A 7-layer RAG framework that targets **Semantic Drift** and **Context Window Poisoning** simultaneously.

[DOI: 10.5281/zenodo.20579702](https://doi.org/10.5281/zenodo.20579702) [GitHub](https://github.com/vignesh2027/VORTEXRAG) [License](https://github.com/vignesh2027/VORTEXRAG/blob/main/LICENSE)

---
## The Problem Standard RAG Cannot Solve

Ask "Why did Lehman Brothers collapse?"

Standard RAG retrieves both the Dodd-Frank provisions (cosine 0.87, topically related but causally wrong) and the CDS mispricing mechanism (cosine 0.91, causally correct). The LLM sees both and hallucinates a policy-response narrative. **This is Semantic Drift.**

Even with the right chunk retrieved, seven surrounding irrelevant chunks dilute the LLM's attention. **This is Context Window Poisoning.**

VORTEXRAG solves both with a principled 7-layer pipeline.

---
## The 7 Layers

| Layer | Name | Formula |
|-------|------|---------|
| 1 | TVE - Tri-Vector Encoding | `score = alpha * cos_sem + beta * cos_syn + gamma * cos_cau` |
| 2 | VRC - Vortex Retrieval Cone | `spiral = TVE * exp(-lambda*r) * cos(n*theta)` |
| 3 | SDC - Semantic Drift Corrector | `SDS = 1 - tanh(norm(D)/tau) >= 0.72` |
| 4 | CPG - Context Poison Guard | `ESR = sum(S*w)/(P+eps) >= 3.5` (provably optimal) |
| 5 | RFG - Rank Fusion Gate | `Phi = TVE^alpha * SDS^beta * ESR^gamma` |
| 6 | CCB - Causal Context Builder | `pos = rank(Phi) * causal_depth` |
| 7 | FV - Faithfulness Verifier | `Delta_R = 1 - ROUGE-L * NLI <= 0.15` |
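As a rough illustration of how the scalar formulas in the table compose, here is a minimal Python sketch. It covers only Layers 1, 3, 5 and 7 plus the three quoted thresholds. ESR is taken as an input because the table does not define `S`, `w` or `P`. The function names, the weight defaults, and the idea that a chunk must pass all three thresholds together are assumptions made for illustration, not details confirmed by the paper. The table reuses `alpha`, `beta`, `gamma` in Layers 1 and 5; the sketch treats them as separate parameters. Apart from the 0.91 cosine taken from the Lehman example, the inputs in the demo are made-up numbers.

```python
import math


def tve_score(cos_sem, cos_syn, cos_cau, alpha=0.5, beta=0.2, gamma=0.3):
    """Layer 1 (TVE): weighted sum of the three cosine similarities.

    The default weights are placeholders, not values from the paper.
    """
    return alpha * cos_sem + beta * cos_syn + gamma * cos_cau


def sds(drift_norm, tau):
    """Layer 3 (SDC): SDS = 1 - tanh(norm(D) / tau)."""
    return 1.0 - math.tanh(drift_norm / tau)


def fusion_score(tve, sds_value, esr, alpha=1.0, beta=1.0, gamma=1.0):
    """Layer 5 (RFG): Phi = TVE^alpha * SDS^beta * ESR^gamma."""
    return (tve ** alpha) * (sds_value ** beta) * (esr ** gamma)


def faithfulness_gap(rouge_l, nli):
    """Layer 7 (FV): Delta_R = 1 - ROUGE-L * NLI."""
    return 1.0 - rouge_l * nli


def passes_gates(sds_value, esr, delta_r):
    """Apply the thresholds quoted in the table (combining them is an assumption)."""
    return sds_value >= 0.72 and esr >= 3.5 and delta_r <= 0.15


if __name__ == "__main__":
    # Illustrative inputs only; esr is supplied directly, not computed.
    tve = tve_score(cos_sem=0.91, cos_syn=0.80, cos_cau=0.85)
    s = sds(drift_norm=0.1, tau=0.5)
    esr = 4.0
    phi = fusion_score(tve, s, esr)
    gap = faithfulness_gap(rouge_l=0.95, nli=0.95)
    print(f"TVE={tve:.3f} SDS={s:.3f} Phi={phi:.3f} Delta_R={gap:.3f}")
    print("passes gates:", passes_gates(s, esr, gap))
```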
---

## Benchmark Results (v3.0)

| System | EM | F1 | Faithfulness | Latency |
|--------|----|----|-------------|---------|
| **VORTEXRAG** | **74.8** | **82.6** | **0.94** | **185ms** |
| Self-RAG | 68.4 | 77.1 | 0.81 | 410ms |
| CRAG | 66.9 | 75.8 | 0.79 | 320ms |
| Naive RAG | 61.2 | 69.4 | 0.71 | 95ms |

- **+13.6 EM** over Naive RAG
- Semantic Drift **-61%**
- Context Poisoning **-71%**
- **2.2x faster** than Self-RAG
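The EM gain and the speed-up follow directly from the table rows. The short Python check below recomputes them from the published figures; the variable names are illustrative. It also prints the latency ratio against Naive RAG, which the table shows to be the fastest system.

```python
# Figures copied from the benchmark table.
results = {
    "VORTEXRAG": {"em": 74.8, "latency_ms": 185},
    "Self-RAG": {"em": 68.4, "latency_ms": 410},
    "Naive RAG": {"em": 61.2, "latency_ms": 95},
}

vortex = results["VORTEXRAG"]

# Exact-match gain over the naive baseline, in EM points.
em_gain = round(vortex["em"] - results["Naive RAG"]["em"], 1)

# Latency ratios: how much faster than Self-RAG, how much slower than Naive RAG.
speedup_vs_self_rag = round(results["Self-RAG"]["latency_ms"] / vortex["latency_ms"], 1)
slowdown_vs_naive = round(vortex["latency_ms"] / results["Naive RAG"]["latency_ms"], 1)

print(em_gain)              # 13.6
print(speedup_vs_self_rag)  # 2.2
print(slowdown_vs_naive)    # 1.9
```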
---

## Ablation

```
Baseline:          61.2 EM / 0.68 Faithfulness
+TVE:              65.3 EM (+4.1)
+VRC:              67.8 EM (+2.5)
+SDC:              70.4 EM (+2.6)
+CPG:              72.1 EM (+1.7)
+RFG:              73.4 EM (+1.3)
+CCB:              73.9 EM (+0.5)
VORTEXRAG (full):  74.8 EM (+0.9)
```
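The ablation is cumulative: each row adds one stage to the previous configuration. A small Python check, using only the numbers listed in the ablation, confirms that the per-step deltas add up to the reported EM at every row and to the overall gain. The variable names are illustrative.

```python
baseline_em = 61.2

# (stage added, reported EM delta, reported EM after adding it)
steps = [
    ("TVE", 4.1, 65.3),
    ("VRC", 2.5, 67.8),
    ("SDC", 2.6, 70.4),
    ("CPG", 1.7, 72.1),
    ("RFG", 1.3, 73.4),
    ("CCB", 0.5, 73.9),
    ("full", 0.9, 74.8),  # full pipeline row
]

running = baseline_em
cumulative = []
for stage, delta, reported in steps:
    # Round to one decimal to match the precision of the reported scores.
    running = round(running + delta, 1)
    cumulative.append(running)
    print(f"+{stage:<5} {running:.1f} EM (reported {reported:.1f})")

total_gain = round(sum(delta for _, delta, _ in steps), 1)
print(f"total gain: +{total_gain} EM")
```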
---

## 11 Domain Presets

- scientific (tau=0.30)
- medical (tau=0.35)
- legal (tau=0.40)
- cybersecurity (tau=0.45)
- financial (tau=0.50)
- code (tau=0.60)
- educational (tau=0.65)
- general (tau=0.80)
- historical (tau=0.90)
- customer (tau=0.95)
- creative (tau=1.20)
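To see what the presets do, assume `tau` here is the same `tau` as in the Layer 3 formula `SDS = 1 - tanh(norm(D)/tau)` with the 0.72 cutoff. That reading is an assumption; the preset list itself does not say so. Under it, solving `SDS >= 0.72` for the drift norm gives `norm(D) <= tau * atanh(0.28)`, which is roughly 0.29 times `tau`, so a smaller `tau` tolerates less drift. The Python sketch below tabulates that bound per domain; the helper names are hypothetical.

```python
import math

# tau values copied from the preset list; reading tau as the Layer 3
# temperature in SDS = 1 - tanh(norm(D) / tau) is an assumption.
TAU_PRESETS = {
    "scientific": 0.30, "medical": 0.35, "legal": 0.40, "cybersecurity": 0.45,
    "financial": 0.50, "code": 0.60, "educational": 0.65, "general": 0.80,
    "historical": 0.90, "customer": 0.95, "creative": 1.20,
}
SDS_THRESHOLD = 0.72


def sds(drift_norm, tau):
    """Semantic drift score for a given drift norm and temperature."""
    return 1.0 - math.tanh(drift_norm / tau)


def max_drift(tau, threshold=SDS_THRESHOLD):
    """Largest norm(D) that still satisfies SDS >= threshold."""
    return tau * math.atanh(1.0 - threshold)


for domain, tau in TAU_PRESETS.items():
    print(f"{domain:<14} tau={tau:.2f}  max drift norm={max_drift(tau):.3f}")
```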
---

## Citation

```bibtex
@article{vignesh2026vortexrag,
  title  = {VORTEXRAG: Vector Orthogonal Resonance-Tuned EXtraction RAG},
  author = {Vignesh, L},
  year   = {2026},
  doi    = {10.5281/zenodo.20579702},
  url    = {https://doi.org/10.5281/zenodo.20579702}
}
```

- **Paper:** https://doi.org/10.5281/zenodo.20579702
- **Code:** https://github.com/vignesh2027/VORTEXRAG
- **ORCID:** https://orcid.org/0009-0004-9777-7592