	---
	title: VORTEXRAG
	emoji: "🌀"
	colorFrom: purple
	colorTo: blue
	sdk: gradio
	sdk_version: "5.29.0"
	app_file: app.py
	pinned: true
	license: mit
	short_description: "7-Layer RAG: +13.6 EM, 0.94 Faithfulness"
	tags:
	- retrieval-augmented-generation
	- RAG
	- NLP
	- question-answering
	- causal-reasoning
	- hallucination-reduction
	- LLM
	- machine-learning
	---

	# VORTEXRAG

	Vector Orthogonal Resonance-Tuned EXtraction Retrieval-Augmented Generation

	> A 7-layer RAG framework that reduces both Semantic Drift and Context Window Poisoning.

	[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20579702.svg)](https://doi.org/10.5281/zenodo.20579702)
	[![GitHub](https://img.shields.io/badge/GitHub-vignesh2027%2FVORTEXRAG-blue)](https://github.com/vignesh2027/VORTEXRAG)
	[![Tests](https://img.shields.io/badge/Tests-229%20passing-brightgreen)](https://github.com/vignesh2027/VORTEXRAG)
	[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/vignesh2027/VORTEXRAG/blob/main/LICENSE)

	---

	## The Problem Standard RAG Cannot Solve

	Ask "Why did Lehman Brothers collapse?"

	Standard RAG retrieves both the Dodd-Frank provisions (cosine 0.87, topically related but causally wrong) and the CDS mispricing mechanism (cosine 0.91, causally correct). The LLM sees both and hallucinates a policy-response narrative. This is Semantic Drift.

	Even when the right chunk is retrieved, 7 surrounding irrelevant chunks dilute the LLM's attention. This is Context Window Poisoning.

	VORTEXRAG solves both with a principled 7-layer pipeline.
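
The Semantic Drift failure described above can be illustrated with a minimal sketch. The chunk records, the `causal` flag, and the 0.85 cutoff are hypothetical and chosen only for illustration. The two cosine scores are the ones quoted in the example.

```python
# Hypothetical chunk records; only the cosine scores come from the example above.
chunks = [
    {"text": "CDS mispricing mechanism", "cosine": 0.91, "causal": True},
    {"text": "Dodd-Frank provisions", "cosine": 0.87, "causal": False},
]


def cosine_only_retrieve(candidates, threshold=0.85):
    """Keep every chunk whose cosine similarity clears the (assumed) threshold."""
    return [c for c in candidates if c["cosine"] >= threshold]


retrieved = cosine_only_retrieve(chunks)

# Chunks that passed on topical similarity alone, despite not being causally relevant.
drifted = [c["text"] for c in retrieved if not c["causal"]]
print(drifted)
```

Because both scores sit close together, a similarity cutoff alone cannot separate the causally correct chunk from the topically related one.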

	---

	## The 7 Layers

	| Layer | Name | Formula |
	|-------|------|---------|
	| 1 | TVE - Tri-Vector Encoding | score = alpha * cos_sem + beta * cos_syn + gamma * cos_cau |
	| 2 | VRC - Vortex Retrieval Cone | spiral = TVE * exp(-lambda * r) * cos(n * theta) |
	| 3 | SDC - Semantic Drift Corrector | SDS = 1 - tanh(norm(D) / tau) >= 0.72 |
	| 4 | CPG - Context Poison Guard | ESR = sum(S * w) / (P + eps) >= 3.5 (provably optimal) |
	| 5 | RFG - Rank Fusion Gate | Phi = TVE^alpha * SDS^beta * ESR^gamma |
	| 6 | CCB - Causal Context Builder | pos = rank(Phi) * causal_depth |
	| 7 | FV - Faithfulness Verifier | Delta_R = 1 - ROUGE-L * NLI <= 0.15 |
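
The formulas for layers 1 to 5 and 7 can be sketched in plain Python as follows. This is an illustrative reading of the table, not the project's implementation. The function names, the default weights, and the interpretation of `D` as a drift vector (passed here as its norm), `S` and `w` as per-chunk signal scores and weights, and `P` as a poison mass are all assumptions. Only the thresholds 0.72, 3.5, and 0.15 come from the table.

```python
import math

# Thresholds quoted in the table above.
SDS_MIN = 0.72
ESR_MIN = 3.5
DELTA_R_MAX = 0.15


def tve_score(cos_sem, cos_syn, cos_cau, alpha=0.5, beta=0.2, gamma=0.3):
    """Layer 1: weighted sum of the three cosine terms (weights are placeholders)."""
    return alpha * cos_sem + beta * cos_syn + gamma * cos_cau


def vortex_spiral(tve, r, theta, lam=1.0, n=1):
    """Layer 2: TVE damped by exp(-lambda * r) and modulated by cos(n * theta)."""
    return tve * math.exp(-lam * r) * math.cos(n * theta)


def sds(drift_norm, tau):
    """Layer 3: drift score, 1 - tanh(norm(D) / tau)."""
    return 1.0 - math.tanh(drift_norm / tau)


def esr(signals, weights, poison, eps=1e-8):
    """Layer 4: weighted signal sum divided by (P + eps)."""
    return sum(s * w for s, w in zip(signals, weights)) / (poison + eps)


def fusion_phi(tve, sds_value, esr_value, alpha=1.0, beta=1.0, gamma=1.0):
    """Layer 5: multiplicative fusion with exponents (placeholder values)."""
    return (tve ** alpha) * (sds_value ** beta) * (esr_value ** gamma)


def delta_r(rouge_l, nli):
    """Layer 7: faithfulness gap, 1 - ROUGE-L * NLI."""
    return 1.0 - rouge_l * nli


# Made-up inputs, used only to exercise the functions.
score = tve_score(0.91, 0.80, 0.90)
drift = sds(drift_norm=0.1, tau=0.5)
ratio = esr(signals=[0.9, 0.8], weights=[1.0, 1.0], poison=0.4)
phi = fusion_phi(score, drift, ratio)
gap = delta_r(rouge_l=0.95, nli=0.95)

print("passes drift gate:", drift >= SDS_MIN)
print("passes poison gate:", ratio >= ESR_MIN)
print("passes faithfulness gate:", gap <= DELTA_R_MAX)
```

The three gates are independent checks in this sketch. How the real pipeline combines them, and how layer 6 assigns positions, is not specified in the table.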

	---

	## Benchmark Results (v3.0)

	| System | EM | F1 | Faithfulness | Latency |
	|--------|----|----|--------------|---------|
	| VORTEXRAG | 74.8 | 82.6 | 0.94 | 185 ms |
	| Self-RAG | 68.4 | 77.1 | 0.81 | 410 ms |
	| CRAG | 66.9 | 75.8 | 0.79 | 320 ms |
	| Naive RAG | 61.2 | 69.4 | 0.71 | 95 ms |

	+13.6 EM over Naive RAG; Semantic Drift reduced by 61%; Context Poisoning reduced by 71%; 2.2x faster than Self-RAG
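
The EM gain and the speedup in the summary line follow directly from the benchmark table. The short check below recomputes them. The dictionary layout and key names are illustrative; the numbers are copied from the table.

```python
# Benchmark figures copied from the table; the key names are illustrative.
results = {
    "VORTEXRAG": {"em": 74.8, "f1": 82.6, "faithfulness": 0.94, "latency_ms": 185},
    "Self-RAG": {"em": 68.4, "f1": 77.1, "faithfulness": 0.81, "latency_ms": 410},
    "CRAG": {"em": 66.9, "f1": 75.8, "faithfulness": 0.79, "latency_ms": 320},
    "Naive RAG": {"em": 61.2, "f1": 69.4, "faithfulness": 0.71, "latency_ms": 95},
}

# Exact-match gain of VORTEXRAG over the Naive RAG baseline.
em_gain = round(results["VORTEXRAG"]["em"] - results["Naive RAG"]["em"], 1)

# Latency ratio between Self-RAG and VORTEXRAG.
speedup = round(
    results["Self-RAG"]["latency_ms"] / results["VORTEXRAG"]["latency_ms"], 1
)

print(f"EM gain over Naive RAG: +{em_gain}")
print(f"Speedup over Self-RAG: {speedup}x")
```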

	---

	## Ablation

	```
	Baseline: 61.2 EM / 0.68 Faithfulness
	+TVE: 65.3 EM (+4.1)
	+VRC: 67.8 EM (+2.5)
	+SDC: 70.4 EM (+2.6)
	+CPG: 72.1 EM (+1.7)
	+RFG: 73.4 EM (+1.3)
	+CCB: 73.9 EM (+0.5)
	+FV (full VORTEXRAG): 74.8 EM (+0.9)
	```
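
The per-layer gains in the ablation add up to the full-system score. The check below accumulates them, assuming the final row corresponds to adding the Faithfulness Verifier (FV), the only layer not listed by name.

```python
baseline_em = 61.2

# Per-layer EM gains from the ablation; "FV" for the last row is an assumption.
gains = [
    ("TVE", 4.1),
    ("VRC", 2.5),
    ("SDC", 2.6),
    ("CPG", 1.7),
    ("RFG", 1.3),
    ("CCB", 0.5),
    ("FV", 0.9),
]

running = baseline_em
for layer, gain in gains:
    # Round at each step to avoid floating-point noise in the printed totals.
    running = round(running + gain, 1)
    print(f"+{layer}: {running} EM (+{gain})")

total_gain = round(sum(gain for _, gain in gains), 1)
print(f"Total gain: +{total_gain} EM")
```

The running total ends at 74.8 EM, and the summed gain matches the +13.6 EM reported against the baseline.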

	---

	## 11 Domain Presets

	scientific (tau=0.30), medical (tau=0.35), legal (tau=0.40), cybersecurity (tau=0.45), financial (tau=0.50), code (tau=0.60), educational (tau=0.65), general (tau=0.80), historical (tau=0.90), customer (tau=0.95), creative (tau=1.20)
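
A sketch of how the presets could interact with the drift score, assuming `tau` is the temperature in the layer 3 formula `SDS = 1 - tanh(norm(D) / tau)` with its 0.72 threshold. The dictionary name, the function names, and the example drift norm are hypothetical. The tau values are the ones listed above.

```python
import math

# Tau values from the preset list; the dictionary itself is illustrative.
TAU_PRESETS = {
    "scientific": 0.30,
    "medical": 0.35,
    "legal": 0.40,
    "cybersecurity": 0.45,
    "financial": 0.50,
    "code": 0.60,
    "educational": 0.65,
    "general": 0.80,
    "historical": 0.90,
    "customer": 0.95,
    "creative": 1.20,
}


def sds(drift_norm, tau):
    """Drift score from the layer 3 formula: 1 - tanh(norm(D) / tau)."""
    return 1.0 - math.tanh(drift_norm / tau)


def passes_drift_check(drift_norm, domain, threshold=0.72):
    """Return True when the chunk's drift score clears the threshold for a domain."""
    return sds(drift_norm, TAU_PRESETS[domain]) >= threshold


# The same (made-up) drift norm evaluated under a strict and a lenient preset.
for domain in ("scientific", "general", "creative"):
    print(domain, passes_drift_check(0.15, domain))
```

Under this reading, a smaller tau makes the gate stricter: the same drift norm is rejected by the scientific preset but accepted by the general and creative presets.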

	---

	## Citation

	```bibtex
	@article{vignesh2026vortexrag,
	  title  = {VORTEXRAG: Vector Orthogonal Resonance-Tuned EXtraction RAG},
	  author = {Vignesh, L},
	  year   = {2026},
	  doi    = {10.5281/zenodo.20579702},
	  url    = {https://doi.org/10.5281/zenodo.20579702}
	}
	```

	Paper: https://doi.org/10.5281/zenodo.20579702
	Code: https://github.com/vignesh2027/VORTEXRAG
	ORCID: https://orcid.org/0009-0004-9777-7592