---
title: IPARD III RAG
emoji: 🌾
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
app_port: 7860
---

# IPARD III Document Assistant 🤖🌾

A production-grade Retrieval-Augmented Generation (RAG) system built specifically for IPARD III program documents. Using hybrid search, contextual retrieval, and intelligent query routing, it delivers accurate, context-aware answers about IPARD III grants, measures, and application processes.



## ✨ Features

### Retrieval

- **Hybrid Search:** BM25 keyword search + Turkish-E5-Large semantic embeddings
- **Reciprocal Rank Fusion (RRF):** Intelligently merges keyword and semantic results
- **Cross-Encoder Reranking:** BGE-reranker-base for precision scoring (0.95+ scores)
- **Neighboring Chunks:** Previous and next chunks are included alongside retrieved chunks for fuller context
- **Contextual Retrieval (Anthropic method):** Each chunk is enriched with a 2-3 sentence LLM-generated context before embedding, reducing retrieval failures by ~67%
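The RRF merge step can be sketched in a few lines. This is a minimal illustration of the standard RRF formula (score = Σ 1/(k + rank), with the conventional k = 60), not the project's actual implementation; the function name and sample IDs are assumptions:

```python
def rrf_merge(ranked_lists, k=60):
    """Fuse several ranked lists of chunk IDs via Reciprocal Rank Fusion."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1/(k + rank) for every doc it ranked.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["c3", "c1", "c7"]    # hypothetical BM25 ranking
dense_hits = ["c1", "c5", "c3"]   # hypothetical semantic ranking
print(rrf_merge([bm25_hits, dense_hits]))  # → ['c1', 'c3', 'c5', 'c7']
```

Documents ranked well by both retrievers (here `c1` and `c3`) float to the top, which is why RRF is a robust way to combine lexical and semantic results without score normalization.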

### Routing & Intelligence

- **Query Router:** Llama-3.1-8b-instant automatically detects the relevant measure and document type from the user's question, even when no filter is selected
- **Identity Bypass:** Meta questions like "Who are you?" skip the RAG pipeline and are answered directly from the system prompt
- **Metadata Filtering:** Filter by measure (101/103/201/202/302) and document type
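A typical pattern for this kind of router is to have the LLM emit a small JSON filter that is parsed before retrieval. The sketch below shows that shape; the field names and the fallback behavior are assumptions for illustration, not the project's actual schema:

```python
import json

def parse_router_reply(reply: str) -> dict:
    """Parse the router LLM's JSON filter; fall back to no filter on bad output."""
    try:
        out = json.loads(reply)
        return {k: out.get(k) for k in ("measure", "doc_type")}
    except json.JSONDecodeError:
        # Malformed router output: retrieve without metadata filters.
        return {"measure": None, "doc_type": None}

print(parse_router_reply('{"measure": "302", "doc_type": "sector_sheet"}'))
```

Falling back to an unfiltered search on parse failure keeps the pipeline robust when the router produces malformed output.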

### System

- **Real-time Streaming:** Token-by-token responses via Server-Sent Events (SSE)
- **Dynamic System Prompt:** Update LLM behavior via `system_prompt.txt` without touching code
- **Router Prompt:** Full measure/sector/sub-sector hierarchy defined in `router_prompt.txt`
- **Turkish-Optimized:** Turkish-E5-Large asymmetric model with passage/query prefix support
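Asymmetric E5 models expect a role prefix on every input: questions are encoded as `query: ...` and documents as `passage: ...`. A small helper (illustrative, not the project's code) makes the convention concrete:

```python
def add_e5_prefix(texts, role):
    """Prepend the E5 role prefix ('query' or 'passage') before encoding."""
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    return [f"{role}: {t}" for t in texts]

print(add_e5_prefix(["IPARD 101 destek oranı nedir?"], "query"))
# → ['query: IPARD 101 destek oranı nedir?']
```

Skipping these prefixes is a common source of silent quality loss with E5-family models, since the encoder was trained with them.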

πŸ—οΈ Architecture

```
User Question
      ↓
┌─────────────────────────────┐
│        Query Router         │
│   (Llama-3.1-8b-instant)    │
│  measure + doc_type detect  │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│      Hybrid Retrieval       │
│  BM25 (20) + Semantic (20)  │
│  → RRF Merge                │
│  → Cross-Encoder Rerank (5) │
│  + Neighboring Chunks       │
└──────────────┬──────────────┘
               ↓
┌─────────────────────────────┐
│       LLM Generation        │
│   GPT-OSS 120B (Groq API)   │
│   + Dynamic System Prompt   │
└─────────────────────────────┘
```
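The final generation step streams tokens over SSE. The generator below sketches only the wire format, each token wrapped in a `data: ...\n\n` frame; the real endpoint would wrap such a generator in FastAPI's `StreamingResponse` with `media_type="text/event-stream"`. Token values and the `[DONE]` sentinel are illustrative:

```python
def sse_frames(tokens):
    """Wrap each token in an SSE frame ('data: <payload>\\n\\n')."""
    for tok in tokens:
        yield f"data: {tok}\n\n"
    # A sentinel frame lets the client know the stream has finished.
    yield "data: [DONE]\n\n"

print("".join(sse_frames(["IPARD", " III"])))
```

The blank line after each `data:` field is what delimits SSE events, so the client can render tokens as soon as each frame arrives.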

## 🚀 Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Frontend | Streamlit | Web interface |
| Backend API | FastAPI | REST endpoints & streaming |
| Vector Database | ChromaDB | Embedding storage & retrieval |
| Keyword Search | BM25 (rank-bm25) | Lexical search |
| Embeddings | Turkish-E5-Large | Semantic text representation |
| Reranking | BGE-reranker-base | Precision improvement |
| Query Router | Llama-3.1-8b-instant (Groq) | Automatic measure detection |
| LLM | GPT-OSS 120B (Groq) | Answer generation |
| Deployment | Docker + HuggingFace Spaces | Containerized hosting |

## 📊 Performance

- **Documents:** 75+ Turkish IPARD III PDFs
- **Chunks:** 7,717 segments
- **Embedding Dimension:** 1,024
- **Retrieval Pipeline:** 20 BM25 + 20 Semantic → RRF → 5 Reranked + Neighboring Chunks
- **Contextual Enrichment:** LLM-generated context prepended to each chunk via DeepSeek API
- **Response Time:** ~2-5 seconds
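The "+ Neighboring Chunks" step of the pipeline can be sketched as a simple index expansion: for each reranked chunk, also pull its predecessor and successor for fuller context. This is an illustrative sketch assuming chunks are stored in document order; the function name and indices are not the project's actual code:

```python
def with_neighbors(indices, total):
    """Expand each chunk index with its previous and next neighbor, in order."""
    out = []
    for i in indices:
        for j in (i - 1, i, i + 1):
            # Stay inside the corpus and avoid duplicate chunks.
            if 0 <= j < total and j not in out:
                out.append(j)
    return out

print(with_neighbors([5, 42], total=7717))  # → [4, 5, 6, 41, 42, 43]
```

Pulling neighbors trades a slightly larger prompt for context that often spans chunk boundaries (e.g. a rule and its exception split across two segments).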

πŸ› οΈ Setup

### Requirements

- Python 3.11+
- Groq API key

### Local Setup

```bash
# Clone the repository
git clone https://github.com/denizkaya20/IPARD-AI-Assistant.git
cd IPARD-AI-Assistant

# Install dependencies
pip install -r requirements.txt

# Configure the Groq API key
echo "GROQ_API_KEY=your_key_here" > .env

# Backend
uvicorn src.api:app --reload --port 8000

# Frontend
streamlit run src/app.py --server.port 7860
```

### Docker

```bash
docker build -t ipard-rag .
docker run -p 7860:7860 -e GROQ_API_KEY=your_key_here ipard-rag
```

## 🌐 HuggingFace Spaces Configuration

### Required Secrets

| Secret | Description |
|---|---|
| `GROQ_API_KEY` | Groq API key (used for LLM + Query Router) |

### Key Files

| File | Description |
|---|---|
| `src/system_prompt.txt` | LLM identity, measure info, response rules |
| `src/router_prompt.txt` | Full measure/sector/sub-sector hierarchy for routing |
| `data/all_chunks.json` | Contextually enriched document chunks |
| `data/embeddings.npy` | Pre-computed embedding vectors |

## ⚠️ Disclaimer

This is an independent project with no affiliation to TKDK or any official institution. Responses are AI-generated and do not constitute official guidance. Always refer to tkdk.gov.tr for authoritative information.


πŸ™ Acknowledgments

- **TKDK:** IPARD III documentation
- **Groq:** LLM API
- **Hugging Face:** Model hosting and Spaces
- **Anthropic:** Contextual Retrieval method