---
title: Mini Rag App
emoji: 📈
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/696cb435ea65e4b95276706e/yKmxaQF3FkZuUQgDM3pk-.png
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
A simple end-to-end RAG system built using FastAPI, Hugging Face models, Pinecone vector database, and Cohere reranker.
The application allows users to upload text, ask questions, and receive answers grounded in retrieved context with visible citations.
Chunking Parameters
Chunk size = 800
Overlap = 80
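The chunking parameters above can be illustrated with a minimal sketch. It assumes the sizes are measured in characters (the README does not say whether they are characters or tokens), and `chunk_text` is a hypothetical helper name, not necessarily what app.py uses.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 80) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Assumption: sizes are counted in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # each window starts 720 characters after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With these defaults, the last 80 characters of each chunk are repeated at the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.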
Vector Database
Provider: Pinecone
Index dimension: 384
Top-k retrieval: k = 10
Similarity metric: cosine similarity
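To make the retrieval step concrete, here is a small local sketch of cosine-similarity top-k search. In the application this ranking is performed by Pinecone itself, so the code below is only an illustration of the metric and the cutoff; `cosine_similarity` and `top_k` are hypothetical names, and the toy vectors are 2-dimensional rather than 384-dimensional.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 10) -> list[tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For example, `top_k([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}, k=2)` ranks `"a"` first and `"c"` second.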
Reranking
Provider: Cohere
Top-N kept after reranking: N = 5
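The reranking stage takes the 10 retrieved chunks and keeps the 5 most relevant. The sketch below shows only that selection logic. The relevance scores come from Cohere in the application; here `score_fn` is a hypothetical stand-in, and `word_overlap` is a toy scorer used purely so the example runs offline.

```python
from typing import Callable


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 5,
) -> list[tuple[int, str, float]]:
    """Score each candidate against the query and keep the top_n.

    Returns (original_index, text, score) tuples, best first.
    """
    scored = [(i, doc, score_fn(query, doc)) for i, doc in enumerate(candidates)]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


def word_overlap(query: str, doc: str) -> float:
    """Toy relevance score (assumption): fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0
```

Keeping the original index alongside each chunk makes it possible to map a reranked passage back to its source when citations are displayed.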
LLM
Provider: Hugging Face (HF)
Model: google/flan-t5-small
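Since answers are meant to be grounded in retrieved context with visible citations, one plausible way to prepare the LLM input is to number each passage and ask the model to cite those numbers. The following is a sketch under that assumption; `build_prompt` and the prompt wording are hypothetical and may differ from what app.py actually sends to the model.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt with numbered context so the answer can cite [n]."""
    context = "\n".join(f"[{i}] {passage}" for i, passage in enumerate(passages, start=1))
    return (
        "Answer the question using only the numbered context below. "
        "Cite sources as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string would then be passed to the text-generation model, and the same numbering can be reused in the UI to show which passage supports each statement.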
User Interface
Built using HTML inside FastAPI
Remark:
OpenAI models were initially used as the LLM for answer generation. After the free-tier credits were exhausted and API rate limits were hit, they were dropped.
The system was migrated to a free Hugging Face LLM (google/flan-t5-base).
Tradeoffs observed:
Reduced answer fluency and coherence
Occasionally shorter or less precise responses