---
title: Mini Rag App
emoji: 📈
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/696cb435ea65e4b95276706e/yKmxaQF3FkZuUQgDM3pk-.png
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

A simple end-to-end RAG system built using FastAPI, Hugging Face models, Pinecone vector database, and Cohere reranker. The application allows users to upload text, ask questions, and receive answers grounded in retrieved context with visible citations.

Chunking parameters: chunk size = 800, overlap = 80
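To make the chunking parameters concrete, here is a minimal sliding-window sketch. It assumes the sizes are measured in characters (the README does not say whether they are characters or tokens), and `chunk_text` is a hypothetical helper name, not necessarily what `app.py` uses.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 80) -> list[str]:
    """Split text into windows of `chunk_size` characters that overlap by `overlap`.

    Assumption: sizes are counted in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # 720 with the values listed above
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With these defaults each chunk starts 720 characters after the previous one, so the last 80 characters of one chunk are repeated at the start of the next.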

Vector database provider: Pinecone. Index dimension: 384

Top-k retrieval: k = 10, with cosine similarity used for matching
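For illustration only, the retrieval step can be sketched locally in plain Python. In the app itself Pinecone performs this search server-side; the functions `cosine_similarity` and `top_k` below are hypothetical names used to show what "top-k by cosine similarity" computes.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 10) -> list[tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the real system the vectors would have 384 components to match the index dimension; the logic is the same for any length.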

Reranking provider: Cohere. Top-N kept after reranking: 5
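The two-stage flow (retrieve 10 candidates, rerank, keep the best 5) can be sketched as follows. The scoring function is passed in so that a real reranker call can be plugged in; `word_overlap` is a toy stand-in for demonstration and is not Cohere's model, and both function names are hypothetical.

```python
from typing import Callable


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 5,
) -> list[tuple[int, str, float]]:
    """Score every candidate against the query and keep the top_n best.

    Returns (original_index, text, score) tuples, best first.
    """
    scored = [(i, text, score_fn(query, text)) for i, text in enumerate(candidates)]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & set(text.lower().split())) / len(query_words)
```

Keeping the original index in each result makes it straightforward to map a reranked passage back to its source chunk when showing citations.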

LLM provider: Hugging Face (HF). Model: google/flan-t5-small
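One way to get answers grounded in retrieved context with visible citations is to number the reranked passages in the prompt sent to the model. The sketch below is an assumption about how such a prompt could be assembled; the template wording and the `build_prompt` name are hypothetical and may differ from what `app.py` does.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Build a prompt whose context passages are numbered [1], [2], ... for citation."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the numbered context. "
        "Cite the passages you use as [n].\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The same numbers can then be shown next to the source passages in the UI, so that a citation in the answer points to a visible chunk.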

User interface: built using HTML inside FastAPI


Remark: Initially, OpenAI models were used as the LLM for answer generation. After the free-tier credits ran out and API rate limits were hit, the system was migrated to a free Hugging Face LLM (google/flan-t5-base). Tradeoffs observed: reduced answer fluency and coherence, and occasionally shorter or less precise responses.