	---
	title: Mini Rag App
	emoji: 📈
	colorFrom: pink
	colorTo: red
	sdk: gradio
	sdk_version: 6.3.0
	app_file: app.py
	pinned: false
	thumbnail: >-
	  https://cdn-uploads.huggingface.co/production/uploads/696cb435ea65e4b95276706e/yKmxaQF3FkZuUQgDM3pk-.png
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference



	A simple end-to-end RAG system built with FastAPI, Hugging Face models, a Pinecone vector database, and a Cohere reranker.
	Users can upload text, ask questions, and receive answers grounded in the retrieved context, with visible citations.

	Chunking Parameters
	Chunk size = 800
	Overlap = 80
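The chunking parameters above can be illustrated with a minimal sketch. The README does not say whether the sizes are measured in characters or tokens, so this example assumes characters; `chunk_text` is a hypothetical helper name, not taken from the app's code.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 80) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Assumption: sizes are counted in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With these defaults, consecutive chunks start 720 characters apart, so the last 80 characters of one chunk are repeated at the start of the next.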


	Vector Database
	Provider: Pinecone
	Index dimension: 384

	Top-k retrieval: k = 10
	Similarity metric: cosine similarity
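To make the retrieval step concrete, here is a small in-memory sketch of cosine-similarity top-k search. In the app this ranking is presumably performed by Pinecone on the server side; the functions `cosine_similarity` and `top_k` below are illustrative stand-ins and do not use the Pinecone client.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          vectors: dict[str, list[float]],
          k: int = 10) -> list[tuple[str, float]]:
    """Return the k (id, score) pairs with the highest cosine similarity."""
    scored = [(vec_id, cosine_similarity(query_vec, vec))
              for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the real index each vector would have 384 components, matching the index dimension listed above; the logic is the same for any dimension.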

	Reranking
	Provider: Cohere
	Top-N kept after reranking: N = 5
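The retrieve-then-rerank flow can be sketched as follows. The scoring function is injected so the example runs without an API key; `word_overlap` is a toy stand-in for the relevance score that the Cohere reranker would return, and both function names are hypothetical.

```python
from typing import Callable


def rerank(query: str,
           candidates: list[str],
           score_fn: Callable[[str, str], float],
           top_n: int = 5) -> list[tuple[int, str, float]]:
    """Score each candidate against the query and keep the top_n.

    Returns (original_index, text, score) tuples, best first, so the
    original retrieval position stays available for citations.
    """
    scored = [(i, doc, score_fn(query, doc)) for i, doc in enumerate(candidates)]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


def word_overlap(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words present in the document.

    Assumption: used only for illustration in place of a hosted reranker.
    """
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0
```

With the settings listed above, the 10 chunks returned by the vector search would be passed as `candidates` and the best 5 would be forwarded to the LLM.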

	LLM
	Provider: Hugging Face (HF)
	Model: google/flan-t5-small
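One way to get answers with visible citations is to number the reranked chunks in the prompt and ask the model to cite them. The sketch below shows that idea only; the prompt wording and the `build_prompt` helper are assumptions, not the app's actual prompt.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt with numbered context passages.

    Assumption: passages are cited by their 1-based position, e.g. [1].
    """
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the numbered context below. "
        "Cite the passages you use by number, e.g. [1].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The resulting string would then be passed to the text-to-text model named above. Small instruction-tuned models have short input limits, so the number and length of chunks included may need to be trimmed.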

	User Interface
	Built using HTML inside FastAPI


	Remark:
	OpenAI models were initially used as the LLM for answer generation. They were dropped after the free-tier credits ran out and API rate limits were hit.
	The system was then migrated to a free Hugging Face LLM (google/flan-t5-base).
	Tradeoffs observed:
	Reduced answer fluency and coherence
	Occasionally shorter or less precise responses