| title: Mini Rag App | |
| emoji: ๐ | |
| colorFrom: pink | |
| colorTo: red | |
| sdk: gradio | |
| sdk_version: 6.3.0 | |
| app_file: app.py | |
| pinned: false | |
| thumbnail: >- | |
| https://cdn-uploads.huggingface.co/production/uploads/696cb435ea65e4b95276706e/yKmxaQF3FkZuUQgDM3pk-.png | |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |
| A simple end-to-end RAG system built using FastAPI, Hugging Face models, Pinecone vector database, and Cohere reranker. | |
| The application allows users to upload text, ask questions, and receive answers grounded in retrieved context with visible citations. | |
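One way the "answers grounded in retrieved context with visible citations" step could work is to number each retrieved passage and ask the model to cite those numbers. The sketch below is only an illustration: the function name `build_prompt`, the prompt wording, and the `[n]` citation format are assumptions, not taken from this app's code.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Build an LLM prompt whose context passages carry numeric citation tags.

    Hypothetical helper: the wording and the [n] format are assumptions.
    """
    # Number passages from 1 so the answer can refer to them as [1], [2], ...
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "Cite the passages you used as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The same numbered passages can then be shown next to the answer so each `[n]` in the output maps to a visible source.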
| Chunking Parameters | |
| chunk size = 800 | |
| overlap = 80 | |
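The parameters above can be read as a sliding window over the input text. The following is a minimal sketch, assuming the sizes are measured in characters (the README does not say whether they are characters or tokens); `chunk_text` is a hypothetical name, not the app's actual function.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 80) -> list[str]:
    """Split text into windows of chunk_size characters that share
    `overlap` characters with the previous window.

    Assumption: sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # 720 with the defaults
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text,
        # so no trailing chunk consists only of already-covered overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, each new chunk starts 720 characters after the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk if it is shorter than the overlap.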
| Vector Database | |
| Provider: Pinecone | |
| Index dimension: 384 | |
| Top-k retrieval: k = 10 | |
| Similarity metric: cosine similarity | |
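To make the retrieval settings concrete, here is a small local illustration of cosine-similarity top-k search. It does not use the Pinecone client; Pinecone performs this ranking server-side. The names `cosine_similarity` and `top_k` and the in-memory dictionary are assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float],
          vectors: dict[str, list[float]],
          k: int = 10) -> list[tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(vid, cosine_similarity(query, vec)) for vid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the app, the query and stored vectors would each have 384 components to match the index dimension; the toy vectors used to try this out can be any equal length.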
| Reranking | |
| Provider: Cohere | |
| Top-N kept after reranking: N = 5 | |
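The reranking stage takes the 10 retrieved chunks, rescores each against the query, and keeps the best 5. The sketch below shows that shape with a pluggable scorer. It does not call Cohere: `word_overlap_score` is a toy stand-in for the reranker, and both function names are hypothetical.

```python
from typing import Callable


def word_overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words present in the document.

    Stand-in for a real reranker model; an assumption for illustration only.
    """
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def rerank(query: str,
           candidates: list[str],
           score_fn: Callable[[str, str], float] = word_overlap_score,
           top_n: int = 5) -> list[tuple[int, str, float]]:
    """Rescore candidates and return the top_n as (original_index, text, score)."""
    scored = [(i, doc, score_fn(query, doc)) for i, doc in enumerate(candidates)]
    # sort() is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]
```

Keeping the original index in the result makes it straightforward to map each reranked passage back to its source chunk when displaying citations.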
| LLM | |
| Provider : Hugging Face (HF) | |
| Model: google/flan-t5-small | |
| User Interface | |
| Built using HTML inside FastAPI | |
| Remark: | |
| Initially, OpenAI models were used as the LLM for answer generation. They were dropped after the free-tier credits ran out and API rate limits were reached. | |
| The system was migrated to a free Hugging Face LLM (google/flan-t5-base). | |
| Tradeoffs observed: | |
| Reduced answer fluency and coherence | |
| Occasionally shorter or less precise responses |