---
title: BeRU Chat - RAG Assistant
emoji: 🤖
colorFrom: indigo
colorTo: yellow
sdk: streamlit
app_file: app.py
pinned: false
short_description: 100% Offline RAG System with Mistral 7B and VLM2Vec
---
# 🤖 BeRU Chat - RAG Assistant

A 100% offline Retrieval-Augmented Generation (RAG) system that combines the Mistral-7B-Instruct LLM with VLM2Vec embeddings for intelligent document search and conversation.
## ✨ Features

- 🔌 100% Offline Operation - no internet access required after startup
- 🧠 Advanced RAG Architecture
  - Hybrid retrieval (vector + BM25 keyword search)
  - Ensemble retriever combining multiple strategies
  - Re-ranking with FlashRank for relevance
  - Multi-turn conversation with history awareness
- ⚡ Optimized Performance
  - 4-bit quantization with BitsAndBytes
  - Flash Attention 2 support
  - FAISS vector indexing
- 📄 Source Citations - every answer cites its original sources
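The hybrid-retrieval idea above can be illustrated with reciprocal-rank fusion, a common way to merge vector and BM25 rankings. How `app.py` actually combines its retrievers is not shown here, so treat this as an illustrative sketch rather than BeRU's exact fusion logic:

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc IDs with reciprocal-rank fusion.

    Each ranking is a best-first list of doc IDs; a document's fused score
    is the sum of 1 / (k + rank) over every list it appears in, so items
    ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Vector search and BM25 each return their own best-first ranking;
# doc1 wins because both retrievers rank it near the top.
vector_hits = ["doc3", "doc1", "doc7"]
bm25_hits = ["doc1", "doc5", "doc3"]
print(rrf_fuse([vector_hits, bm25_hits]))  # → ['doc1', 'doc3', 'doc5', 'doc7']
```

A re-ranker such as FlashRank would then rescore only this fused shortlist, which keeps the expensive relevance model off the full corpus.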
## 🎯 Models Used

| Component | Model | Details |
|---|---|---|
| LLM | Mistral-7B-Instruct-v0.3 | 7B parameters |
| Embedding | VLM2Vec-Qwen2VL-2B | 2B parameters |
| Vector Store | FAISS | Meta's similarity search library |
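Loading Mistral-7B with the 4-bit quantization mentioned under Features boils down to a `BitsAndBytesConfig` passed at load time. The parameter choices below (NF4, bfloat16 compute, double quantization) are typical defaults, not necessarily the exact values in `app.py`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit weights with bfloat16 compute; these knobs are common
# choices for 7B models on a 24 GB GPU, not BeRU's confirmed settings.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model_id = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                        # requires a CUDA GPU
    attn_implementation="flash_attention_2",  # only if flash-attn is installed
)
```

In 4-bit NF4 the 7B weights occupy roughly 4 GB of VRAM instead of ~14 GB in fp16, which is what makes a single A10G workable alongside the embedding model.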
## 🚀 Getting Started

1. Wait for models - the first load takes 5-8 minutes while the models download from the HF Hub
2. Upload documents - add PDF or text files for RAG
3. Ask questions - chat with context-aware answers
4. Get sources - each answer includes citations
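Before uploaded documents can be indexed in FAISS, they are split into overlapping chunks so that passages straddling a boundary still appear intact in at least one chunk. A minimal sketch of that step (the chunk sizes here are illustrative, not BeRU's actual settings):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size character chunks with overlap.

    Consecutive chunks share `overlap` characters, so a sentence cut
    at one boundary is still whole in the neighbouring chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 1200, chunk_size=500, overlap=100)
print(len(chunks))  # → 3
```

Each chunk is then embedded with VLM2Vec and stored in the FAISS index; retrieval returns chunks, and the citation step maps them back to their source files.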
## 💻 System Requirements

- GPU: A10G (24 GB VRAM) recommended
- RAM: 16 GB minimum
- Cold start: ~5-8 minutes (first run only)
- Runtime: Streamlit app served on port 7860
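On Spaces the app is launched automatically on port 7860; to run it locally, an equivalent invocation (assuming a conventional `requirements.txt`, which is not shown in this README) would be:

```shell
pip install -r requirements.txt  # hypothetical dependency file
streamlit run app.py --server.port 7860 --server.address 0.0.0.0
```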
## 📚 Documentation

For more information, see the project's GitHub repository.