metadata
title: BeRU Chat - RAG Assistant
emoji: 🤖
colorFrom: indigo
colorTo: yellow
sdk: streamlit
app_file: app.py
pinned: false
short_description: 100% Offline RAG System with Mistral 7B and VLM2Vec

🤖 BeRU Chat - RAG Assistant

An offline Retrieval-Augmented Generation (RAG) system that combines the Mistral 7B LLM with VLM2Vec embeddings for document search and multi-turn conversation. After the initial model download, it runs without an internet connection.

✨ Features

  • 🔒 100% Offline Operation - No internet required once the models have been downloaded on first startup
  • 🧠 Advanced RAG Architecture
    • Hybrid retrieval (Vector + BM25 keyword search)
    • Ensemble retriever combining multiple strategies
    • Re-ranking with FlashRank for relevance
    • Multi-turn conversation with history awareness
  • ⚑ Optimized Performance
    • 4-bit quantization with BitsAndBytes
    • Flash Attention 2 support
    • FAISS vector indexing
  • 📚 Source Citations - Every answer cites original sources
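
To illustrate how an ensemble retriever can merge vector and BM25 results into one ranking, here is a minimal sketch of weighted reciprocal rank fusion (RRF). Whether BeRU fuses results this way is an assumption; the function name, the document IDs, the equal weights, and the constant `k = 60` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    weights: Sequence[float],
    k: int = 60,
) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns weight / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists (best match first), standing in for real retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from the FAISS index
bm25_hits = ["doc_c", "doc_a", "doc_d"]    # e.g. from BM25 keyword search

fused = reciprocal_rank_fusion([vector_hits, bm25_hits], weights=[0.5, 0.5])
print(fused)
```

In this toy input, `doc_a` and `doc_c` appear in both lists and therefore outrank `doc_b` and `doc_d`, which each appear in only one. A re-ranker such as FlashRank would then typically reorder the top fused candidates before they reach the LLM.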

🎯 Models Used

| Component | Model | Details |
| --- | --- | --- |
| LLM | Mistral-7B-Instruct-v0.3 | 7B parameters |
| Embedding | VLM2Vec-Qwen2VL-2B | 2B parameters |
| Vector Store | FAISS | Meta's similarity search library |

🚀 Getting Started

  1. Wait for Models - First load takes 5-8 minutes (models download from HF Hub)
  2. Upload Documents - Add PDFs or text files for RAG
  3. Ask Questions - Chat with context-aware answers
  4. Get Sources - Each answer includes citations

💻 System Requirements

  • GPU: A10G (24GB VRAM) recommended
  • RAM: 16GB minimum
  • Cold Start: ~5-8 minutes (first time)
  • Runtime: Streamlit app on port 7860
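
As a rough sanity check on the GPU recommendation, the sketch below estimates the memory needed for model weights alone. It uses the nominal parameter counts from the models table (7B and 2B) and assumes the embedding model is held in 16-bit precision, which this README does not state. It ignores the KV cache, activations, quantization metadata, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1024**3


# Nominal parameter counts; actual checkpoints differ slightly.
LLM_PARAMS = 7e9
EMBED_PARAMS = 2e9

llm_fp16 = weight_memory_gib(LLM_PARAMS, 16)      # unquantized baseline
llm_4bit = weight_memory_gib(LLM_PARAMS, 4)       # 4-bit quantized weights
embed_fp16 = weight_memory_gib(EMBED_PARAMS, 16)  # assumption: 16-bit embeddings

print(f"LLM fp16:   {llm_fp16:.1f} GiB")
print(f"LLM 4-bit:  {llm_4bit:.1f} GiB")
print(f"Embed fp16: {embed_fp16:.1f} GiB")
print(f"4-bit LLM + embeddings: {llm_4bit + embed_fp16:.1f} GiB")
```

Under these assumptions the 4-bit LLM weights take about a quarter of the 16-bit footprint (roughly 3.3 GiB instead of 13 GiB), which leaves room on a 24GB card for the embedding model, the KV cache, and the FAISS index.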

📖 Documentation

For more information, visit the GitHub repository