---
title: BeRU Chat - RAG Assistant
emoji: πŸ€–
colorFrom: indigo
colorTo: yellow
sdk: streamlit
app_file: app.py
pinned: false
short_description: 100% Offline RAG System with Mistral 7B and VLM2Vec
---
# πŸ€– BeRU Chat - RAG Assistant
A **100% offline Retrieval-Augmented Generation (RAG) system** that pairs the Mistral 7B LLM with VLM2Vec embeddings for intelligent document search and conversation.
## ✨ Features
- πŸ”’ **100% Offline Operation** - No internet required after startup
- 🧠 **Advanced RAG Architecture**
- Hybrid retrieval (Vector + BM25 keyword search)
- Ensemble retriever combining multiple strategies
- Re-ranking with FlashRank for relevance
- Multi-turn conversation with history awareness
- ⚑ **Optimized Performance**
- 4-bit quantization with BitsAndBytes
- Flash Attention 2 support
- FAISS vector indexing
- πŸ“š **Source Citations** - Every answer cites original sources
## 🎯 Models Used
| Component | Model | Details |
|-----------|-------|---------|
| **LLM** | Mistral-7B-Instruct-v0.3 | 7B parameters |
| **Embedding** | VLM2Vec-Qwen2VL-2B | 2B parameters |
| **Vector Store** | FAISS | Meta's similarity-search library |
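For reference, loading a 7B model with the 4-bit quantization and Flash Attention 2 options listed under Features typically looks like the sketch below. This is a configuration example using the standard `transformers` + `bitsandbytes` API, not necessarily BeRU's exact settings; it requires a CUDA GPU and is not runnable as-is on CPU.

```python
# Sketch: 4-bit (NF4) loading of the LLM from the table above.
# Flags are typical defaults, not confirmed to match BeRU's app.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

model_id = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                          # place layers on the GPU
    attn_implementation="flash_attention_2",    # only if flash-attn is installed
)
```

4-bit NF4 weights are what let a 7B-parameter model fit comfortably in the A10G's 24 GB VRAM alongside the embedding model.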
## πŸš€ Getting Started
1. **Wait for Models** - First load takes 5-8 minutes (models download from HF Hub)
2. **Upload Documents** - Add PDFs or text files for RAG
3. **Ask Questions** - Chat with context-aware answers
4. **Get Sources** - Each answer includes citations
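Behind step 2, uploaded documents are split into overlapping chunks before being embedded and indexed, so that each retrieved passage is small enough to cite precisely. A minimal character-based sketch of that preparation step is below; the function name, chunk size, and overlap are illustrative assumptions, and BeRU's real splitter settings may differ.

```python
# Illustrative sketch of the document-preparation step: text is split
# into overlapping chunks before embedding. Sizes are made up here.

def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap,
    so sentences straddling a boundary appear intact in some chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 500-character document yields 4 overlapping chunks with these settings:
doc = "x" * 500
print(len(chunk_text(doc)))  # → 4
```

Each chunk keeps a pointer to its source file, which is what makes the per-answer citations in step 4 possible.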
## πŸ’» System Requirements
- **GPU**: A10G (24GB VRAM) recommended
- **RAM**: 16GB minimum
- **Cold Start**: ~5-8 minutes (first time)
- **Runtime**: Streamlit app on port 7860
## πŸ“– Documentation
For more information, see the [GitHub repository](https://github.com/AnwinJosy/BeRU).