sinhapiyush86 committed on
Commit 192b2d2 · verified · 1 Parent(s): bd75c88

Upload 18 files
.gitattributes CHANGED
@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ RIL-Q1-FY2024-25.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q1-FY2025-26.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q2-FY2023-24.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q2-FY2024-25.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q3-FY2023-24.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q3-FY2024-25.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q4-FY2023-24.pdf filter=lfs diff=lfs merge=lfs -text
+ RIL-Q4-FY2024-25.pdf filter=lfs diff=lfs merge=lfs -text
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,283 @@
# 🚀 Hugging Face Spaces Deployment Guide (Docker + Streamlit)

This guide will walk you through deploying your RAG system to Hugging Face Spaces using **Docker with Streamlit**.

## 📋 Prerequisites

- A Hugging Face account
- All files from the `huggingface_deploy/` folder
- Basic understanding of Docker (optional)

## 🎯 Step-by-Step Deployment

### Step 1: Create a New Space

1. **Go to Hugging Face Spaces:**
   - Visit [https://huggingface.co/spaces](https://huggingface.co/spaces)
   - Click "Create new Space"

2. **Configure your Space:**
   - **Owner**: Choose your username or organization
   - **Space name**: Choose a unique name (e.g., `my-rag-system`)
   - **License**: Choose an appropriate license (e.g., MIT)
   - **SDK**: Select **Docker**
   - **Visibility**: Choose Public or Private
   - **Hardware**: Select appropriate hardware (CPU is sufficient for basic usage)

3. **Click "Create Space"**

### Step 2: Upload Files

#### Option A: Using Git (Recommended)

1. **Clone your Space repository:**
   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   ```

2. **Copy files from the deployment folder:**
   ```bash
   cp -r ../huggingface_deploy/* .
   ```

3. **Commit and push:**
   ```bash
   git add .
   git commit -m "Initial RAG system deployment with Docker"
   git push
   ```

#### Option B: Using the Web Interface

1. **Upload files manually:**
   - Go to your Space's "Files" tab
   - Click "Add file" → "Upload files"
   - Upload all files from the `huggingface_deploy/` folder:
     - `app.py`
     - `rag_system.py`
     - `pdf_processor.py`
     - `requirements.txt`
     - `Dockerfile`
     - `.dockerignore`
     - `README.md`

### Step 3: Configure the Space

1. **Set up environment variables (optional):**
   - Go to your Space's "Settings" tab
   - Add environment variables if needed:
     ```
     EMBEDDING_MODEL=all-MiniLM-L6-v2
     GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct
     ```

2. **Configure hardware (if needed):**
   - Go to "Settings" → "Hardware"
   - Select appropriate hardware based on your needs

### Step 4: Deploy and Test

1. **Wait for deployment:**
   - Hugging Face will automatically build and deploy your Docker container
   - This may take 10-15 minutes for the first deployment (model downloads)

2. **Test your application:**
   - Visit your Space URL: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
   - Upload a PDF document
   - Ask questions to test the RAG system

## 🔧 Docker Configuration

### Dockerfile Features

- **Base Image**: Python 3.10 slim
- **System Dependencies**: build-essential, curl
- **Health Check**: Monitors the Streamlit health endpoint
- **Environment Variables**: Configured for Streamlit
- **Port**: Exposes port 8501

### Local Docker Testing

You can test the Docker build locally:

```bash
# Build the Docker image
docker build -t rag-system .

# Run the container
docker run -p 8501:8501 rag-system

# Or use docker-compose
docker-compose up --build
```

## 🔧 Configuration Options

### Environment Variables

You can customize your deployment by setting these environment variables in your Space settings:

```bash
# Model configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct

# Chunk sizes
CHUNK_SIZES=100,400

# Vector store path
VECTOR_STORE_PATH=./vector_store

# Streamlit configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=0.0.0.0
STREAMLIT_SERVER_HEADLESS=true
```
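
How the app consumes these variables is not shown in this excerpt; a minimal sketch, assuming standard `os.environ` lookups with the documented values as defaults (the lookup code itself is an assumption, not taken from the repo):

```python
import os

# Read the Space's environment variables, falling back to the documented defaults.
embedding_model = os.environ.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
generative_model = os.environ.get("GENERATIVE_MODEL", "Qwen/Qwen2.5-1.5B-Instruct")

# CHUNK_SIZES is a comma-separated list, e.g. "100,400".
chunk_sizes = [int(s) for s in os.environ.get("CHUNK_SIZES", "100,400").split(",")]

print(embedding_model, chunk_sizes)
```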

### Hardware Options

- **CPU**: Sufficient for basic usage, slower inference
- **T4**: Good for faster inference, limited memory
- **A10G**: High performance, more memory
- **A100**: Maximum performance, highest cost

## 🐛 Troubleshooting

### Common Issues

1. **Build Fails**
   - Check that all required files are uploaded
   - Verify `requirements.txt` and `Dockerfile` are correct
   - Check the build logs for specific errors

2. **Model Loading Errors**
   - Ensure internet connectivity for model downloads
   - Check that model names are correct
   - Verify sufficient disk space

3. **Memory Issues**
   - Use smaller models
   - Reduce chunk sizes
   - Upgrade to higher-tier hardware

4. **Slow Performance**
   - Upgrade the hardware tier
   - Use smaller embedding models
   - Optimize chunk sizes

5. **Docker Build Issues**
   - Check that `.dockerignore` excludes unnecessary files
   - Verify the Dockerfile syntax
   - Check for missing dependencies

### Debug Mode

To enable debug logging, add this to your `app.py`:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## 📊 Monitoring

### Space Metrics

- **Build Status**: Check whether the Docker build succeeded
- **Runtime Logs**: Monitor application logs
- **Resource Usage**: Track CPU and memory usage
- **Error Logs**: Identify and fix issues

### Docker Logs

Check Docker logs in your Space:
- Go to "Settings" → "Logs"
- Monitor build and runtime logs
- Look for error messages

## 🔒 Security Considerations

1. **File Upload:**
   - Validate PDF files before processing
   - Implement file size limits
   - Check file types
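
A minimal sketch of such upload checks (the 20 MB limit and the magic-byte check are illustrative assumptions, not part of the deployed app):

```python
MAX_SIZE = 20 * 1024 * 1024  # illustrative 20 MB limit

def validate_pdf(filename: str, data: bytes) -> bool:
    """Basic upload checks: extension, size limit, and PDF magic bytes."""
    if not filename.lower().endswith(".pdf"):
        return False
    if len(data) > MAX_SIZE:
        return False
    # Every well-formed PDF starts with the "%PDF-" header.
    return data.startswith(b"%PDF-")

print(validate_pdf("report.pdf", b"%PDF-1.7 ..."))  # True
```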
205
+
206
+ 2. **Model Access:**
207
+ - Use appropriate model access tokens
208
+ - Consider private models for sensitive data
209
+
210
+ 3. **Data Privacy:**
211
+ - Be aware that uploaded documents are processed
212
+ - Consider data retention policies
213
+
214
+ 4. **Docker Security:**
215
+ - Use non-root user in Dockerfile
216
+ - Minimize attack surface
217
+ - Keep base images updated
218
+
219
+ ## πŸ“ˆ Scaling
220
+
221
+ ### For Production Use
222
+
223
+ 1. **Multiple Spaces:**
224
+ - Create separate Spaces for different use cases
225
+ - Use different hardware tiers as needed
226
+
227
+ 2. **Custom Domains:**
228
+ - Set up custom domains for your Spaces
229
+ - Use proper SSL certificates
230
+
231
+ 3. **Load Balancing:**
232
+ - Consider multiple Space instances
233
+ - Implement proper caching strategies
234
+
235
+ ## πŸŽ‰ Success Checklist
236
+
237
+ - [ ] Space created successfully with Docker SDK
238
+ - [ ] All files uploaded (including Dockerfile)
239
+ - [ ] Docker build completed without errors
240
+ - [ ] Application loads correctly
241
+ - [ ] PDF upload works
242
+ - [ ] Question answering works
243
+ - [ ] Search results display correctly
244
+ - [ ] Performance is acceptable
245
+
246
+ ## πŸ“ž Support
247
+
248
+ If you encounter issues:
249
+
250
+ 1. **Check the logs** in your Space's "Logs" tab
251
+ 2. **Review this guide** for common solutions
252
+ 3. **Search Hugging Face documentation**
253
+ 4. **Create an issue** in the project repository
254
+ 5. **Contact Hugging Face support** for Space-specific issues
255
+
256
+ ## πŸš€ Next Steps
257
+
258
+ After successful deployment:
259
+
260
+ 1. **Test thoroughly** with different document types
261
+ 2. **Optimize performance** based on usage patterns
262
+ 3. **Add custom features** as needed
263
+ 4. **Share your Space** with others
264
+ 5. **Monitor usage** and gather feedback
265
+
266
+ ## πŸ”„ Updates and Maintenance
267
+
268
+ ### Updating Your Space
269
+
270
+ 1. **Make changes locally**
271
+ 2. **Test with Docker locally**
272
+ 3. **Push changes to your Space repository**
273
+ 4. **Monitor the rebuild process**
274
+
275
+ ### Version Management
276
+
277
+ - Use specific versions in `requirements.txt`
278
+ - Tag your Docker images
279
+ - Keep track of model versions
280
+
281
+ ---
282
+
283
+ **Happy deploying with Docker! πŸ³πŸš€**
Dockerfile CHANGED
@@ -1,20 +1,45 @@
- FROM python:3.13.5-slim
+ # Use Python 3.10 slim image
+ FROM python:3.10-slim

+ # Set working directory
  WORKDIR /app

+ # Install system dependencies
  RUN apt-get update && apt-get install -y \
      build-essential \
      curl \
-     git \
      && rm -rf /var/lib/apt/lists/*

- COPY requirements.txt ./
- COPY src/ ./src/
-
- RUN pip3 install -r requirements.txt
+ # Copy requirements first for better caching
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt
+
+ # Copy application files
+ COPY . .
+
+ # Create vector store directory
+ RUN mkdir -p vector_store
+
+ # Copy all PDF documents for testing
+ COPY *.pdf /app/
+
+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV STREAMLIT_SERVER_PORT=8501
+ ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
+ ENV STREAMLIT_SERVER_HEADLESS=true
+ ENV STREAMLIT_SERVER_ENABLE_CORS=false
+ ENV STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION=false
+ ENV STREAMLIT_LOGGER_LEVEL=debug
+
+ # Expose port
  EXPOSE 8501

+ # Health check
  HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health

- ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
+ # Run the application
+ CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
README.md CHANGED
@@ -1,19 +1,245 @@
- ---
- title: ConvAI
- emoji: 🚀
- colorFrom: red
- colorTo: red
- sdk: docker
- app_port: 8501
- tags:
- - streamlit
- pinned: false
- short_description: Streamlit template space
- ---
-
- # Welcome to Streamlit!
-
- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-
- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
- forums](https://discuss.streamlit.io).
# RAG System for Hugging Face Spaces

A simplified Retrieval-Augmented Generation (RAG) system optimized for deployment on Hugging Face Spaces.

## 🚀 Features

- **FAISS Vector Search**: Fast similarity search using FAISS
- **BM25 Keyword Search**: Traditional keyword-based retrieval
- **Hybrid Search**: Combines both dense and sparse retrieval
- **Qwen 2.5 1.5B**: Advanced language model for answer generation
- **Streamlit UI**: Clean, interactive web interface
- **PDF Processing**: Extract and process PDF documents
- **Persistent Storage**: Saves embeddings and metadata locally

## 📁 Project Structure

```
huggingface_deploy/
├── app.py              # Main Streamlit application
├── rag_system.py       # Simplified RAG system
├── pdf_processor.py    # PDF processing utilities
├── requirements.txt    # Python dependencies
├── README.md           # This file
└── vector_store/       # FAISS index and metadata (created automatically)
```

## 🛠️ Technologies Used

- **Streamlit**: Web interface
- **FAISS**: Vector similarity search
- **BM25**: Keyword-based retrieval
- **Sentence Transformers**: Text embeddings
- **Transformers**: Qwen 2.5 1.5B model
- **PyPDF**: PDF text extraction
- **PyTorch**: Deep learning framework

## 🚀 Quick Start

### Local Development

1. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

2. **Run the application:**
   ```bash
   streamlit run app.py
   ```

3. **Open in browser:**
   Navigate to `http://localhost:8501`

### Hugging Face Spaces Deployment

1. **Create a new Space:**
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose "Docker" as the SDK
   - Set visibility (public or private)

2. **Upload files:**
   - Upload all files from this directory to your Space
   - The Space will automatically install dependencies and run the app

3. **Access your app:**
   - Your RAG system will be available at your Space URL

## 📖 How to Use

### 1. Upload Documents
- Use the sidebar to upload PDF documents
- The system will automatically process and index the content
- Multiple documents can be uploaded

### 2. Ask Questions
- Type your question in the chat interface
- Choose your preferred retrieval method:
  - **Hybrid**: Combines FAISS and BM25 (recommended)
  - **Dense**: Uses only FAISS vector similarity
  - **Sparse**: Uses only BM25 keyword matching
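
How the hybrid option fuses the two signals is internal to `rag_system.py`; a common approach, shown here as a sketch (the min-max normalization and the 50/50 weighting are assumptions, not the app's confirmed implementation), is to normalize each score list and take a weighted sum:

```python
def hybrid_scores(dense, sparse, alpha=0.5):
    """Fuse dense (FAISS) and sparse (BM25) scores for the same candidate chunks.

    Each list holds one score per candidate; alpha weights the dense side.
    """
    def minmax(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    d, s = minmax(dense), minmax(sparse)
    return [alpha * di + (1 - alpha) * si for di, si in zip(d, s)]

print(hybrid_scores([0.9, 0.2, 0.5], [1.0, 3.0, 2.0]))
```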

### 3. View Results
- See the generated answer
- View search results with confidence scores
- Check the response time and method used

## ⚙️ Configuration

### Environment Variables

You can customize the system by setting these environment variables:

```bash
# Model configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
GENERATIVE_MODEL=Qwen/Qwen2.5-1.5B-Instruct

# Chunk sizes for document processing
CHUNK_SIZES=100,400

# Vector store path
VECTOR_STORE_PATH=./vector_store
```

### Model Options

**Embedding Models:**
- `all-MiniLM-L6-v2` (default, 384 dimensions)
- `all-mpnet-base-v2` (768 dimensions)
- `multi-qa-MiniLM-L6-cos-v1` (384 dimensions)

**Generative Models:**
- `Qwen/Qwen2.5-1.5B-Instruct` (default)
- `distilgpt2` (fallback)
- `microsoft/DialoGPT-medium`

## 🔧 Customization

### Adding New Models

To use different models, modify the `SimpleRAGSystem` initialization in `app.py`:

```python
st.session_state.rag_system = SimpleRAGSystem(
    embedding_model="your-embedding-model",
    generative_model="your-generative-model",
)
```

### Custom Chunk Sizes

Modify the chunk sizes for different document types:

```python
chunk_sizes = [50, 200, 800]  # Smaller chunks for technical docs
```
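
Each configured size produces its own pass over the document. A minimal word-based chunker along these lines (the whitespace tokenization and absence of overlap are assumptions about `pdf_processor.py`, not its confirmed behavior):

```python
def chunk_text(text, chunk_sizes=(100, 400)):
    """Split text into word-count chunks, once per configured size."""
    words = text.split()
    chunks = []
    for size in chunk_sizes:
        for i in range(0, len(words), size):
            chunks.append(" ".join(words[i:i + size]))
    return chunks

# 250 words: three chunks of up to 100 words, plus one chunk of up to 400.
doc = "word " * 250
print(len(chunk_text(doc)))  # prints 4
```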

### Custom Search Methods

Add new search methods in `rag_system.py`:

```python
def custom_search(self, query: str, top_k: int = 5):
    # Your custom search implementation
    pass
```

## 📊 Performance Optimization

### Memory Usage
- Use smaller embedding models for limited memory
- Reduce chunk sizes for large documents
- Enable model quantization

### Speed Optimization
- Use GPU acceleration when available
- Optimize FAISS index parameters
- Cache embeddings for repeated queries

### Storage
- The FAISS index and metadata are saved locally
- Consider cloud storage for production deployments

## 🐛 Troubleshooting

### Common Issues

1. **Model Loading Errors**
   - Check the internet connection for model downloads
   - Verify model names are correct
   - Ensure sufficient disk space

2. **Memory Issues**
   - Reduce batch sizes
   - Use smaller models
   - Enable gradient checkpointing

3. **PDF Processing Errors**
   - Verify PDF files are not corrupted
   - Check file permissions
   - Ensure PyPDF is properly installed

### Debug Mode

Enable debug logging by adding this to `app.py`:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## 🔒 Security Considerations

- **File Upload**: Validate PDF files before processing
- **Model Access**: Use appropriate model access tokens
- **Data Privacy**: Consider data retention policies
- **Rate Limiting**: Implement query rate limiting for production

## 📈 Monitoring

### System Metrics
- Document count and chunk count
- Response times
- Search result quality
- Model performance

### Logs
- Application logs in Streamlit
- Model loading and inference logs
- Error tracking and debugging

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🆘 Support

For issues and questions:
1. Check the troubleshooting section
2. Review the logs for error messages
3. Create an issue on GitHub
4. Contact the maintainers

## 🎯 Roadmap

- [ ] Add support for more document formats
- [ ] Implement advanced search algorithms
- [ ] Add model fine-tuning capabilities
- [ ] Improve UI/UX design
- [ ] Add export/import functionality
- [ ] Implement user authentication
- [ ] Add analytics dashboard

---

**Happy RAG-ing! 🚀**
RIL-Q1-FY2024-25.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e29390caae95cc8f28606d9f08317cda424bf544fd86383c7f9ac7d25ca8e808
size 1253337
RIL-Q1-FY2025-26.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ce3f74a4a4012cdb85afaf7795aa2cc118f94af0f2b4d290f92248d042eb0976
size 719459
RIL-Q2-FY2023-24.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0e07142e623cd116f6c18a6e17e803b06bff53eeaa149c4151022579ef305cbd
size 1570743
RIL-Q2-FY2024-25.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f78f4ade6ab7640fb74560b76505754fe5751c3602d61925c764c177875d1097
size 1664783
RIL-Q3-FY2023-24.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2e4afa7e303df86a156c02fbdb07866238891a408cd79398c98b100693cafcc
size 1446439
RIL-Q3-FY2024-25.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e9d0afa42b8fb75efcf2d1c1aea5b104c77dd63fd69fa0fcc059af8b350e8567
size 1855556
RIL-Q4-FY2023-24.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:645d6658976d1f958703b951fd7c89b22738ed2c865f31077fa725ec27781115
size 1662456
RIL-Q4-FY2024-25.pdf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0ec375dcbc69b69a95cd13f37fe090d61071d6e6a66707f2c73b26b77c6bd0d0
size 1719021
app.py ADDED
@@ -0,0 +1,351 @@
#!/usr/bin/env python3
"""
RAG System for Hugging Face Spaces

A simplified RAG system using:
- FAISS for vector search
- BM25 for hybrid retrieval
- Streamlit for UI
- Qwen 2.5 1.5B for generation
"""

import streamlit as st
import os
import tempfile
from pathlib import Path
import time
from typing import List, Dict, Optional
import json
import glob
from concurrent.futures import ThreadPoolExecutor, as_completed
from loguru import logger

# Import our simplified components
from rag_system import SimpleRAGSystem
from pdf_processor import SimplePDFProcessor

# Page configuration
st.set_page_config(
    page_title="RAG System - Hugging Face",
    page_icon="🤖",
    layout="wide",
    initial_sidebar_state="expanded",
)

# Initialize session state
if "rag_system" not in st.session_state:
    st.session_state.rag_system = None
if "documents_loaded" not in st.session_state:
    st.session_state.documents_loaded = False
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []
if "initializing" not in st.session_state:
    st.session_state.initializing = False


def load_single_document(rag_system, pdf_path):
    """Load a single document into the RAG system"""
    try:
        filename = os.path.basename(pdf_path)
        success = rag_system.add_document(pdf_path, filename)
        return filename, success, None
    except Exception as e:
        return os.path.basename(pdf_path), False, str(e)


def initialize_rag_system():
    """Initialize the RAG system"""
    if st.session_state.rag_system is None and not st.session_state.initializing:
        st.session_state.initializing = True
        st.write("🚀 Starting RAG system initialization...")
        with st.spinner("Initializing RAG system..."):
            try:
                st.session_state.rag_system = SimpleRAGSystem()
                st.write("✅ RAG system created successfully")

                # Auto-load all available PDF documents in parallel
                pdf_files = glob.glob("/app/*.pdf")
                st.write(f"📁 Found {len(pdf_files)} PDF files")

                if pdf_files:
                    loaded_count = 0
                    failed_count = 0

                    with st.spinner(
                        f"Loading {len(pdf_files)} PDF documents in parallel..."
                    ):
                        # Use ThreadPoolExecutor for parallel loading
                        with ThreadPoolExecutor(max_workers=4) as executor:
                            # Submit all tasks
                            future_to_pdf = {
                                executor.submit(
                                    load_single_document,
                                    st.session_state.rag_system,
                                    pdf_path,
                                ): pdf_path
                                for pdf_path in pdf_files
                            }

                            # Process completed tasks
                            for future in as_completed(future_to_pdf):
                                filename, success, error = future.result()
                                if success:
                                    loaded_count += 1
                                    st.write(f"✅ Loaded: {filename}")
                                    logger.info(f"✅ Loaded: {filename}")
                                else:
                                    failed_count += 1
                                    st.write(f"⚠️ Failed: {filename} - {error}")
                                    logger.warning(
                                        f"⚠️ Failed to load {filename}: {error}"
                                    )

                    if loaded_count > 0:
                        st.session_state.documents_loaded = True
                        st.success(
                            f"✅ Successfully loaded {loaded_count} PDF documents!"
                        )
                        if failed_count > 0:
                            st.warning(f"⚠️ Failed to load {failed_count} documents")
                    else:
                        st.warning("⚠️ No documents could be loaded")
                        # Still allow querying even if no documents loaded
                        st.session_state.documents_loaded = True
                else:
                    st.info("📚 No PDF documents found in the container")
                    # Still allow querying even if no documents found
                    st.session_state.documents_loaded = True

                st.success("✅ RAG system initialized!")

            except Exception as e:
                st.error(f"❌ Failed to initialize RAG system: {e}")
                logger.error(f"RAG system initialization failed: {e}")
                # Reset initialization flag on error
                st.session_state.initializing = False
                raise
            finally:
                # Always reset initialization flag
                st.session_state.initializing = False


def upload_document(uploaded_file):
    """Upload and process a document"""
    if uploaded_file is not None:
        try:
            # Create temporary file
            with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_file:
                tmp_file.write(uploaded_file.getvalue())
                tmp_path = tmp_file.name

            # Process the document
            with st.spinner(f"Processing {uploaded_file.name}..."):
                success = st.session_state.rag_system.add_document(
                    tmp_path, uploaded_file.name
                )

            if success:
                st.success(f"✅ {uploaded_file.name} processed successfully!")
                st.session_state.documents_loaded = True
                # Clean up temporary file
                os.unlink(tmp_path)
            else:
                st.error(f"❌ Failed to process {uploaded_file.name}")
                os.unlink(tmp_path)

        except Exception as e:
            st.error(f"❌ Error processing document: {str(e)}")


def query_rag(query: str, method: str = "hybrid", top_k: int = 5):
    """Query the RAG system"""
    try:
        st.write(f"🔍 Starting query: {query}")
        st.write(f"🔍 Method: {method}, top_k: {top_k}")

        if st.session_state.rag_system is None:
            st.error("❌ RAG system is not initialized")
            return None, "RAG system not initialized"

        st.write("✅ RAG system is available")
        start_time = time.time()

        st.write("🔍 Calling rag_system.query...")
        response = st.session_state.rag_system.query(query, method, top_k)
        response_time = time.time() - start_time

        st.write(f"✅ Response received in {response_time:.2f}s")
        st.write(f"✅ Response type: {type(response)}")

        if response:
            st.write(f"✅ Response answer: {response.answer[:100]}...")

        return response, response_time

    except Exception as e:
        st.error(f"❌ Error during query: {str(e)}")
        logger.error(f"Query error: {e}")
        import traceback

        st.error(f"❌ Full error: {traceback.format_exc()}")
        return None, f"Error: {str(e)}"


def display_search_results(results: List[Dict]):
    """Display search results"""
    if not results:
        st.info("No search results found.")
        return

    for i, result in enumerate(results, 1):
        st.markdown("---")
        st.markdown(f"**Result {i}** - Score: {result.score:.3f}")
        st.write(f"**Source:** {result.filename}")
        st.write(f"**Method:** {result.search_method}")
        st.write(f"**Text:** {result.text[:500]}...")

        if result.dense_score and result.sparse_score:
            col1, col2 = st.columns(2)
            with col1:
                st.metric("Dense Score", f"{result.dense_score:.3f}")
            with col2:
                st.metric("Sparse Score", f"{result.sparse_score:.3f}")


def main():
    """Main application"""
    st.write("🚀 App starting...")
    st.title("🤖 RAG System - Hugging Face Spaces")
    st.markdown("A simplified RAG system using FAISS + BM25 + Qwen 2.5 1.5B")

    # Initialize RAG system
    initialize_rag_system()

    # Sidebar
    with st.sidebar:
        st.header("📁 Document Upload")

        uploaded_file = st.file_uploader(
            "Upload PDF Document",
            type=["pdf"],
            help="Upload a PDF document to add to the knowledge base",
        )

        if uploaded_file:
            upload_document(uploaded_file)

        st.divider()

        st.header("⚙️ Settings")

        method = st.selectbox(
            "Retrieval Method",
            ["hybrid", "dense", "sparse"],
            help="Choose the retrieval method",
        )

        top_k = st.slider(
            "Number of Results",
            min_value=1,
            max_value=10,
            value=5,
            help="Number of top results to retrieve",
        )

        st.divider()

        # System info
        if st.session_state.rag_system:
            stats = st.session_state.rag_system.get_stats()
            st.header("📊 System Info")
            st.write(f"**Documents:** {stats['total_documents']}")
            st.write(f"**Chunks:** {stats['total_chunks']}")
            st.write(f"**Vector Size:** {stats['vector_size']}")
            st.write(f"**Model:** {stats['model_name']}")

    # Initialize RAG system if not already done
    if not st.session_state.rag_system:
        if st.session_state.initializing:
            st.info("🔄 RAG system is initializing... Please wait.")
            return
        else:
            initialize_rag_system()
            return

    # Show system info and allow querying immediately after initialization
    stats = st.session_state.rag_system.get_stats()
    documents_available = stats["total_documents"] > 0

    if not documents_available:
        st.info(
            "📚 No documents loaded yet, but you can still ask questions. The system will respond based on its general knowledge."
        )

    # Chat interface
    st.header("💬 Ask Questions About Your Documents")

    # Chat input
    query = st.chat_input("Ask a question about the loaded documents...")

    if query:
        st.write(f"📝 Processing query: {query}")
        # Add user message to chat history
        st.session_state.chat_history.append({"role": "user", "content": query})

        # Get response
        response, response_time = query_rag(query, method, top_k)

        st.write(f"📊 Response type: {type(response)}")
        st.write(f"📊 Response time: {response_time}")

        if response:
            st.write("✅ Got valid response, adding to chat history")
            # Add assistant response to chat history
            st.session_state.chat_history.append(
                {
                    "role": "assistant",
                    "content": response.answer,
                    "search_results": response.search_results,
                    "method_used": response.method_used,
                    "confidence": response.confidence,
                    "response_time": response_time,
                }
            )
        else:
            st.write("❌ No valid response received")
            st.session_state.chat_history.append(
                {"role": "assistant", "content": f"Error: {response_time}"}
            )

    # Display chat history
    for message in st.session_state.chat_history:
        if message["role"] == "user":
            with st.chat_message("user"):
                st.write(message["content"])
        else:
            with st.chat_message("assistant"):
                st.write(message["content"])

                # Show additional info for assistant messages
                if "search_results" in message:
                    st.markdown("**🔍 Search Results:**")
                    display_search_results(message["search_results"])

                    # Show metrics
                    col1, col2, col3 = st.columns(3)
                    with col1:
                        st.metric("Method", message["method_used"])
                    with col2:
                        st.metric("Confidence", f"{message['confidence']:.3f}")
                    with col3:
                        st.metric("Response Time", f"{message['response_time']:.2f}s")

    # Clear chat button
    if st.session_state.chat_history:
        if st.button("🗑️ Clear Chat History"):
            st.session_state.chat_history = []
            st.rerun()


if __name__ == "__main__":
    main()
docker-compose.yml ADDED
@@ -0,0 +1,20 @@
+version: '3.8'
+
+services:
+  rag-system:
+    build: .
+    ports:
+      - "8501:8501"
+    environment:
+      - PYTHONPATH=/app
+      - STREAMLIT_SERVER_PORT=8501
+      - STREAMLIT_SERVER_ADDRESS=0.0.0.0
+      - STREAMLIT_SERVER_HEADLESS=true
+    volumes:
+      - ./vector_store:/app/vector_store
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8501/_stcore/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
pdf_processor.py ADDED
@@ -0,0 +1,268 @@
+#!/usr/bin/env python3
+"""
+Simplified PDF Processor for Hugging Face Spaces
+
+This module provides PDF processing functionality for the simplified RAG system.
+"""
+
+import os
+import re
+import uuid
+from typing import List, Dict, Optional
+from dataclasses import dataclass
+from pathlib import Path
+import pypdf
+from loguru import logger
+
+
+@dataclass
+class DocumentChunk:
+    """Represents a document chunk"""
+
+    text: str
+    doc_id: str
+    filename: str
+    chunk_id: str
+    chunk_size: int
+
+
+@dataclass
+class ProcessedDocument:
+    """Represents a processed document"""
+
+    filename: str
+    title: str
+    author: str
+    chunks: List[DocumentChunk]
+
+
+class SimplePDFProcessor:
+    """Simplified PDF processor for Hugging Face Spaces"""
+
+    def __init__(self):
+        """Initialize the PDF processor"""
+        self.stop_words = {
+            "the", "a", "an", "and", "or", "but", "in", "on", "at", "to",
+            "for", "of", "with", "by", "is", "are", "was", "were", "be",
+            "been", "being", "have", "has", "had", "do", "does", "did",
+            "will", "would", "could", "should", "may", "might", "can",
+            "this", "that", "these", "those",
+        }
+
+    def process_document(
+        self, file_path: str, chunk_sizes: Optional[List[int]] = None
+    ) -> ProcessedDocument:
+        """
+        Process a PDF document
+
+        Args:
+            file_path: Path to the PDF file
+            chunk_sizes: List of chunk sizes to use
+
+        Returns:
+            Processed document
+        """
+        if chunk_sizes is None:
+            chunk_sizes = [100, 400]
+
+        try:
+            # Extract text from PDF
+            text = self._extract_text(file_path)
+
+            # Clean text
+            cleaned_text = self._clean_text(text)
+
+            # Extract metadata
+            metadata = self._extract_metadata(file_path)
+
+            # Create chunks at every configured granularity
+            chunks = []
+            doc_id = str(uuid.uuid4())
+
+            for chunk_size in chunk_sizes:
+                chunk_list = self._create_chunks(
+                    cleaned_text, chunk_size, doc_id, metadata["filename"]
+                )
+                chunks.extend(chunk_list)
+
+            return ProcessedDocument(
+                filename=metadata["filename"],
+                title=metadata["title"],
+                author=metadata["author"],
+                chunks=chunks,
+            )
+
+        except Exception as e:
+            logger.error(f"Error processing document {file_path}: {e}")
+            raise
+
+    def _extract_text(self, file_path: str) -> str:
+        """Extract text from PDF file"""
+        try:
+            with open(file_path, "rb") as file:
+                pdf_reader = pypdf.PdfReader(file)
+                text = ""
+
+                for page in pdf_reader.pages:
+                    page_text = page.extract_text()
+                    if page_text:
+                        text += page_text + "\n"
+
+                return text
+
+        except Exception as e:
+            logger.error(f"Error extracting text from {file_path}: {e}")
+            raise
+
+    def _clean_text(self, text: str) -> str:
+        """Clean and preprocess text"""
+        # Collapse runs of whitespace
+        text = re.sub(r"\s+", " ", text)
+
+        # Remove special characters but keep punctuation
+        text = re.sub(r"[^\w\s\.\,\!\?\;\:\-\(\)\[\]\{\}]", "", text)
+
+        # Remove standalone page numbers at line ends (headers/footers)
+        text = re.sub(r"\b\d+\b(?=\s*\n)", "", text)
+
+        # Remove excessive newlines
+        text = re.sub(r"\n\s*\n\s*\n+", "\n\n", text)
+
+        return text.strip()
+
+    def _extract_metadata(self, file_path: str) -> Dict[str, str]:
+        """Extract metadata from PDF file"""
+        try:
+            with open(file_path, "rb") as file:
+                pdf_reader = pypdf.PdfReader(file)
+                info = pdf_reader.metadata
+
+                return {
+                    "filename": Path(file_path).name,
+                    "title": (
+                        info.get("/Title", Path(file_path).stem)
+                        if info
+                        else Path(file_path).stem
+                    ),
+                    "author": info.get("/Author", "Unknown") if info else "Unknown",
+                }
+
+        except Exception as e:
+            logger.warning(f"Error extracting metadata from {file_path}: {e}")
+            return {
+                "filename": Path(file_path).name,
+                "title": Path(file_path).stem,
+                "author": "Unknown",
+            }
+
+    def _create_chunks(
+        self, text: str, chunk_size: int, doc_id: str, filename: str
+    ) -> List[DocumentChunk]:
+        """Create text chunks of the specified size"""
+        chunks = []
+
+        # Split text into sentences
+        sentences = self._split_into_sentences(text)
+
+        current_chunk = ""
+        chunk_id = 0
+
+        for sentence in sentences:
+            # Estimate token count (rough word-based approximation)
+            estimated_tokens = len(sentence.split())
+
+            if len(current_chunk.split()) + estimated_tokens <= chunk_size:
+                current_chunk += sentence + " "
+            else:
+                # Save current chunk if not empty
+                if current_chunk.strip():
+                    chunks.append(
+                        DocumentChunk(
+                            text=current_chunk.strip(),
+                            doc_id=doc_id,
+                            filename=filename,
+                            chunk_id=f"{doc_id}_{chunk_id}",
+                            chunk_size=chunk_size,
+                        )
+                    )
+                    chunk_id += 1
+
+                # Start new chunk
+                current_chunk = sentence + " "
+
+        # Add the last chunk if not empty
+        if current_chunk.strip():
+            chunks.append(
+                DocumentChunk(
+                    text=current_chunk.strip(),
+                    doc_id=doc_id,
+                    filename=filename,
+                    chunk_id=f"{doc_id}_{chunk_id}",
+                    chunk_size=chunk_size,
+                )
+            )
+
+        return chunks
+
+    def _split_into_sentences(self, text: str) -> List[str]:
+        """Split text into sentences"""
+        # Simple sentence splitting on terminal punctuation
+        sentences = re.split(r"[.!?]+", text)
+
+        # Clean and filter sentences
+        cleaned_sentences = []
+        for sentence in sentences:
+            sentence = sentence.strip()
+            if sentence and len(sentence.split()) > 3:  # Keep sentences longer than 3 words
+                cleaned_sentences.append(sentence)
+
+        return cleaned_sentences
+
+    def preprocess_query(self, query: str) -> str:
+        """Preprocess query text"""
+        # Convert to lowercase
+        query = query.lower()
+
+        # Remove punctuation
+        query = re.sub(r"[^\w\s]", "", query)
+
+        # Remove stop words
+        words = query.split()
+        filtered_words = [word for word in words if word not in self.stop_words]
+
+        return " ".join(filtered_words)
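The processor above packs whole sentences into word-budgeted chunks. As a quick sanity check, that logic can be exercised standalone; the sketch below re-implements the split-and-pack heuristic without the `pypdf` dependency, so `chunk_sentences` and the sample text are illustrative, not part of the module's API:

```python
import re

def chunk_sentences(text: str, chunk_size: int) -> list[str]:
    # Split on sentence-ending punctuation and drop very short fragments,
    # mirroring SimplePDFProcessor._split_into_sentences
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if len(s.split()) > 3]
    chunks, current = [], ""
    for sentence in sentences:
        # Word count stands in for a token count, as in _create_chunks
        if len(current.split()) + len(sentence.split()) <= chunk_size:
            current += sentence + " "
        else:
            if current.strip():
                chunks.append(current.strip())
            current = sentence + " "
    if current.strip():
        chunks.append(current.strip())
    return chunks

text = (
    "Revenue grew strongly across all segments this quarter. "
    "Operating margin improved on lower input costs overall. "
    "The board declared an interim dividend for shareholders today."
)
print(chunk_sentences(text, 16))
```

With a 16-word budget, the first two 8-word sentences share a chunk and the third starts a new one, which is the behavior the processor relies on when building both the 100- and 400-token granularities.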
rag_system.py ADDED
@@ -0,0 +1,547 @@
+#!/usr/bin/env python3
+"""
+Simplified RAG System for Hugging Face Spaces
+
+This module provides a simplified RAG system using:
+- FAISS for vector storage
+- BM25 for sparse retrieval
+- Hybrid search combining both
+- Qwen 2.5 1.5B for generation
+"""
+
+import os
+import pickle
+import time
+import threading
+from typing import List, Dict, Optional
+from dataclasses import dataclass
+
+import numpy as np
+import torch
+from loguru import logger
+from sentence_transformers import SentenceTransformer
+from rank_bm25 import BM25Okapi
+import faiss
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+
+@dataclass
+class DocumentChunk:
+    """Represents a document chunk"""
+
+    text: str
+    doc_id: str
+    filename: str
+    chunk_id: str
+    chunk_size: int
+
+
+@dataclass
+class SearchResult:
+    """Represents a search result"""
+
+    text: str
+    score: float
+    doc_id: str
+    filename: str
+    search_method: str
+    dense_score: Optional[float] = None
+    sparse_score: Optional[float] = None
+
+
+@dataclass
+class RAGResponse:
+    """Represents a RAG response"""
+
+    answer: str
+    confidence: float
+    search_results: List[SearchResult]
+    method_used: str
+    response_time: float
+    query: str
+
+
+class SimpleRAGSystem:
+    """Simplified RAG system for Hugging Face Spaces"""
+
+    def __init__(
+        self,
+        embedding_model: str = "all-MiniLM-L6-v2",
+        generative_model: str = "Qwen/Qwen2.5-1.5B-Instruct",
+        chunk_sizes: Optional[List[int]] = None,
+        vector_store_path: str = "./vector_store",
+    ):
+        """
+        Initialize the RAG system
+
+        Args:
+            embedding_model: Sentence transformer model for embeddings
+            generative_model: Language model for generation
+            chunk_sizes: List of chunk sizes to use
+            vector_store_path: Path to store FAISS index and metadata
+        """
+        self.embedding_model = embedding_model
+        self.generative_model = generative_model
+        self.chunk_sizes = chunk_sizes or [100, 400]
+        self.vector_store_path = vector_store_path
+
+        # Initialize components
+        self.embedder = None
+        self.tokenizer = None
+        self.model = None
+        self.faiss_index = None
+        self.bm25 = None
+        self.documents = []
+        self.chunks = []
+        self._lock = threading.Lock()  # Thread safety for concurrent loading
+
+        # Create vector store directory
+        os.makedirs(vector_store_path, exist_ok=True)
+
+        # Load or initialize components
+        self._load_models()
+        self._load_or_create_index()
+
+        logger.info("Simple RAG system initialized successfully!")
+
+    def _load_models(self):
+        """Load embedding and generative models"""
+        try:
+            # Load embedding model
+            self.embedder = SentenceTransformer(self.embedding_model)
+            self.vector_size = self.embedder.get_sentence_embedding_dimension()
+
+            # Load generative model with fallback
+            model_loaded = False
+
+            # Try Qwen model first
+            try:
+                self.tokenizer = AutoTokenizer.from_pretrained(
+                    self.generative_model,
+                    trust_remote_code=True,
+                    padding_side="left",
+                )
+
+                # Load model with explicit CPU configuration
+                self.model = AutoModelForCausalLM.from_pretrained(
+                    self.generative_model,
+                    trust_remote_code=True,
+                    torch_dtype=torch.float32,
+                    device_map=None,
+                    low_cpu_mem_usage=False,
+                )
+
+                # Move to CPU explicitly
+                self.model = self.model.to("cpu")
+                model_loaded = True
+
+            except Exception as e:
+                logger.warning(f"Failed to load Qwen model: {e}")
+
+            # Fall back to distilgpt2 if Qwen fails
+            if not model_loaded:
+                logger.info("Falling back to distilgpt2...")
+                self.generative_model = "distilgpt2"
+                try:
+                    self.tokenizer = AutoTokenizer.from_pretrained(
+                        self.generative_model,
+                        trust_remote_code=True,
+                        padding_side="left",
+                    )
+                    self.model = AutoModelForCausalLM.from_pretrained(
+                        self.generative_model,
+                        trust_remote_code=True,
+                    )
+                    # Ensure fallback model is also on CPU
+                    self.model = self.model.to("cpu")
+                    model_loaded = True
+                except Exception as e:
+                    logger.error(f"Failed to load distilgpt2: {e}")
+                    raise Exception("Could not load any generative model")
+
+            # Set pad token for tokenizer
+            if self.tokenizer.pad_token is None:
+                self.tokenizer.pad_token = self.tokenizer.eos_token
+                self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
+
+            logger.info("βœ… Models loaded successfully")
+            logger.info(f" - Embedding: {self.embedding_model}")
+            logger.info(f" - Generative: {self.generative_model}")
+
+        except Exception as e:
+            logger.error(f"❌ Failed to load models: {e}")
+            raise
+
+    def _load_or_create_index(self):
+        """Load existing FAISS index or create a new one"""
+        faiss_path = os.path.join(self.vector_store_path, "faiss_index.bin")
+        metadata_path = os.path.join(self.vector_store_path, "metadata.pkl")
+
+        if os.path.exists(faiss_path) and os.path.exists(metadata_path):
+            # Load existing index
+            try:
+                self.faiss_index = faiss.read_index(faiss_path)
+                with open(metadata_path, "rb") as f:
+                    metadata = pickle.load(f)
+                    self.documents = metadata.get("documents", [])
+                    self.chunks = metadata.get("chunks", [])
+
+                # Rebuild BM25
+                if self.chunks:
+                    texts = [chunk.text for chunk in self.chunks]
+                    tokenized_texts = [text.lower().split() for text in texts]
+                    self.bm25 = BM25Okapi(tokenized_texts)
+
+                logger.info(f"βœ… Loaded existing index with {len(self.chunks)} chunks")
+            except Exception as e:
+                logger.warning(f"Failed to load existing index: {e}")
+                self._create_new_index()
+        else:
+            self._create_new_index()
+
+    def _create_new_index(self):
+        """Create a new FAISS index"""
+        vector_size = self.embedder.get_sentence_embedding_dimension()
+        # Inner-product index (equivalent to cosine similarity for normalized vectors)
+        self.faiss_index = faiss.IndexFlatIP(vector_size)
+        self.bm25 = None
+        logger.info(f"βœ… Created new FAISS index with dimension {vector_size}")
+
+    def _save_index(self):
+        """Save FAISS index and metadata"""
+        try:
+            # Save FAISS index
+            faiss_path = os.path.join(self.vector_store_path, "faiss_index.bin")
+            faiss.write_index(self.faiss_index, faiss_path)
+
+            # Save metadata
+            metadata_path = os.path.join(self.vector_store_path, "metadata.pkl")
+            metadata = {"documents": self.documents, "chunks": self.chunks}
+            with open(metadata_path, "wb") as f:
+                pickle.dump(metadata, f)
+
+            logger.info("βœ… Index saved successfully")
+        except Exception as e:
+            logger.error(f"❌ Failed to save index: {e}")
+
+    def add_document(self, file_path: str, filename: str) -> bool:
+        """
+        Add a document to the RAG system
+
+        Args:
+            file_path: Path to the PDF file
+            filename: Name of the file
+
+        Returns:
+            True if successful, False otherwise
+        """
+        try:
+            from pdf_processor import SimplePDFProcessor
+
+            # Process the document
+            processor = SimplePDFProcessor()
+            processed_doc = processor.process_document(file_path, self.chunk_sizes)
+
+            # Thread-safe document addition
+            with self._lock:
+                # Add document to list
+                self.documents.append(
+                    {
+                        "filename": filename,
+                        "title": processed_doc.title,
+                        "author": processed_doc.author,
+                        "file_path": file_path,
+                    }
+                )
+
+                # Add chunks
+                for chunk in processed_doc.chunks:
+                    self.chunks.append(chunk)
+
+                # Update embeddings and BM25
+                self._update_embeddings()
+                self._update_bm25()
+
+                # Save index
+                self._save_index()
+
+            logger.info(
+                f"βœ… Added document: {filename} ({len(processed_doc.chunks)} chunks)"
+            )
+            return True
+
+        except Exception as e:
+            logger.error(f"❌ Failed to add document {filename}: {e}")
+            return False
+
+    def _update_embeddings(self):
+        """Rebuild the FAISS index from all current chunks"""
+        if not self.chunks:
+            return
+
+        # Re-encode every chunk and rebuild the index from scratch so that
+        # repeated add_document() calls do not insert duplicate vectors
+        texts = [chunk.text for chunk in self.chunks]
+        embeddings = self.embedder.encode(texts, show_progress_bar=False)
+        self.faiss_index.reset()
+        self.faiss_index.add(embeddings.astype("float32"))
+
+    def _update_bm25(self):
+        """Update BM25 index with new chunks"""
+        if not self.chunks:
+            return
+
+        # Rebuild BM25 with all chunks
+        texts = [chunk.text for chunk in self.chunks]
+        tokenized_texts = [text.lower().split() for text in texts]
+        self.bm25 = BM25Okapi(tokenized_texts)
+
+    def search(
+        self, query: str, method: str = "hybrid", top_k: int = 5
+    ) -> List[SearchResult]:
+        """
+        Search for relevant documents
+
+        Args:
+            query: Search query
+            method: Search method (hybrid, dense, sparse)
+            top_k: Number of results to return
+
+        Returns:
+            List of search results
+        """
+        if not self.chunks:
+            return []
+
+        results = []
+
+        if method in ("dense", "hybrid"):
+            # Dense search using FAISS
+            query_embedding = self.embedder.encode([query])
+            scores, indices = self.faiss_index.search(
+                query_embedding.astype("float32"), min(top_k, len(self.chunks))
+            )
+
+            for score, idx in zip(scores[0], indices[0]):
+                if idx < len(self.chunks):
+                    chunk = self.chunks[idx]
+                    results.append(
+                        SearchResult(
+                            text=chunk.text,
+                            score=float(score),
+                            doc_id=chunk.doc_id,
+                            filename=chunk.filename,
+                            search_method="dense",
+                            dense_score=float(score),
+                        )
+                    )
+
+        if method in ("sparse", "hybrid"):
+            # Sparse search using BM25
+            if self.bm25:
+                tokenized_query = query.lower().split()
+                bm25_scores = self.bm25.get_scores(tokenized_query)
+
+                # Get top BM25 results
+                top_indices = np.argsort(bm25_scores)[::-1][:top_k]
+
+                for idx in top_indices:
+                    if idx < len(self.chunks):
+                        chunk = self.chunks[idx]
+                        score = float(bm25_scores[idx])
+
+                        # Check if the chunk was already returned by dense search
+                        existing_result = next(
+                            (
+                                r
+                                for r in results
+                                if r.doc_id == chunk.doc_id and r.text == chunk.text
+                            ),
+                            None,
+                        )
+
+                        if existing_result:
+                            # Update existing result with sparse score
+                            existing_result.sparse_score = score
+                            if method == "hybrid":
+                                # Combine scores for hybrid
+                                existing_result.score = (
+                                    existing_result.dense_score + score
+                                ) / 2
+                        else:
+                            results.append(
+                                SearchResult(
+                                    text=chunk.text,
+                                    score=score,
+                                    doc_id=chunk.doc_id,
+                                    filename=chunk.filename,
+                                    search_method="sparse",
+                                    sparse_score=score,
+                                )
+                            )
+
+        # Sort by score and return top_k
+        results.sort(key=lambda x: x.score, reverse=True)
+        return results[:top_k]
+
+    def generate_response(self, query: str, context: str) -> str:
+        """
+        Generate a response using the language model
+
+        Args:
+            query: User query
+            context: Retrieved context
+
+        Returns:
+            Generated response
+        """
+        try:
+            # Prepare prompt
+            if hasattr(self.tokenizer, "apply_chat_template"):
+                # Use chat template for Qwen
+                messages = [
+                    {
+                        "role": "system",
+                        "content": "You are a helpful AI assistant. Use the provided context to answer the user's question accurately and concisely. If the context doesn't contain enough information to answer the question, say so.",
+                    },
+                    {
+                        "role": "user",
+                        "content": f"Context: {context}\n\nQuestion: {query}",
+                    },
+                ]
+                prompt = self.tokenizer.apply_chat_template(
+                    messages, tokenize=False, add_generation_prompt=True
+                )
+            else:
+                # Fallback for non-chat models
+                prompt = f"Context: {context}\n\nQuestion: {query}\n\nAnswer:"
+
+            # Tokenize
+            tokenized = self.tokenizer(
+                prompt,
+                return_tensors="pt",
+                truncation=True,
+                max_length=1024,
+                padding=True,
+                return_attention_mask=True,
+            )
+
+            # Generate response
+            with torch.no_grad():
+                try:
+                    outputs = self.model.generate(
+                        tokenized.input_ids,
+                        attention_mask=tokenized.attention_mask,
+                        max_new_tokens=512,
+                        num_return_sequences=1,
+                        temperature=0.7,
+                        do_sample=True,
+                        pad_token_id=self.tokenizer.pad_token_id,
+                        eos_token_id=self.tokenizer.eos_token_id,
+                    )
+                except RuntimeError as e:
+                    if "Half" in str(e):
+                        logger.warning(
+                            "Half precision not supported on CPU, converting to float32"
+                        )
+                        # Convert model to float32 and retry
+                        self.model = self.model.float()
+                        outputs = self.model.generate(
+                            tokenized.input_ids,
+                            attention_mask=tokenized.attention_mask,
+                            max_new_tokens=512,
+                            num_return_sequences=1,
+                            temperature=0.7,
+                            do_sample=True,
+                            pad_token_id=self.tokenizer.pad_token_id,
+                            eos_token_id=self.tokenizer.eos_token_id,
+                        )
+                    else:
+                        raise
+
+            # Decode response
+            response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+            # Extract only the generated part
+            if hasattr(self.tokenizer, "apply_chat_template"):
+                if "<|im_start|>assistant" in response:
+                    response = response.split("<|im_start|>assistant")[-1]
+                if "<|im_end|>" in response:
+                    response = response.split("<|im_end|>")[0]
+            else:
+                response = response[len(prompt):]
+
+            return response.strip()
+
+        except Exception as e:
+            logger.error(f"Error generating response: {e}")
+            return f"Error generating response: {str(e)}"
+
+    def query(self, query: str, method: str = "hybrid", top_k: int = 5) -> RAGResponse:
+        """
+        Query the RAG system
+
+        Args:
+            query: User query
+            method: Search method
+            top_k: Number of results
+
+        Returns:
+            RAG response
+        """
+        start_time = time.time()
+
+        # Search for relevant documents
+        search_results = self.search(query, method, top_k)
+
+        if not search_results:
+            return RAGResponse(
+                answer="I couldn't find any relevant information to answer your question.",
+                confidence=0.0,
+                search_results=[],
+                method_used=method,
+                response_time=time.time() - start_time,
+                query=query,
+            )
+
+        # Combine context from search results
+        context = "\n\n".join([result.text for result in search_results])
+
+        # Generate response
+        answer = self.generate_response(query, context)
+
+        # Calculate confidence (simple heuristic: mean retrieval score)
+        confidence = float(np.mean([result.score for result in search_results]))
+
+        return RAGResponse(
+            answer=answer,
+            confidence=confidence,
+            search_results=search_results,
+            method_used=method,
+            response_time=time.time() - start_time,
+            query=query,
+        )
+
+    def get_stats(self) -> Dict:
+        """Get system statistics"""
+        return {
+            "total_documents": len(self.documents),
+            "total_chunks": len(self.chunks),
+            "vector_size": (
+                self.embedder.get_sentence_embedding_dimension() if self.embedder else 0
+            ),
+            "model_name": self.generative_model,
+            "embedding_model": self.embedding_model,
+            "chunk_sizes": self.chunk_sizes,
+        }
+
+    def clear(self):
+        """Clear all documents and reset the system"""
+        self.documents = []
+        self.chunks = []
+        self._create_new_index()
+        self._save_index()
+        logger.info("βœ… System cleared successfully")
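In hybrid mode, `search()` averages a chunk's dense (FAISS inner-product) and sparse (BM25) scores when both retrievers return it, and otherwise keeps the single method's score. That merge rule, in isolation, looks like the sketch below; `merge_results` and the sample scores are illustrative, not part of the class API:

```python
def merge_results(dense: dict, sparse: dict) -> dict:
    """Combine per-chunk scores the way the hybrid branch of search() does:
    average when a chunk appears in both result lists, otherwise keep the
    score from whichever retriever found it."""
    merged = {}
    for chunk_id in dense.keys() | sparse.keys():
        if chunk_id in dense and chunk_id in sparse:
            merged[chunk_id] = (dense[chunk_id] + sparse[chunk_id]) / 2
        else:
            merged[chunk_id] = dense.get(chunk_id, sparse.get(chunk_id, 0.0))
    return merged

dense_scores = {"c1": 0.9, "c2": 0.4}   # cosine-style similarities
sparse_scores = {"c1": 7.1, "c3": 5.0}  # raw BM25 scores
print(merge_results(dense_scores, sparse_scores))
```

Note one consequence visible in the toy numbers: BM25 scores are unbounded while inner-product similarities are roughly in [-1, 1], so averaging raw values tends to let the sparse score dominate the final ranking; normalizing both score distributions before combining would be a natural refinement.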
requirements.txt CHANGED
@@ -1,3 +1,15 @@
-altair
-pandas
-streamlit
+# Core dependencies for Docker deployment
+streamlit==1.28.1
+torch==2.1.0
+transformers>=4.36.0
+sentence-transformers==2.2.2
+faiss-cpu==1.7.4
+scikit-learn==1.3.2
+rank-bm25==0.2.2
+pypdf==3.17.1
+pandas==2.1.3
+numpy==1.24.3
+loguru==0.7.2
+tqdm==4.66.1
+accelerate==0.24.1
+huggingface-hub==0.19.4
test_deployment.py ADDED
@@ -0,0 +1,293 @@
+#!/usr/bin/env python3
+"""
+Test script for Hugging Face deployment
+
+This script tests if all components are working correctly for deployment.
+"""
+
+import os
+import sys
+import tempfile
+from pathlib import Path
+
+
+def test_imports():
+    """Test if all required packages can be imported"""
+    print("πŸ” Testing imports...")
+
+    try:
+        import streamlit
+        print(f"βœ… Streamlit: {streamlit.__version__}")
+    except ImportError as e:
+        print(f"❌ Streamlit import failed: {e}")
+        return False
+
+    try:
+        import torch
+        print(f"βœ… PyTorch: {torch.__version__}")
+    except ImportError as e:
+        print(f"❌ PyTorch import failed: {e}")
+        return False
+
+    try:
+        import transformers
+        print(f"βœ… Transformers: {transformers.__version__}")
+    except ImportError as e:
+        print(f"❌ Transformers import failed: {e}")
+        return False
+
+    try:
+        import sentence_transformers
+        print(f"βœ… Sentence Transformers: {sentence_transformers.__version__}")
+    except ImportError as e:
+        print(f"❌ Sentence Transformers import failed: {e}")
+        return False
+
+    try:
+        import faiss
+        print(f"βœ… FAISS: {faiss.__version__}")
+    except ImportError as e:
+        print(f"❌ FAISS import failed: {e}")
+        return False
+
+    try:
+        import rank_bm25
+        print("βœ… Rank BM25")
+    except ImportError as e:
+        print(f"❌ Rank BM25 import failed: {e}")
+        return False
+
+    try:
+        import pypdf
+        print(f"βœ… PyPDF: {pypdf.__version__}")
+    except ImportError as e:
+        print(f"❌ PyPDF import failed: {e}")
+        return False
+
+    return True
+
+
+def test_rag_system():
+    """Test the RAG system"""
+    print("\nπŸ” Testing RAG system...")
+
+    try:
+        from rag_system import SimpleRAGSystem
+
+        # Test initialization
+        rag = SimpleRAGSystem()
+        print("βœ… RAG system initialized")
+
+        # Test stats
+        stats = rag.get_stats()
+        print(f"βœ… Stats retrieved: {stats}")
+
+        return True
+
+    except Exception as e:
+        print(f"❌ RAG system test failed: {e}")
+        return False
+
+
+def test_pdf_processor():
+    """Test the PDF processor"""
+    print("\nπŸ” Testing PDF processor...")
+
+    try:
+        from pdf_processor import SimplePDFProcessor
+
+        # Test initialization
+        processor = SimplePDFProcessor()
+        print("βœ… PDF processor initialized")
+
+        # Test query preprocessing
+        processed_query = processor.preprocess_query("What is the revenue?")
+        print(f"βœ… Query preprocessing: '{processed_query}'")
+
+        return True
+
+    except Exception as e:
+        print(f"❌ PDF processor test failed: {e}")
+        return False
+
+
+def test_model_loading():
+    """Test if models can be loaded"""
+    print("\nπŸ” Testing model loading...")
+
+    try:
+        from sentence_transformers import SentenceTransformer
+        from transformers import AutoTokenizer, AutoModelForCausalLM
+
+        # Test embedding model
+        embedder = SentenceTransformer("all-MiniLM-L6-v2")
+        print("βœ… Embedding model loaded")
+
+        # Test tokenizer
+        tokenizer = AutoTokenizer.from_pretrained(
+            "Qwen/Qwen2.5-1.5B-Instruct", trust_remote_code=True
+        )
+        print("βœ… Tokenizer loaded")
+
+        # Test model (CPU only for testing)
+        model = AutoModelForCausalLM.from_pretrained(
+            "Qwen/Qwen2.5-1.5B-Instruct",
+            trust_remote_code=True,
+            torch_dtype="auto",
+            device_map="cpu",
+        )
+        print("βœ… Generative model loaded")
+
+        return True
+
+    except Exception as e:
+        print(f"❌ Model loading failed: {e}")
+        return False
+
+
+def test_streamlit_app():
+    """Test if the Streamlit app can be imported"""
+    print("\nπŸ” Testing Streamlit app...")
+
+    try:
+        # Test if app.py can be imported
+        import app
+        print("βœ… Streamlit app imported successfully")
+
+        return True
+
+    except Exception as e:
+        print(f"❌ Streamlit app test failed: {e}")
+        return False
+
+
+def test_file_structure():
+    """Test if all required files exist"""
+    print("\nπŸ” Testing file structure...")
+
+    required_files = [
+        "app.py",
+        "rag_system.py",
+        "pdf_processor.py",
+        "requirements.txt",
+        "README.md",
+    ]
+
+    missing_files = []
+    for file in required_files:
+        if os.path.exists(file):
+            print(f"βœ… {file}")
+        else:
+            print(f"❌ {file} (missing)")
+            missing_files.append(file)
+
+    if missing_files:
+        print(f"❌ Missing files: {missing_files}")
+        return False
+
+    return True
+
+
+def test_requirements():
+    """Test if requirements.txt is valid"""
+    print("\nπŸ” Testing requirements.txt...")
+
+    try:
+        with open("requirements.txt", "r") as f:
+            requirements = f.read()
+
+        # Check for essential packages
+        essential_packages = [
+            "streamlit",
+            "torch",
+            "transformers",
+            "sentence-transformers",
+            "faiss-cpu",
+            "rank-bm25",
+            "pypdf",
+        ]
+
+        missing_packages = []
+        for package in essential_packages:
+            if package in requirements:
+                print(f"βœ… {package}")
+            else:
+                print(f"❌ {package} (missing)")
+                missing_packages.append(package)
+
+        if missing_packages:
+            print(f"❌ Missing packages: {missing_packages}")
+            return False
+
+        return True
+
+    except Exception as e:
+        print(f"❌ Requirements test failed: {e}")
+        return False
+
+
+def main():
+    """Run all tests"""
+    print("πŸš€ Hugging Face Deployment Test\n")
+
+    tests = [
+        ("File Structure", test_file_structure),
+        ("Requirements", test_requirements),
+        ("Imports", test_imports),
+        ("Model Loading", test_model_loading),
+        ("PDF Processor", test_pdf_processor),
+        ("RAG System", test_rag_system),
+        ("Streamlit App", test_streamlit_app),
+    ]
+
+    results = []
+    for test_name, test_func in tests:
+        try:
+            result = test_func()
+            results.append((test_name, result))
+        except Exception as e:
+            print(f"❌ {test_name} test failed with exception: {e}")
+            results.append((test_name, False))
+
+    # Summary
+    print("\n" + "=" * 50)
262
+ print("πŸ“Š Test Results Summary")
263
+ print("=" * 50)
264
+
265
+ passed = 0
266
+ total = len(results)
267
+
268
+ for test_name, result in results:
269
+ status = "βœ… PASS" if result else "❌ FAIL"
270
+ print(f"{test_name:20} {status}")
271
+ if result:
272
+ passed += 1
273
+
274
+ print(f"\nOverall: {passed}/{total} tests passed")
275
+
276
+ if passed == total:
277
+ print("πŸŽ‰ All tests passed! Ready for Hugging Face deployment.")
278
+ print("\nNext steps:")
279
+ print("1. Create a new Hugging Face Space")
280
+ print("2. Upload all files from this directory")
281
+ print("3. Set the SDK to 'Streamlit'")
282
+ print("4. Deploy and test your RAG system!")
283
+ else:
284
+ print("⚠️ Some tests failed. Please fix the issues before deployment.")
285
+ print("\nTroubleshooting:")
286
+ print("1. Install missing dependencies: pip install -r requirements.txt")
287
+ print("2. Check file permissions and paths")
288
+ print("3. Verify model download permissions")
289
+ print("4. Test locally first: streamlit run app.py")
290
+
291
+
292
+ if __name__ == "__main__":
293
+ main()
test_docker.py ADDED
@@ -0,0 +1,290 @@
+ #!/usr/bin/env python3
+ """
+ Test script for Docker deployment
+
+ This script tests if all components are working correctly for Docker deployment.
+ """
+
+ import os
+ import sys
+ import subprocess
+ from pathlib import Path
+
+
+ def test_dockerfile():
+     """Test if Dockerfile exists and is valid"""
+     print("πŸ” Testing Dockerfile...")
+
+     dockerfile_path = Path("Dockerfile")
+     if not dockerfile_path.exists():
+         print("❌ Dockerfile not found")
+         return False
+
+     try:
+         with open(dockerfile_path, "r") as f:
+             content = f.read()
+
+         # Check for essential Dockerfile components
+         required_components = [
+             "FROM python:",
+             "WORKDIR /app",
+             "COPY requirements.txt",
+             "RUN pip install",
+             "COPY .",
+             "EXPOSE 8501",
+             'CMD ["streamlit"',
+         ]
+
+         missing_components = []
+         for component in required_components:
+             if component in content:
+                 print(f"βœ… {component}")
+             else:
+                 print(f"❌ {component} (missing)")
+                 missing_components.append(component)
+
+         if missing_components:
+             print(f"❌ Missing Dockerfile components: {missing_components}")
+             return False
+
+         return True
+
+     except Exception as e:
+         print(f"❌ Dockerfile test failed: {e}")
+         return False
+
+
+ def test_dockerignore():
+     """Test if .dockerignore exists"""
+     print("\nπŸ” Testing .dockerignore...")
+
+     dockerignore_path = Path(".dockerignore")
+     if dockerignore_path.exists():
+         print("βœ… .dockerignore exists")
+         return True
+     else:
+         print("⚠️ .dockerignore not found (optional but recommended)")
+         return True
+
+
+ def test_docker_compose():
+     """Test if docker-compose.yml exists"""
+     print("\nπŸ” Testing docker-compose.yml...")
+
+     compose_path = Path("docker-compose.yml")
+     if compose_path.exists():
+         print("βœ… docker-compose.yml exists")
+         return True
+     else:
+         print("⚠️ docker-compose.yml not found (optional)")
+         return True
+
+
+ def test_docker_build():
+     """Test Docker build locally"""
+     print("\nπŸ” Testing Docker build...")
+
+     try:
+         # Test Docker build
+         result = subprocess.run(
+             ["docker", "build", "-t", "rag-system-test", "."],
+             capture_output=True,
+             text=True,
+             timeout=300,  # 5 minutes timeout
+         )
+
+         if result.returncode == 0:
+             print("βœ… Docker build successful")
+             return True
+         else:
+             print(f"❌ Docker build failed: {result.stderr}")
+             return False
+
+     except subprocess.TimeoutExpired:
+         print("❌ Docker build timed out")
+         return False
+     except FileNotFoundError:
+         print("⚠️ Docker not installed or not in PATH")
+         return False
+     except Exception as e:
+         print(f"❌ Docker build test failed: {e}")
+         return False
+
+
+ def test_docker_run():
+     """Test Docker run locally"""
+     print("\nπŸ” Testing Docker run...")
+
+     try:
+         # Test Docker run (brief test)
+         result = subprocess.run(
+             [
+                 "docker",
+                 "run",
+                 "--rm",
+                 "-d",
+                 "-p",
+                 "8501:8501",
+                 "--name",
+                 "rag-test",
+                 "rag-system-test",
+             ],
+             capture_output=True,
+             text=True,
+             timeout=30,
+         )
+
+         if result.returncode == 0:
+             print("βœ… Docker run successful")
+
+             # Clean up
+             subprocess.run(["docker", "stop", "rag-test"], capture_output=True)
+             return True
+         else:
+             print(f"❌ Docker run failed: {result.stderr}")
+             return False
+
+     except subprocess.TimeoutExpired:
+         print("❌ Docker run timed out")
+         return False
+     except FileNotFoundError:
+         print("⚠️ Docker not installed or not in PATH")
+         return False
+     except Exception as e:
+         print(f"❌ Docker run test failed: {e}")
+         return False
+
+
+ def test_file_structure():
+     """Test if all required files exist"""
+     print("\nπŸ” Testing file structure...")
+
+     required_files = [
+         "app.py",
+         "rag_system.py",
+         "pdf_processor.py",
+         "requirements.txt",
+         "Dockerfile",
+     ]
+
+     optional_files = [".dockerignore", "docker-compose.yml", "README.md"]
+
+     missing_required = []
+     missing_optional = []
+
+     for file in required_files:
+         if os.path.exists(file):
+             print(f"βœ… {file}")
+         else:
+             print(f"❌ {file} (missing)")
+             missing_required.append(file)
+
+     for file in optional_files:
+         if os.path.exists(file):
+             print(f"βœ… {file}")
+         else:
+             print(f"⚠️ {file} (optional)")
+             missing_optional.append(file)
+
+     if missing_required:
+         print(f"❌ Missing required files: {missing_required}")
+         return False
+
+     return True
+
+
+ def test_requirements():
+     """Test if requirements.txt is valid"""
+     print("\nπŸ” Testing requirements.txt...")
+
+     try:
+         with open("requirements.txt", "r") as f:
+             requirements = f.read()
+
+         # Check for essential packages
+         essential_packages = [
+             "streamlit",
+             "torch",
+             "transformers",
+             "sentence-transformers",
+             "faiss-cpu",
+             "rank-bm25",
+             "pypdf",
+         ]
+
+         missing_packages = []
+         for package in essential_packages:
+             if package in requirements:
+                 print(f"βœ… {package}")
+             else:
+                 print(f"❌ {package} (missing)")
+                 missing_packages.append(package)
+
+         if missing_packages:
+             print(f"❌ Missing packages: {missing_packages}")
+             return False
+
+         return True
+
+     except Exception as e:
+         print(f"❌ Requirements test failed: {e}")
+         return False
+
+
+ def main():
+     """Run all tests"""
+     print("🐳 Docker Deployment Test\n")
+
+     tests = [
+         ("File Structure", test_file_structure),
+         ("Requirements", test_requirements),
+         ("Dockerfile", test_dockerfile),
+         (".dockerignore", test_dockerignore),
+         ("docker-compose.yml", test_docker_compose),
+         ("Docker Build", test_docker_build),
+         ("Docker Run", test_docker_run),
+     ]
+
+     results = []
+     for test_name, test_func in tests:
+         try:
+             result = test_func()
+             results.append((test_name, result))
+         except Exception as e:
+             print(f"❌ {test_name} test failed with exception: {e}")
+             results.append((test_name, False))
+
+     # Summary
+     print("\n" + "=" * 50)
+     print("πŸ“Š Test Results Summary")
+     print("=" * 50)
+
+     passed = 0
+     total = len(results)
+
+     for test_name, result in results:
+         status = "βœ… PASS" if result else "❌ FAIL"
+         print(f"{test_name:20} {status}")
+         if result:
+             passed += 1
+
+     print(f"\nOverall: {passed}/{total} tests passed")
+
+     if passed == total:
+         print("πŸŽ‰ All tests passed! Ready for Hugging Face Docker deployment.")
+         print("\nNext steps:")
+         print("1. Create a new Hugging Face Space with Docker SDK")
+         print("2. Upload all files from this directory")
+         print("3. Wait for Docker build to complete")
+         print("4. Test your RAG system!")
+     else:
+         print("⚠️ Some tests failed. Please fix the issues before deployment.")
+         print("\nTroubleshooting:")
+         print("1. Install Docker if not available")
+         print("2. Check file permissions and paths")
+         print("3. Verify Dockerfile syntax")
+         print("4. Test Docker build locally: docker build -t rag-system .")
+
+
+ if __name__ == "__main__":
+     main()
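
Note that both test scripts verify `requirements.txt` with a plain substring match (`if package in requirements`), which can produce false positives: the check for `pypdf` would also pass if only `pypdf2` were pinned, or if a package name appeared only in a comment. A stricter sketch, assuming the usual `name[extras]==version` requirement syntax (the function name here is illustrative, not part of the uploaded scripts):

```python
import re

def parse_requirement_names(text: str) -> set:
    """Extract bare, lowercased package names from requirements.txt-style text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before extras/version specifiers.
        match = re.match(r"^[A-Za-z0-9._-]+", line)
        if match:
            names.add(match.group(0).lower())
    return names

requirements = "streamlit>=1.28\npypdf2==3.0  # note: not pypdf\n"
names = parse_requirement_names(requirements)
print("pypdf" in names)   # False: a substring check would wrongly pass here
print("pypdf2" in names)  # True
```

Replacing the substring check with `package in parse_requirement_names(requirements)` keeps the rest of `test_requirements()` unchanged.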