Space: jatinmehra/wasserstoff-AiInternTask

Jatin Mehra committed on Jun 11, 2025

Commit

535ca47

1 Parent(s): f255569

Add comprehensive documentation including API reference, development guide, and index

Files changed (5)

README.md +14 -0
docs/API.md +231 -0
docs/DEVELOPMENT.md +374 -0
docs/README.md +169 -0
docs/index.md +86 -0

README.md CHANGED

@@ -578,6 +578,20 @@ docker-compose up -d
 - **🌐 WebSocket Support**: Real-time chat updates and live document processing
 - **🧠 Model Upgrades**: Integration with latest embedding and LLM models
+## 📚 Documentation
+Comprehensive documentation is available in the `docs/` directory:
+- **[📖 Documentation Index](docs/index.md)** - Complete documentation overview
+- **[🏗️ Architecture & Quick Start](docs/README.md)** - Project architecture with a Mermaid diagram
+- **[🔌 API Reference](docs/API.md)** - REST API endpoints and examples
+- **[💻 Development Guide](docs/DEVELOPMENT.md)** - Contributing and development setup
+### Interactive API Documentation
+When the server is running, visit:
+- **Swagger UI**: http://localhost:8000/docs
+- **ReDoc**: http://localhost:8000/redoc
 ## 📄 License
 This project is licensed under the **Apache License 2.0** - see the [LICENSE](LICENSE) file for complete details.

docs/API.md ADDED

	@@ -0,0 +1,231 @@

+# API Documentation
+This document provides a quick reference for the RAG Chat Application REST API endpoints.
+## Base URL
+```
+http://localhost:8000
+```
+## Authentication
+Most endpoints require a GROQ API key to be configured:
+```http
+POST /set-api-key
+Content-Type: application/json
+{
+  "api_key": "your_groq_api_key_here"
+}
+```
+## Core Endpoints
+### Document Processing
+#### Upload Files
+```http
+POST /upload-files
+Content-Type: multipart/form-data
+# Form data with file uploads
+files: [file1.pdf, file2.txt, ...]
+```
+**Response:**
+```json
+{
+  "total_files": 5,
+  "total_documents": 12,
+  "total_chunks": 87,
+  "file_types": ["pdf", "txt", "py"],
+  "type_counts": {"pdf": 3, "txt": 1, "py": 1}
+}
+```
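A client may want to sanity-check this payload before relying on it. The sketch below assumes, without confirmation from the source, that `type_counts` sums to `total_files` and that `file_types` lists the same extensions. `check_upload_summary` is a hypothetical helper name, not part of the application.

```python
def check_upload_summary(summary: dict) -> bool:
    """Return True if the per-type counts agree with the totals (assumed invariant)."""
    type_counts = summary.get("type_counts", {})
    # Assumption: every processed file is counted under exactly one type.
    counts_match = sum(type_counts.values()) == summary.get("total_files")
    # Assumption: file_types holds the same extensions as the type_counts keys.
    types_match = sorted(type_counts) == sorted(summary.get("file_types", []))
    return counts_match and types_match


# Sample payload copied from the upload response above.
sample = {
    "total_files": 5,
    "total_documents": 12,
    "total_chunks": 87,
    "file_types": ["pdf", "txt", "py"],
    "type_counts": {"pdf": 3, "txt": 1, "py": 1},
}
print(check_upload_summary(sample))
```

If the check fails for a real response, the assumed invariant may simply not hold for this API, so treat a mismatch as a prompt to inspect the payload rather than as proof of a bug.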
+#### Process Directory
+```http
+POST /process-directory
+Content-Type: application/x-www-form-urlencoded
+directory_path=/path/to/documents
+```
+### Chat Interface
+#### Send Chat Message
+```http
+POST /chat
+Content-Type: application/json
+{
+  "message": "What is the main topic of the documents?"
+}
+```
+**Response:**
+```json
+{
+  "response": "Based on the documents, the main topics include...",
+  "citations": [
+    {
+      "content": "relevant excerpt from document",
+      "citation": "/path/to/source/file.pdf",
+      "type": "pdf",
+      "score": 0.85
+    }
+  ],
+  "themes": {
+    "key_themes": ["AI", "Machine Learning", "RAG"],
+    "analysis": "The documents focus on AI and ML concepts..."
+  },
+  "timestamp": "2025-06-11T10:30:00.123456"
+}
+```
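To show how a client might consume the `citations` array, here is a minimal sketch that turns a chat response into display lines. The field names come from the sample response above; the helper name `format_citations` and the `min_score` filter are illustrative assumptions, not part of the API.

```python
from typing import List


def format_citations(chat_response: dict, min_score: float = 0.0) -> List[str]:
    """Return one display line per citation, highest score first."""
    kept = [
        c for c in chat_response.get("citations", [])
        if c.get("score", 0.0) >= min_score
    ]
    kept.sort(key=lambda c: c.get("score", 0.0), reverse=True)
    lines = []
    for c in kept:
        source = c.get("citation", "?")
        kind = c.get("type", "?")
        score = c.get("score", 0.0)
        lines.append(f"{source} ({kind}, score {score:.2f})")
    return lines


# Trimmed copy of the sample chat response above.
sample = {
    "response": "Based on the documents, the main topics include...",
    "citations": [
        {
            "content": "relevant excerpt from document",
            "citation": "/path/to/source/file.pdf",
            "type": "pdf",
            "score": 0.85,
        }
    ],
}
for line in format_citations(sample):
    print(line)
```

The source does not say how `score` is scaled, so the `min_score` threshold is left at `0.0` by default and should be tuned against real responses.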
+### Data Management
+#### Get Statistics
+```http
+GET /stats
+```
+**Response:**
+```json
+{
+  "total_files": 10,
+  "total_documents": 25,
+  "total_chunks": 150,
+  "file_types": ["pdf", "txt", "py", "md"],
+  "type_counts": {"pdf": 5, "txt": 3, "py": 1, "md": 1},
+  "processed_at": "2025-06-11 10:30:00"
+}
+```
+#### Get Chat History
+```http
+GET /chat-history
+```
+**Response:**
+```json
+[
+  {
+    "user_message": "What is RAG?",
+    "assistant_response": "RAG stands for Retrieval-Augmented Generation...",
+    "timestamp": "2025-06-11T10:30:00.123456",
+    "citations": [...]
+  }
+]
+```
+#### Clear Chat History
+```http
+DELETE /clear-chat
+```
+### Vector Store Management
+#### Save Vector Store
+```http
+POST /save-vector-store
+```
+**Response:**
+```json
+{
+  "message": "Vector store saved successfully"
+}
+```
+#### Load Vector Store
+```http
+POST /load-vector-store
+```
+**Response:**
+```json
+{
+  "message": "Vector store loaded successfully",
+  "stats": {
+    "total_files": 10,
+    "total_documents": 25,
+    "total_chunks": 150
+  }
+}
+```
+## Frontend Serving
+#### Main Application
+```http
+GET /
+```
+Returns the HTML frontend application.
+## Error Responses
+All endpoints return errors in this format:
+```json
+{
+  "detail": "Error description message"
+}
+```
+Common HTTP status codes:
+- `200` - Success
+- `400` - Bad Request (invalid input)
+- `422` - Validation Error
+- `500` - Internal Server Error
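A client can fold the status code and the `detail` field into a single message. The sketch below is one possible approach; `describe_error` and `STATUS_LABELS` are hypothetical names. Note that FastAPI's default request-validation errors (status `422`) return `detail` as a list of objects rather than a string, so the sketch serializes any non-string value.

```python
import json

# Labels for the status codes listed above.
STATUS_LABELS = {
    400: "Bad Request",
    422: "Validation Error",
    500: "Internal Server Error",
}


def describe_error(status_code: int, body: dict) -> str:
    """Build a one-line error message from a status code and a JSON error body."""
    label = STATUS_LABELS.get(status_code, "Unexpected status")
    detail = body.get("detail", "no detail provided")
    if not isinstance(detail, str):
        # Validation errors carry structured detail; serialize it for display.
        detail = json.dumps(detail)
    return f"{status_code} {label}: {detail}"


print(describe_error(500, {"detail": "Error description message"}))
```

With `requests`, this could be called as `describe_error(response.status_code, response.json())` whenever `response.ok` is false.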
+## Interactive Documentation
+When the server is running, visit:
+- **Swagger UI**: http://localhost:8000/docs
+- **ReDoc**: http://localhost:8000/redoc
+## Examples
+### Complete Workflow
+```bash
+# 1. Set API key
+curl -X POST "http://localhost:8000/set-api-key" \
+  -H "Content-Type: application/json" \
+  -d '{"api_key": "your_groq_key"}'
+# 2. Upload files
+curl -X POST "http://localhost:8000/upload-files" \
+  -F "files=@document1.pdf" \
+  -F "files=@document2.txt"
+# 3. Chat with documents
+curl -X POST "http://localhost:8000/chat" \
+  -H "Content-Type: application/json" \
+  -d '{"message": "Summarize the key points"}'
+# 4. Get statistics
+curl -X GET "http://localhost:8000/stats"
+# 5. Save vector store
+curl -X POST "http://localhost:8000/save-vector-store"
+```
+### Python Client Example
+```python
+import requests
+base_url = "http://localhost:8000"
+# Set API key
+response = requests.post(f"{base_url}/set-api-key",
+                         json={"api_key": "your_groq_key"})
+response.raise_for_status()  # Stop early if the key was rejected
+# Upload files (the context manager closes the file handle after the request)
+with open("document.pdf", "rb") as pdf:
+    response = requests.post(f"{base_url}/upload-files", files={"files": pdf})
+response.raise_for_status()
+# Chat
+response = requests.post(f"{base_url}/chat",
+                         json={"message": "What is this document about?"})
+response.raise_for_status()
+print(response.json())
+```

docs/DEVELOPMENT.md ADDED

	@@ -0,0 +1,374 @@

+# Development Guide
+This guide helps developers understand the codebase and contribute to the RAG Chat Application.
+## 🏗️ Project Structure
+```
+wasserstoff-AiInternTask/
+├── rag_elements/              # 🧠 Core RAG Engine
+│   ├── enhanced_vectordb.py   # Main RAG implementation
+│   └── config.py              # Configuration management
+├── backend/                   # 🚀 FastAPI Production Server
+│   ├── main.py               # App entry point
+│   ├── models.py             # Pydantic schemas
+│   ├── utils.py              # Utilities and state
+│   └── routes/               # API endpoints
+├── frontend/                  # 🎨 Web Interface
+│   ├── index.html            # Main UI
+│   ├── style.css             # Styling
+│   └── script.js             # Frontend logic
+├── tests/                     # 🧪 Test Suite
+└── docs/                      # 📚 Documentation
+```
+## 🔧 Development Setup
+### Prerequisites
+- Python 3.8+
+- Git
+- Text editor/IDE (VS Code recommended)
+### Environment Setup
+```bash
+# Clone repository
+git clone https://github.com/Jatin-Mehra119/wasserstoff-AiInternTask.git
+cd wasserstoff-AiInternTask
+# Create virtual environment (recommended)
+python -m venv venv
+source venv/bin/activate  # Linux/macOS
+# or venv\Scripts\activate  # Windows
+# Install dependencies
+pip install -r requirements.txt
+# Install development dependencies
+pip install -r tests/requirements-test.txt
+# Set up environment variables
+cp .env.example .env  # If the template exists; otherwise create .env by hand
+# Add your GROQ_API_KEY to .env
+```
+### Running in Development Mode
+```bash
+# Start FastAPI with hot reload
+cd backend
+python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000
+# Or run the Streamlit version (from the repository root, not backend/)
+streamlit run streamlit_rag_app.py
+```
+## 🧱 Core Components
+### 1. RAG Engine (`rag_elements/enhanced_vectordb.py`)
+The heart of the application. Key classes and methods:
+```python
+class EnhancedDocumentProcessor:
+    # Method signatures only; bodies are elided with `...`
+    def process_files(self, file_paths): ...                # Multi-format processing
+    def create_enhanced_vector_store(self, documents): ...  # FAISS index creation
+    def search_with_citations(self, query, k=5): ...        # Semantic search
+    def get_chat_response(self, query): ...                 # End-to-end chat
+    def save_vector_store(self, path): ...                  # Persistence
+    def load_vector_store(self, path): ...                  # Restore data
+```
+### 2. FastAPI Backend (`backend/`)
+**Entry Point (`main.py`)**:
+- FastAPI app initialization
+- CORS configuration
+- Route registration
+**Data Models (`models.py`)**:
+- Pydantic schemas for API requests/responses
+- Type validation and serialization
+**Routes (`routes/`)**:
+- `main_routes.py` - Frontend serving, health checks
+- `upload_routes.py` - File upload and processing
+- `chat_routes.py` - Chat interface and AI responses
+- `store_routes.py` - Vector store management
+**Utilities (`utils.py`)**:
+- Global state management
+- Helper functions
+- Error handling utilities
+### 3. Frontend (`frontend/`)
+Modern web interface with:
+- **HTML**: Semantic structure with responsive layout
+- **CSS**: Modern styling with CSS Grid/Flexbox
+- **JavaScript**: Async API calls, real-time updates, file handling
+## 🔄 Data Flow
+### Document Processing Pipeline
+1. **File Upload** → `upload_routes.py`
+2. **Text Extraction** → `enhanced_vectordb.py`
+3. **Chunking** → LangChain text splitters
+4. **Embeddings** → Sentence Transformers
+5. **Indexing** → FAISS vector store
+6. **Metadata Storage** → JSON persistence
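The chunking step (step 3) can be illustrated with a minimal, dependency-free sketch. This is a stand-in for the LangChain text splitter the pipeline actually uses, not its implementation; the function name `chunk_text` and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> List[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # How far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of embedding some text twice.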
+### Chat Pipeline
+1. **User Query** → `chat_routes.py`
+2. **Semantic Search** → FAISS similarity search
+3. **Context Retrieval** → Top-K document chunks
+4. **AI Response** → GROQ API integration
+5. **Citation Generation** → Source attribution
+6. **Response Formatting** → Markdown output
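Steps 2 and 3 amount to ranking stored chunk embeddings by similarity to the query embedding and keeping the best `k`. The sketch below does this by brute force with cosine similarity. It is a conceptual stand-in for the FAISS search, and the source does not state which metric the real index uses, so cosine similarity and the names `cosine` and `top_k` are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          chunk_vecs: Sequence[Sequence[float]],
          k: int = 5) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
chunks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = top_k([1.0, 0.0], chunks, k=2)
print([index for index, _ in result])
```

A vector index such as FAISS serves the same purpose without scoring every chunk in pure Python, which matters once the store holds many thousands of chunks.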
+## 🧪 Testing
+### Running Tests
+```bash
+cd tests
+# Run all tests
+bash run_tests.sh
+# Run specific test files
+python -m pytest test_endpoints_pytest.py -v
+python test_api_endpoints.py
+```
+### Test Structure
+- `test_api_endpoints.py` - Basic API endpoint testing
+- `test_endpoints_pytest.py` - Comprehensive pytest suite
+- `run_tests.sh` - Test runner script
+### Writing Tests
+Follow these patterns:
+```python
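+import httpx
+import pytest
+import requests
+BASE_URL = "http://localhost:8000"  # The server must already be running
+# API endpoint test
+def test_upload_endpoint():
+    # "document.pdf" is a placeholder; point it at any local sample file
+    with open("document.pdf", "rb") as sample:
+        response = requests.post(f"{BASE_URL}/upload-files",
+                                 files={"files": sample})
+    assert response.status_code == 200
+    assert "total_files" in response.json()
+# Pytest test (the asyncio marker requires the pytest-asyncio plugin)
+@pytest.mark.asyncio
+async def test_chat_endpoint():
+    async with httpx.AsyncClient() as client:
+        response = await client.post(f"{BASE_URL}/chat",
+                                   json={"message": "test"})
+        assert response.status_code == 200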
+```
+## 🔌 Adding New Features
+### Adding a New API Endpoint
+1. **Define Pydantic Model** (`models.py`):
+```python
+from typing import Optional
+from pydantic import BaseModel
+class NewFeatureRequest(BaseModel):
+    parameter: str
+    optional_param: Optional[int] = None
+class NewFeatureResponse(BaseModel):
+    result: str
+    success: bool
+```
+2. **Create Route Handler** (`routes/new_routes.py`):
+```python
+from fastapi import APIRouter, HTTPException
+from ..models import NewFeatureRequest, NewFeatureResponse
+router = APIRouter()
+@router.post("/new-feature", response_model=NewFeatureResponse)
+async def new_feature_endpoint(request: NewFeatureRequest):
+    try:
+        # Implementation here
+        return NewFeatureResponse(result="success", success=True)
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+```
+3. **Register Router** (`main.py`):
+```python
+from .routes.new_routes import router as new_router
+app.include_router(new_router)
+```
+4. **Add Frontend Integration** (`frontend/script.js`):
+```javascript
+async function callNewFeature(data) {
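+    const response = await fetch('/new-feature', {
+        method: 'POST',
+        headers: {'Content-Type': 'application/json'},
+        body: JSON.stringify(data)
+    });
+    // fetch() resolves on HTTP error statuses too, so check explicitly
+    if (!response.ok) {
+        throw new Error(`Request failed with status ${response.status}`);
+    }
+    return response.json();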
+}
+```
+### Extending the RAG Engine
+To add new document types or processing capabilities:
+1. **Add File Type Support** (`enhanced_vectordb.py`):
+```python
+def extract_text_from_new_format(self, file_path):
+    # Implement extraction logic
+    return extracted_text
+def process_files(self, file_paths):
+    for file_path in file_paths:
+        if file_path.endswith('.new_format'):
+            text = self.extract_text_from_new_format(file_path)
+            # Process text...
+```
+2. **Update Frontend File Acceptance** (`index.html`):
+```html
+<input type="file" accept=".pdf,.txt,.new_format" multiple>
+```
+## 🎨 Frontend Development
+### Key JavaScript Functions
+- `uploadFiles()` - Handle file uploads with progress
+- `sendMessage()` - Send chat messages and display responses
+- `updateStats()` - Refresh processing statistics
+- `displayCitations()` - Show document sources
+### CSS Architecture
+- Mobile-first responsive design
+- CSS custom properties for theming
+- Flexbox/Grid layouts
+- Component-based styling
+### Adding UI Components
+1. Add HTML structure
+2. Style with CSS classes
+3. Add JavaScript event handlers
+4. Connect to backend APIs
+## 🐛 Debugging
+### Common Issues
+**CORS Errors**:
+- Check `main.py` CORS configuration
+- Ensure frontend runs on allowed origins
+**Import Errors**:
+- Verify Python path and virtual environment
+- Check `requirements.txt` dependencies
+**API Key Issues**:
+- Confirm GROQ API key is set
+- Check environment variable loading
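When diagnosing API key issues, it can help to confirm that the process actually sees the variable. The sketch below checks for `GROQ_API_KEY` without printing the secret; the helper name `groq_key_status` is hypothetical, and the application itself may load the key differently (for example through `/set-api-key`).

```python
import os
from typing import Mapping, Optional


def groq_key_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether GROQ_API_KEY is present without revealing its value."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        return "missing"
    return f"set ({len(key)} characters)"


print(groq_key_status())
```

Variables in a `.env` file only reach `os.environ` if something loads the file first, so a "missing" result with a populated `.env` points at the loading step rather than at the key.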
+### Logging
+Add logging to your code:
+```python
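+import logging
+from fastapi import APIRouter
+logger = logging.getLogger(__name__)
+router = APIRouter()
+@router.post("/endpoint")
+async def endpoint():
+    logger.info("Processing request")
+    try:
+        # Logic here
+        logger.debug("Success")
+    except Exception:
+        # logger.exception logs at ERROR level and includes the traceback
+        logger.exception("Error while processing request")
+        raise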
+```
+## 📝 Code Style Guidelines
+### Python
+- Follow PEP 8
+- Use type hints
+- Add docstrings
+- Maximum line length: 88 characters (Black's default, wider than the 79 that PEP 8 recommends)
+```python
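+from typing import Any, Dict
+# ProcessResult and ProcessingError are illustrative names, not defined in this snippet
+def process_document(file_path: str, options: Dict[str, Any]) -> "ProcessResult":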
+    """
+    Process a document and extract text content.
+    Args:
+        file_path: Path to the document file
+        options: Processing configuration options
+    Returns:
+        ProcessResult containing extracted text and metadata
+    Raises:
+        ProcessingError: If document cannot be processed
+    """
+    # Implementation...
+```
+### JavaScript
+- Use modern ES6+ syntax
+- Prefer `const`/`let` over `var`
+- Use async/await for promises
+- Add JSDoc comments
+```javascript
+/**
+ * Upload files to the server
+ * @param {FileList} files - Files to upload
+ * @returns {Promise<Object>} Upload result
+ */
+async function uploadFiles(files) {
+    // Implementation...
+}
+```
+## 🚀 Deployment
+### Development
+```bash
+python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
+```
+### Production
+```bash
+python -m uvicorn backend.main:app --host 0.0.0.0 --port 8000 --workers 4
+```
+### Docker (if configured)
+```bash
+docker build -t rag-chat-app .
+docker run -p 8000:8000 -e GROQ_API_KEY=your_key rag-chat-app
+```
+## 🤝 Contributing
+1. Fork the repository
+2. Create feature branch: `git checkout -b feature/amazing-feature`
+3. Make changes and add tests
+4. Ensure tests pass: `bash tests/run_tests.sh`
+5. Commit: `git commit -m 'Add amazing feature'`
+6. Push: `git push origin feature/amazing-feature`
+7. Open Pull Request
+### Pull Request Checklist
+- [ ] Code follows style guidelines
+- [ ] Tests added for new functionality
+- [ ] All tests pass
+- [ ] Documentation updated
+- [ ] No breaking changes (or clearly documented)
+## 📚 Additional Resources
+- [FastAPI Documentation](https://fastapi.tiangolo.com/)
+- [FAISS Documentation](https://faiss.ai/)
+- [LangChain Documentation](https://python.langchain.com/)
+- [GROQ API Documentation](https://console.groq.com/docs)

docs/README.md ADDED

	@@ -0,0 +1,169 @@

+# RAG Chat Application - Documentation
+A sophisticated Retrieval-Augmented Generation (RAG) chat application that enables intelligent conversations with your documents.
+## 🏗️ Architecture Overview
+```mermaid
+flowchart TD
+  %% Client Layer
+  subgraph "Client Layer"
+    direction TB
+    WebClient["Web Client (HTML/JS/CSS)"]:::ui
+    StreamlitUI["MVP Streamlit UI"]:::ui
+  end
+  %% Backend Layer
+  subgraph "Backend Layer"
+    direction TB
+    FastAPI["FastAPI Backend"]:::api
+    subgraph routes["Routes"]
+      direction TB
+      MainRoutes["main_routes.py"]:::api
+      UploadRoutes["upload_routes.py"]:::api
+      ChatRoutes["chat_routes.py"]:::api
+      StoreRoutes["store_routes.py"]:::api
+    end
+    Models["models.py"]:::api
+    Utils["utils.py"]:::api
+  end
+  %% RAG Engine Layer
+  subgraph "RAG Engine Layer"
+    direction TB
+    Config["config.py"]:::core
+    CoreEngine["enhanced_vectordb.py"]:::core
+  end
+  %% Persistence Layer
+  VectorStore[(Vector Store<br/>FAISS Index + Metadata)]:::store
+  %% External Services
+  subgraph "External Services"
+    direction TB
+    GROQ["GROQ Vision API"]:::external
+    SentenceModel["SentenceTransformer Model"]:::external
+  end
+  %% Tests
+  subgraph "Automated Tests"
+    direction TB
+    Tests1["test_api_endpoints.py"]:::tests
+    Tests2["test_endpoints_pytest.py"]:::tests
+  end
+  %% Connections
+  WebClient -->|"REST calls (fetch)"| FastAPI
+  MainRoutes -->|serve static| WebClient
+  StreamlitUI -->|in-process calls| CoreEngine
+  FastAPI -->|calls RAG Engine| CoreEngine
+  CoreEngine -->|read/write| VectorStore
+  CoreEngine -->|OCR & LLM requests| GROQ
+  CoreEngine -->|embedding requests| SentenceModel
+  StoreRoutes -->|disk read/write| VectorStore
+  %% Click Events
+  click WebClient "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/frontend/index.html"
+  click WebClient "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/frontend/script.js"
+  click WebClient "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/frontend/style.css"
+  click StreamlitUI "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/streamlit_rag_app.py"
+  click FastAPI "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/main.py"
+  click Models "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/models.py"
+  click Utils "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/utils.py"
+  click MainRoutes "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/routes/main_routes.py"
+  click UploadRoutes "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/routes/upload_routes.py"
+  click ChatRoutes "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/routes/chat_routes.py"
+  click StoreRoutes "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/backend/routes/store_routes.py"
+  click Config "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/rag_elements/config.py"
+  click CoreEngine "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/rag_elements/enhanced_vectordb.py"
+  click Tests1 "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/tests/test_api_endpoints.py"
+  click Tests2 "https://github.com/jatin-mehra119/wasserstoff-aiinterntask/blob/main/tests/test_endpoints_pytest.py"
+  %% Styles
+  classDef ui fill:#E3F2FD,stroke:#1976D2,color:#0D47A1;
+  classDef api fill:#E8F5E9,stroke:#388E3C,color:#1B5E20;
+  classDef core fill:#FFF3E0,stroke:#FB8C00,color:#E65100;
+  classDef store fill:#FFF9C4,stroke:#FBC02D,color:#F57F17;
+  classDef external fill:#ECEFF1,stroke:#607D8B,color:#37474F;
+  classDef tests fill:#F3E5F5,stroke:#8E24AA,color:#4A148C;
+```
+## 📋 Quick Start
+### Prerequisites
+- Python 3.8+
+- GROQ API key (for OCR and chat)
+### Installation & Running
+```bash
+# Clone repository
+git clone https://github.com/Jatin-Mehra119/wasserstoff-AiInternTask.git
+cd wasserstoff-AiInternTask
+# Install dependencies
+pip install -r requirements.txt
+# Run FastAPI backend (--reload restarts on code changes; omit it in production)
+cd backend
+python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload
+# Open http://localhost:8000 in browser
+# Alternative: Run Streamlit MVP (from the repository root, not backend/)
+streamlit run streamlit_rag_app.py
+```
+## 🔧 Architecture Components
+### Core RAG Engine (`rag_elements/`)
+- **`enhanced_vectordb.py`** - Main RAG implementation with document processing, vector search, and AI integration
+- **`config.py`** - Configuration management and settings
+### FastAPI Backend (`backend/`)
+- **`main.py`** - Application entry point and server configuration
+- **`models.py`** - Pydantic data models and API schemas
+- **`utils.py`** - Utilities, state management, and helpers
+- **`routes/`** - Modular API endpoints:
+  - `main_routes.py` - Frontend serving and health
+  - `upload_routes.py` - Document upload and processing
+  - `chat_routes.py` - Chat interface and AI responses
+  - `store_routes.py` - Vector store persistence
+### Frontend (`frontend/`)
+- **`index.html`** - Main application UI
+- **`style.css`** - Responsive design and styling
+- **`script.js`** - Frontend logic and API integration
+### Legacy MVP
+- **`streamlit_rag_app.py`** - Original Streamlit implementation
+## 📊 Data Flow
+1. **Document Upload** → Text extraction → Chunking → Vector embeddings → FAISS index
+2. **Chat Query** → Semantic search → Context retrieval → AI response generation → Citations
+3. **Persistence** → Save/load vector stores with metadata
+## 🔌 Key APIs
+- `POST /upload-files` - Process documents
+- `POST /chat` - Chat with documents
+- `GET /stats` - Processing statistics
+- `POST /save-vector-store` - Persist data
+- `POST /load-vector-store` - Restore data
+## 🧪 Testing
+```bash
+cd tests
+bash run_tests.sh
+```
+## 📚 External Dependencies
+- **FAISS** - Vector similarity search
+- **GROQ** - Vision OCR and conversational AI
+- **LangChain** - Document processing
+- **FastAPI** - Web framework
+- **Sentence Transformers** - Text embeddings
+For detailed information, see the main [README.md](../README.md).

docs/index.md ADDED

	@@ -0,0 +1,86 @@

+# Documentation Index
+Welcome to the RAG Chat Application documentation! This directory contains comprehensive guides to help you understand, use, and contribute to the project.
+## 📚 Documentation Structure
+### Quick Start & Overview
+- **[README.md](README.md)** - Project overview, architecture diagram, and quick start guide
+- **[Main README](../README.md)** - Comprehensive project documentation with detailed features and usage
+### API Reference
+- **[API.md](API.md)** - Complete REST API documentation with examples and curl commands
+### Development
+- **[DEVELOPMENT.md](DEVELOPMENT.md)** - Developer guide for contributing to the project
+## 🎯 Getting Started
+### For Users
+1. Read the [Quick Start](README.md#-quick-start) section
+2. Follow the [Installation & Running](README.md#installation--running) instructions
+3. Review the [API Reference](API.md) for integration details
+### For Developers
+1. Start with the [Development Setup](DEVELOPMENT.md#-development-setup)
+2. Understand the [Project Structure](DEVELOPMENT.md#-project-structure)
+3. Review [Core Components](DEVELOPMENT.md#-core-components)
+4. Check the [Contributing Guidelines](DEVELOPMENT.md#-contributing)
+## 🏗️ Architecture Quick Reference
+The application follows a layered architecture:
+- **Client Layer**: Web frontend + Streamlit MVP
+- **Backend Layer**: FastAPI with modular routes
+- **RAG Engine Layer**: Core document processing and vector search
+- **Persistence Layer**: FAISS vector store with metadata
+- **External Services**: GROQ API and Sentence Transformers
+See the [architecture diagram](README.md#️-architecture-overview) for visual representation.
+## 🔗 Quick Links
+| Topic | Document | Description |
+|-------|----------|-------------|
+| **Overview** | [README.md](README.md) | Architecture and quick start |
+| **API Endpoints** | [API.md](API.md) | REST API reference |
+| **Development** | [DEVELOPMENT.md](DEVELOPMENT.md) | Contributing guidelines |
+| **Main README** | [../README.md](../README.md) | Detailed project documentation |
+| **Tests** | [../tests/README.md](../tests/README.md) | Testing documentation |
+## 🚀 Core Features
+- **Multi-format Document Processing**: PDF, text, images, code files
+- **Intelligent Chat Interface**: AI-powered responses with citations
+- **Vector Search**: FAISS-powered semantic similarity search
+- **Persistence**: Save and load processed document collections
+- **Modern Web UI**: Responsive design with real-time updates
+- **Comprehensive API**: RESTful endpoints with interactive documentation
+## 🛠️ Tech Stack
+- **Backend**: FastAPI, Python 3.8+
+- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
+- **AI/ML**: GROQ API, Sentence Transformers, LangChain
+- **Search**: FAISS vector database
+- **Testing**: pytest, requests
+## 📞 Support
+- **Issues**: [GitHub Issues](https://github.com/Jatin-Mehra119/wasserstoff-AiInternTask/issues)
+- **Discussions**: [GitHub Discussions](https://github.com/Jatin-Mehra119/wasserstoff-AiInternTask/discussions)
+- **API Docs**: http://localhost:8000/docs (when server is running)
+## 📝 Contributing
+We welcome contributions! Please read the [Development Guide](DEVELOPMENT.md#-contributing) for guidelines on:
+- Code style and standards
+- Testing requirements
+- Pull request process
+- Adding new features
+---
+*This documentation is maintained alongside the codebase. For the most up-to-date information, always refer to the latest version in the repository.*