# GAKR AI – Local File‑Aware Chat Assistant

Refer to the Space: ashok75-gakr.hf.space

GAKR AI is a **local, privacy‑friendly chat assistant** that runs entirely on your machine. It combines a **FastAPI backend**, a modern **web chat UI**, and a **file‑intelligence pipeline** that can read and summarize many file types before generating natural‑language responses.

The assistant itself is **text‑only**. It never directly sees raw PDFs, images, audio, or videos. Instead, specialized tools convert files into **structured text summaries**, and the language model reasons over that text.

---

## ✨ Features

### 🌐 Web Chat Interface

- Clean dark UI with message bubbles and typing indicator
- Auto‑growing input box
- Attach files from camera, gallery, or filesystem
- Works in any modern browser at **http://localhost:8080**

### 🧠 Text + File Understanding

- **Prompt only** → general assistant (explanations, coding help, reasoning)
- **Prompt + files** → full analysis pipeline:
  - Detects file type
  - Stores uploads in `dataupload/`
  - Extracts structured facts
  - Feeds extracted context + question to the model

### 📂 Multi‑File, Multi‑Type Uploads

Upload multiple files at once:

- Documents: PDF, DOCX, TXT
- Tabular data: CSV, Excel, JSON
- Images: OCR via Tesseract
- Audio: Speech‑to‑text via Whisper
- Video: Audio extraction via ffmpeg → Whisper

### 💾 Persistent Uploads

- Files saved under `dataupload/` by type
- Timestamped, safe filenames
- Automatic directory creation

### 🔐 Simple Login Reminder UX

- After **5 guest messages**, a popup encourages login
- Logged‑in users are not interrupted
- Login state stored in `localStorage`

---

## 🗂 Project Structure

```
project_root/
├── run.py                 # FastAPI backend + template serving
├── load_model.py          # Loads the language model once
├── generate.py            # generate_response() wrapper
├── file_pipeline.py       # File detection, storage, and summarization
├── templates/
│   ├── chat.html          # Main chat interface
│   └── auth.html          # Login / signup UI
├── dataupload/            # Created at runtime for uploads
│   ├── images/
│   ├── videos/
│   ├── audio/
│   ├── documents/
│   ├── tabular/
│   └── other/
└── requirements.txt
```

---

## ⚙️ Installation

### 1️⃣ Create & Activate Virtual Environment (Recommended)

```bash
python -m venv .venv
source .venv/bin/activate      # Linux / macOS
# or
.\.venv\Scripts\activate       # Windows
```

### 2️⃣ Install Python Dependencies

```bash
pip install -r requirements.txt
```

**requirements.txt**

```
fastapi
uvicorn[standard]
python-multipart
torch
transformers
accelerate
safetensors
pandas
numpy
pdfplumber
pymupdf
python-docx
Pillow
pytesseract
openai-whisper
ffmpeg-python
```

### 3️⃣ Install System Tools

- **Tesseract OCR** (for image text extraction)
- **ffmpeg** (for audio extraction and Whisper)

Install via OS package manager (`apt`, `brew`, `choco`) or official installers.

---

## ▶️ Running GAKR AI

### Start the Backend

```bash
python run.py
```

Expected output:

```
🚀 Starting GAKR AI Backend...
✅ Model initialized successfully
🌐 SERVER & CHAT LOCATION
🚀 CHAT INTERFACE: http://localhost:8080
🔧 API DOCUMENTATION: http://localhost:8080/docs
✅ CHAT.HTML SERVED: templates/chat.html
```

### Open the Chat UI

Navigate to:

```
http://localhost:8080
```

---

## 🔌 API Overview

### POST `/api/analyze`

**Request** (`multipart/form-data`)

- `api_key` (string, required)
- `prompt` (string, required)
- `files` (optional, multiple)

**Behavior**

- No files → General assistant mode
- With files → File‑analysis mode using structured summaries

**Response**

```json
{
  "response": "natural-language answer here",
  "context": {
    "files": [
      {
        "original_name": "report.pdf",
        "stored_path": "dataupload/documents/20241214_report.pdf",
        "kind": "document",
        "summary": {
          "type": "document",
          "char_count": 12345,
          "preview": "First 4000 characters..."
        }
      }
    ]
  },
  "status": "success"
}
```

---

## 🧪 File Intelligence Pipeline

Handled by `file_pipeline.py`.

### Type Detection

- Tabular → CSV, XLSX, JSON
- Documents → PDF, DOCX, TXT
- Images → PNG, JPG
- Audio → MP3, WAV
- Video → MP4, MKV

### Summaries

- **Tabular**: rows, columns, missing values, stats
- **Documents**: character count + preview
- **Images**: dimensions + OCR text
- **Audio**: duration + transcript preview
- **Video**: extracted audio analysis

Errors are stored per‑file and never crash the whole request.

---

## 🎨 Frontend UX Highlights

- Auto‑growing textarea
- Attachment chips with remove buttons
- Typing indicator
- URL prefill: `?q=your+question`
- Generic error message for all backend failures

---

## 🔐 Security Notes

- API key is currently a fixed string (for local use)
- For production:
  - Use environment variables
  - Add real authentication (JWT / sessions)
  - Restrict CORS
  - Apply upload size limits and cleanup policies

---

## 🚀 Extending GAKR AI

Ideas:

- Per‑user chat & file history (database)
- Search across uploaded documents
- External API integrations
- HTTPS + reverse proxy deployment

---

## 🧠 Philosophy

**GAKR AI is an intelligence layer.**

Tools translate reality (files, media, data) into structured language. The language model turns that language into insight, reasoning, and action.
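
As a concrete illustration of the type detection and persistent upload rules described earlier, the sketch below maps a file extension to a kind and builds a timestamped, sanitized storage path. It is not taken from `file_pipeline.py`: the names `detect_kind`, `safe_name`, and `storage_path` are hypothetical, and the date-only stamp is inferred from the `20241214_report.pdf` path in the response example.

```python
import re
from datetime import datetime
from pathlib import Path

# Extensions from the "Type Detection" list; anything else falls back to "other".
KIND_BY_EXTENSION = {
    ".csv": "tabular", ".xlsx": "tabular", ".json": "tabular",
    ".pdf": "document", ".docx": "document", ".txt": "document",
    ".png": "image", ".jpg": "image",
    ".mp3": "audio", ".wav": "audio",
    ".mp4": "video", ".mkv": "video",
}

# Folder names from the project structure under dataupload/.
FOLDER_BY_KIND = {
    "tabular": "tabular", "document": "documents", "image": "images",
    "audio": "audio", "video": "videos", "other": "other",
}


def detect_kind(filename: str) -> str:
    """Classify a file by its extension (case-insensitive)."""
    return KIND_BY_EXTENSION.get(Path(filename).suffix.lower(), "other")


def safe_name(filename: str) -> str:
    """Drop directory components and replace unusual characters (assumed policy)."""
    base = Path(filename.replace("\\", "/")).name
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def storage_path(filename: str, now: datetime, root: str = "dataupload") -> Path:
    """Build <root>/<folder>/<YYYYMMDD>_<safe name> for an uploaded file."""
    folder = FOLDER_BY_KIND[detect_kind(filename)]
    return Path(root) / folder / f"{now.strftime('%Y%m%d')}_{safe_name(filename)}"


if __name__ == "__main__":
    print(storage_path("report.pdf", datetime(2024, 12, 14)).as_posix())
```

Before writing the upload, a caller would typically run `path.parent.mkdir(parents=True, exist_ok=True)` to get the automatic directory creation mentioned under Persistent Uploads.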
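
The document summary shape (character count plus preview) and the rule that errors are stored per file can be sketched together. This is a minimal illustration for plain-text files only, under stated assumptions: `summarize_text_document`, `summarize_files`, and the `error` key are hypothetical names, and the 4000-character preview length is taken from the response example rather than from the actual pipeline code.

```python
from pathlib import Path

# Preview length assumed from "First 4000 characters..." in the response example.
PREVIEW_CHARS = 4000


def summarize_text_document(path: str) -> dict:
    """Return a document-style summary for a plain-text file."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return {
        "type": "document",
        "char_count": len(text),
        "preview": text[:PREVIEW_CHARS],
    }


def summarize_files(paths: list[str]) -> list[dict]:
    """Summarize each file; a failure is recorded on that file's entry only."""
    results = []
    for path in paths:
        entry = {"original_name": Path(path).name, "kind": "document"}
        try:
            entry["summary"] = summarize_text_document(path)
        except Exception as exc:  # keep going so one bad file cannot fail the request
            entry["error"] = f"{type(exc).__name__}: {exc}"  # hypothetical field name
        results.append(entry)
    return results
```

Catching the exception inside the loop is what keeps a single unreadable upload from failing the whole request: the other files still get summaries, and the model can be told which file could not be processed.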