Refer to the Space: ashok75-gakr.hf.space
GAKR AI – Local File‑Aware Chat Assistant
GAKR AI is a local, privacy‑friendly chat assistant that runs entirely on your machine.
It combines a FastAPI backend, a modern web chat UI, and a file‑intelligence pipeline that can read and summarize many file types before generating natural‑language responses.
The assistant itself is text‑only. It never directly sees raw PDFs, images, audio, or videos.
Instead, specialized tools convert files into structured text summaries, and the language model reasons over that text.
✨ Features
🌐 Web Chat Interface
- Clean dark UI with message bubbles and typing indicator
- Auto‑growing input box
- Attach files from camera, gallery, or filesystem
- Works in any modern browser at http://localhost:8080
🧠 Text + File Understanding
- Prompt only → general assistant (explanations, coding help, reasoning)
- Prompt + files → full analysis pipeline:
- Detects file type
- Stores uploads in dataupload/
- Extracts structured facts
- Feeds extracted context + question to the model
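The last step, feeding extracted context plus the question to the model, might look roughly like the sketch below. The function name `build_prompt` and the exact prompt wording are assumptions for illustration; the README does not show how generate.py actually assembles its input.

```python
def build_prompt(question: str, file_summaries: list[dict]) -> str:
    """Combine extracted file context and the user's question into one text prompt.

    Hypothetical helper: the dict keys mirror the fields of the /api/analyze
    response (original_name, kind, summary), but the layout is an assumption.
    """
    if not file_summaries:
        # Prompt-only mode: the question goes to the model unchanged.
        return question

    sections = []
    for item in file_summaries:
        sections.append(
            f"File: {item['original_name']} ({item['kind']})\n{item['summary']}"
        )
    context = "\n\n".join(sections)
    return f"Context extracted from uploaded files:\n{context}\n\nQuestion: {question}"
```

Because the model only ever receives this text, anything the extraction tools miss is invisible to it.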
📂 Multi‑File, Multi‑Type Uploads
Upload multiple files at once:
- Documents: PDF, DOCX, TXT
- Tabular data: CSV, Excel, JSON
- Images: OCR via Tesseract
- Audio: Speech‑to‑text via Whisper
- Video: Audio extraction via ffmpeg → Whisper
💾 Persistent Uploads
- Files saved under dataupload/ by type
- Timestamped, safe filenames
- Automatic directory creation
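A minimal sketch of how a timestamped, safe upload path could be built. The helper name `build_upload_path`, the character whitelist, and the `%Y%m%d_%H%M%S` stamp format are assumptions; file_pipeline.py may do this differently.

```python
import re
from datetime import datetime
from pathlib import Path

UPLOAD_ROOT = Path("dataupload")  # root folder named in this README


def build_upload_path(original_name, kind, now=None, root=UPLOAD_ROOT):
    """Return a timestamped path under root/kind, creating the folder if needed."""
    # Keep only the final path component so "../" tricks cannot escape the folder.
    base = Path(original_name.replace("\\", "/")).name
    # Replace anything outside a conservative whitelist (an assumed policy).
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".") or "upload"
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target_dir = Path(root) / kind
    target_dir.mkdir(parents=True, exist_ok=True)  # automatic directory creation
    return target_dir / f"{stamp}_{safe}"
```

For example, `build_upload_path("my report.pdf", "documents")` yields a path inside dataupload/documents/ whose name starts with the current timestamp.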
🔐 Simple Login Reminder UX
- After 5 guest messages, a popup encourages login
- Logged‑in users are not interrupted
- Login state stored in localStorage
🗂 Project Structure
project_root/
├── run.py # FastAPI backend + template serving
├── load_model.py # Loads the language model once
├── generate.py # generate_response() wrapper
├── file_pipeline.py # File detection, storage, and summarization
├── templates/
│ ├── chat.html # Main chat interface
│ └── auth.html # Login / signup UI
├── dataupload/ # Created at runtime for uploads
│ ├── images/
│ ├── videos/
│ ├── audio/
│ ├── documents/
│ ├── tabular/
│ └── other/
└── requirements.txt
⚙️ Installation
1️⃣ Create & Activate Virtual Environment (Recommended)
python -m venv .venv
source .venv/bin/activate # Linux / macOS
# or
.\.venv\Scripts\activate # Windows
2️⃣ Install Python Dependencies
pip install -r requirements.txt
requirements.txt
fastapi
uvicorn[standard]
python-multipart
torch
transformers
accelerate
safetensors
pandas
numpy
pdfplumber
pymupdf
python-docx
Pillow
pytesseract
openai-whisper
ffmpeg-python
3️⃣ Install System Tools
- Tesseract OCR (for image text extraction)
- ffmpeg (for audio extraction and Whisper)
Install via OS package manager (apt, brew, choco) or official installers.
▶️ Running GAKR AI
Start the Backend
python run.py
Expected output:
🚀 Starting GAKR AI Backend...
✅ Model initialized successfully
🌐 SERVER & CHAT LOCATION
🚀 CHAT INTERFACE: http://localhost:8080
🔧 API DOCUMENTATION: http://localhost:8080/docs
✅ CHAT.HTML SERVED: templates/chat.html
Open the Chat UI
Navigate to:
http://localhost:8080
🔌 API Overview
POST /api/analyze
Request (multipart/form-data)
- api_key (string, required)
- prompt (string, required)
- files (optional, multiple)
Behavior
- No files → General assistant mode
- With files → File‑analysis mode using structured summaries
Response
{
"response": "natural-language answer here",
"context": {
"files": [
{
"original_name": "report.pdf",
"stored_path": "dataupload/documents/20241214_report.pdf",
"kind": "document",
"summary": {
"type": "document",
"char_count": 12345,
"preview": "First 4000 characters..."
}
}
]
},
"status": "success"
}
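On the client side, a response of this shape can be unpacked with the standard library alone. The sketch below assumes the field names shown in the sample response; the helper name `describe_response` is hypothetical, and treating any status other than "success" as a failure is an assumption.

```python
import json


def describe_response(raw: str) -> list[str]:
    """Return the answer followed by one line per analyzed file."""
    payload = json.loads(raw)
    if payload.get("status") != "success":
        # Assumed convention: anything other than "success" is an error.
        raise ValueError(f"request failed: {payload.get('status')!r}")

    lines = [payload["response"]]
    # "context" and "files" may be absent or empty for prompt-only requests.
    for item in payload.get("context", {}).get("files", []):
        lines.append(
            f"{item['original_name']} -> {item['stored_path']} [{item['kind']}]"
        )
    return lines
```

The `raw` string can come from any HTTP client that posts multipart/form-data to /api/analyze.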
🧪 File Intelligence Pipeline
Handled by file_pipeline.py
Type Detection
- Tabular → CSV, XLSX, JSON
- Documents → PDF, DOCX, TXT
- Images → PNG, JPG
- Audio → MP3, WAV
- Video → MP4, MKV
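The mapping above can be expressed as a simple extension lookup. This is a sketch only: the name `detect_kind` and the choice to detect by file extension (rather than by MIME type or file contents) are assumptions, and the "other" fallback is inferred from the dataupload/other/ folder.

```python
from pathlib import Path

# Extensions listed in this README, mapped to the "kind" values
# used in the /api/analyze response.
EXTENSION_KINDS = {
    ".csv": "tabular", ".xlsx": "tabular", ".json": "tabular",
    ".pdf": "document", ".docx": "document", ".txt": "document",
    ".png": "image", ".jpg": "image",
    ".mp3": "audio", ".wav": "audio",
    ".mp4": "video", ".mkv": "video",
}


def detect_kind(filename: str) -> str:
    """Classify a file by its extension, case-insensitively."""
    suffix = Path(filename).suffix.lower()
    return EXTENSION_KINDS.get(suffix, "other")
```

Extension checks are cheap but easy to fool, so a stricter setup might also inspect the file contents.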
Summaries
- Tabular: rows, columns, missing values, stats
- Documents: character count + preview
- Images: dimensions + OCR text
- Audio: duration + transcript preview
- Video: extracted audio analysis
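As an illustration of the simplest case, a document summary with the fields shown in the sample API response could be produced as follows. The function name and the 4000-character default are assumptions, the latter taken from the "First 4000 characters..." preview text.

```python
PREVIEW_CHARS = 4000  # assumed from the sample response's preview text


def summarize_document_text(text: str, preview_chars: int = PREVIEW_CHARS) -> dict:
    """Build a document summary: total character count plus a leading preview."""
    return {
        "type": "document",
        "char_count": len(text),
        "preview": text[:preview_chars],
    }
```

The text itself would first have to be extracted from the PDF, DOCX, or TXT file by the corresponding tool.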
Errors are stored per‑file and never crash the whole request.
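One way to get this per-file isolation is to wrap each summarization call in its own try/except. The sketch below is an assumption about the approach, not the actual code; `summarize_all` and the result keys `path`, `summary`, and `error` are hypothetical names.

```python
from typing import Callable


def summarize_all(paths: list[str], summarize: Callable[[str], dict]) -> list[dict]:
    """Summarize every file, recording failures instead of raising."""
    results = []
    for path in paths:
        try:
            results.append({"path": path, "summary": summarize(path)})
        except Exception as exc:
            # Broad on purpose: one unreadable file must not abort the others.
            results.append({"path": path, "error": f"{type(exc).__name__}: {exc}"})
    return results
```

The caller still receives one entry per upload, so the model can be told which files failed and why.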
🎨 Frontend UX Highlights
- Auto‑growing textarea
- Attachment chips with remove buttons
- Typing indicator
- URL prefill: ?q=your+question
- Generic error message for all backend failures
🔐 Security Notes
- API key is currently a fixed string (for local use)
- For production:
- Use environment variables
- Add real authentication (JWT / sessions)
- Restrict CORS
- Apply upload size limits and cleanup policies
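A minimal sketch of two of these hardening steps, reading the API key from the environment and capping upload size. The variable name `GAKR_API_KEY`, the 20 MiB limit, and all function names are assumptions chosen for illustration.

```python
import hmac
import os

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # hypothetical 20 MiB cap


def load_api_key(env_var: str = "GAKR_API_KEY") -> str:
    """Read the expected API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def is_valid_key(provided: str, expected: str) -> bool:
    """Compare keys in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(provided.encode(), expected.encode())


def within_size_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Accept only uploads whose size is between 0 and the limit, inclusive."""
    return 0 <= size_bytes <= limit
```

These checks would run before any file is written to dataupload/, so oversized or unauthenticated requests are rejected early.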
🚀 Extending GAKR AI
Ideas:
- Per‑user chat & file history (database)
- Search across uploaded documents
- External API integrations
- HTTPS + reverse proxy deployment
🧠 Philosophy
GAKR AI is an intelligence layer.
Tools translate reality (files, media, data) into structured language.
The language model turns that language into insight, reasoning, and action.