metadata
title: Kerdos AI — Custom LLM RAG API
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
- rag
- document-qa
- fastapi
- llama
- faiss
- nlp
- question-answering
- kerdos
- private-llm
- api
🤖 Kerdos AI — Custom LLM RAG API
A REST API by Kerdos Infrasoft Private Limited. Upload documents, ask questions, and get answers strictly grounded in your data.
✨ Features
| Feature | Details |
|---|---|
| 📄 Multi-format | PDF, DOCX, TXT, MD, CSV |
| 🧠 LLM | meta-llama/Llama-3.1-8B-Instruct via HF Inference Router |
| 🔒 Grounded | Answers only from your uploaded documents |
| 💬 Multi-turn | Conversation history per session |
| ⚡ Fast | all-MiniLM-L6-v2 + FAISS in-memory |
| 🔑 Session-based | Each client gets an isolated FAISS index |
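The session-based and multi-turn rows above can be illustrated with a minimal in-memory store. This is only a sketch, not the code in `api.py`: the class and field names are hypothetical, the TTL is assumed to count from session creation (the real service may count from last access), and the per-session FAISS index is left out.

```python
import threading
import time
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Session:
    created_at: float
    # Chat turns belonging to this session only.
    history: List[dict] = field(default_factory=list)
    # Per-session lock so concurrent requests on one session can serialize.
    lock: threading.Lock = field(default_factory=threading.Lock)


class SessionStore:
    """Hypothetical UUID-keyed store with expiry (assumption: TTL from creation)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions: Dict[str, Session] = {}
        self._guard = threading.Lock()  # protects the dictionary itself

    def create(self) -> str:
        session_id = str(uuid.uuid4())
        with self._guard:
            self._sessions[session_id] = Session(created_at=self._clock())
        return session_id

    def get(self, session_id: str) -> Optional[Session]:
        with self._guard:
            session = self._sessions.get(session_id)
            if session is None:
                return None
            if self._clock() - session.created_at > self._ttl:
                # Expired: drop it so its data is no longer reachable.
                del self._sessions[session_id]
                return None
            return session
```

Because every session owns its own `history` list, one client's conversation never leaks into another's; a per-session vector index could be stored on `Session` in the same way.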
📡 API Reference
Interactive docs → /docs (Swagger UI)
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a session → get session_id |
| GET | /sessions/{id} | Session status |
| DELETE | /sessions/{id} | Delete session |
| POST | /sessions/{id}/documents | Upload & index files |
| POST | /sessions/{id}/chat | Ask a question |
| DELETE | /sessions/{id}/history | Clear chat history |
| GET | /health | Health check |
🔁 Typical Workflow
BASE=https://kerdosdotio-kerdos-llm-rag-api.hf.space
# 1. Create a session; the response contains the session_id
curl -X POST "$BASE/sessions"
# 2. Upload a document (replace {session_id} with the value from step 1)
curl -X POST "$BASE/sessions/{session_id}/documents" \
  -F "files=@your_doc.pdf"
# 3. Ask a question (replace hf_... with your Hugging Face token)
curl -X POST "$BASE/sessions/{session_id}/chat" \
  -H "Content-Type: application/json" \
  -d '{"question": "Summarise this document", "hf_token": "hf_..."}'
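The session and chat steps of the curl workflow above can also be sketched in Python with only the standard library. The helper names below are hypothetical, and the JSON fields `question` and `hf_token` are taken from the curl example; the response format is not documented here, so the sketch builds the requests without sending or parsing anything.

```python
import json
import urllib.request

BASE = "https://kerdosdotio-kerdos-llm-rag-api.hf.space"


def create_session_request(base: str = BASE) -> urllib.request.Request:
    """Build the POST /sessions request (hypothetical helper)."""
    return urllib.request.Request(f"{base}/sessions", data=b"", method="POST")


def chat_request(session_id: str, question: str, hf_token: str,
                 base: str = BASE) -> urllib.request.Request:
    """Build the POST /sessions/{id}/chat request with a JSON body."""
    body = json.dumps({"question": question, "hf_token": hf_token}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/sessions/{session_id}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen()` would send it. The document upload step needs a multipart form body, which the standard library does not build directly, so an HTTP client library with multipart support is the more practical choice for that call.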
⚙️ Environment / Secrets
Set these in Settings → Variables and secrets of this Space:
| Secret | Description |
|---|---|
| HF_TOKEN | Your Hugging Face token (Write access + Llama 3.1 licence accepted) |
| SESSION_TTL_MINUTES | Session expiry (default: 60) |
| MAX_UPLOAD_MB | Max upload size in MB (default: 50) |
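As an illustration of how these settings might be consumed, the sketch below reads the three variables with the defaults listed in the table. The `Settings` class and `load_settings` function are hypothetical names, not taken from `api.py`, and treating one MB as 1024 × 1024 bytes is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    hf_token: str
    session_ttl_minutes: int
    max_upload_mb: int

    @property
    def max_upload_bytes(self) -> int:
        # Assumption: 1 MB = 1024 * 1024 bytes.
        return self.max_upload_mb * 1024 * 1024


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the Space variables, falling back to the documented defaults."""
    return Settings(
        hf_token=env.get("HF_TOKEN", ""),
        session_ttl_minutes=int(env.get("SESSION_TTL_MINUTES", "60")),
        max_upload_mb=int(env.get("MAX_UPLOAD_MB", "50")),
    )
```

Accepting the environment as a parameter keeps the function easy to exercise with a plain dictionary instead of real process variables.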
🏗️ Architecture
FastAPI (api.py)
├── SessionStore: UUID sessions, TTL, per-session lock
└── RAGSession
    ├── parse_file(): PDF/DOCX/TXT/CSV
    ├── chunk_text(): 512-char chunks, 64 overlap
    ├── all-MiniLM-L6-v2: embeddings
    ├── FAISS: in-memory vector search
    └── call_llm(): HF Router → Llama 3.1 8B
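The chunking step in the diagram (512-character chunks with a 64-character overlap) could look roughly like the following. This is a sketch under stated assumptions rather than the actual `chunk_text()` from `api.py`: it assumes a plain sliding window over characters with no sentence or word boundary handling.

```python
from typing import List


def chunk_text(text: str, size: int = 512, overlap: int = 64) -> List[str]:
    """Split text into fixed-size character windows that share `overlap` chars."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap  # each new window starts this far after the last
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

The overlap means the last 64 characters of one chunk are repeated at the start of the next, so a sentence cut at a chunk boundary is still fully present in at least one chunk that gets embedded.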
💼 Enterprise Edition
Interested in private, on-premise deployment?
- 🔒 Private LLM Hosting
- 🎛️ Custom Model Fine-tuning
- 🛡️ Data Privacy Guarantees
- 🏷️ White-label Deployments
📧 partnership@kerdos.in | 🌐 kerdos.in/contact
© 2024–2025 Kerdos Infrasoft Private Limited | Bengaluru, Karnataka, India