kerdos-llm-rag-api / README.md

Bhaskar Ram

feat: Kerdos AI RAG API v1.0

b1a3dce about 2 months ago

metadata

title: Kerdos AI — Custom LLM RAG API
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
  - rag
  - document-qa
  - fastapi
  - llama
  - faiss
  - nlp
  - question-answering
  - kerdos
  - private-llm
  - api

🤖 Kerdos AI — Custom LLM RAG API

A REST API by Kerdos Infrasoft Private Limited. Upload documents. Ask questions. Get answers strictly grounded in your data.

✨ Features

Feature	Details
📄 Multi-format	PDF, DOCX, TXT, MD, CSV
🧠 LLM	`meta-llama/Llama-3.1-8B-Instruct` via HF Inference Router
🔒 Grounded	Answers only from your uploaded documents
💬 Multi-turn	Conversation history per session
⚡ Fast	`all-MiniLM-L6-v2` + FAISS in-memory
🔑 Session-based	Each client gets an isolated FAISS index
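
To illustrate what "an isolated FAISS index per session" means in practice, here is a minimal pure-Python sketch. It is a stand-in, not the service's code: it swaps FAISS for an exhaustive inner-product search over unit-length vectors, and the names `SessionIndex`, `index_for` and `indexes` are hypothetical. The real service embeds chunks with `all-MiniLM-L6-v2`; the two-dimensional vectors in the usage note below are only placeholders.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


class SessionIndex:
    """Toy in-memory index: exhaustive inner-product search over stored vectors."""

    def __init__(self) -> None:
        self._vectors: List[List[float]] = []
        self._chunks: List[str] = []

    def add(self, vector: Sequence[float], chunk: str) -> None:
        """Store a normalized embedding alongside the text chunk it represents."""
        self._vectors.append(normalize(vector))
        self._chunks.append(chunk)

    def search(self, query: Sequence[float], k: int = 3) -> List[Tuple[float, str]]:
        """Return up to k (score, chunk) pairs, best match first."""
        q = normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), chunk)
            for v, chunk in zip(self._vectors, self._chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# One index per session id, so a query never sees another client's chunks.
indexes: Dict[str, SessionIndex] = {}


def index_for(session_id: str) -> SessionIndex:
    """Fetch the index for a session, creating an empty one on first use."""
    return indexes.setdefault(session_id, SessionIndex())
```

For example, after `index_for("s1").add([1.0, 0.0], "alpha")`, a search in session `"s2"` returns nothing, because each session id maps to its own index.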

📡 API Reference

Interactive docs → /docs (Swagger UI)

Method	Path	Description
`POST`	`/sessions`	Create a session → get `session_id`
`GET`	`/sessions/{id}`	Session status
`DELETE`	`/sessions/{id}`	Delete session
`POST`	`/sessions/{id}/documents`	Upload & index files
`POST`	`/sessions/{id}/chat`	Ask a question
`DELETE`	`/sessions/{id}/history`	Clear chat history
`GET`	`/health`	Health check

🔁 Typical Workflow

BASE=https://kerdosdotio-kerdos-llm-rag-api.hf.space

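# 1. Create session; the response contains a session_id
curl -X POST "$BASE/sessions"

# 2. Upload a document (set SESSION_ID to the session_id returned in step 1)
SESSION_ID="<session_id>"
curl -X POST "$BASE/sessions/$SESSION_ID/documents" \
  -F "files=@your_doc.pdf"

# 3. Ask a question
curl -X POST "$BASE/sessions/$SESSION_ID/chat" \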
  -H "Content-Type: application/json" \
  -d '{"question": "Summarise this document", "hf_token": "hf_..."}'
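
The same workflow can be driven from Python. The sketch below uses only the standard library and covers steps 1 and 3; the multipart upload in step 2 is left out to keep it short. The helper names (`create_session_request`, `chat_request`, `send`, `run_workflow`) are hypothetical, and the response shape is an assumption: the API table says session creation returns a `session_id`, so the code reads that key from a JSON body.

```python
import json
import urllib.request

BASE = "https://kerdosdotio-kerdos-llm-rag-api.hf.space"


def create_session_request(base: str = BASE) -> urllib.request.Request:
    """Build the POST /sessions request (step 1)."""
    return urllib.request.Request(f"{base}/sessions", data=b"", method="POST")


def chat_request(session_id: str, question: str, hf_token: str,
                 base: str = BASE) -> urllib.request.Request:
    """Build the POST /sessions/{id}/chat request (step 3)."""
    body = json.dumps({"question": question, "hf_token": hf_token}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/sessions/{session_id}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def run_workflow(question: str, hf_token: str) -> dict:
    """Create a session, then ask one question in it."""
    session = send(create_session_request())
    # Assumption: the id is returned under the key "session_id".
    return send(chat_request(session["session_id"], question, hf_token))
```

Calling `run_workflow("Summarise this document", "hf_...")` mirrors the curl commands above, minus the document upload.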

⚙️ Environment / Secrets

Set these in Settings → Variables and secrets of this Space:

Secret	Description
`HF_TOKEN`	Your Hugging Face token (Write access + Llama 3.1 licence accepted)
`SESSION_TTL_MINUTES`	Session expiry (default: 60)
`MAX_UPLOAD_MB`	Max upload size in MB (default: 50)
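
As a rough illustration of how these settings could be consumed, the sketch below reads the three variables with the defaults listed in the table and applies the TTL to a session timestamp. It is not taken from `api.py`: the function names `load_settings` and `is_expired` are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
import os
import time
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the Space's variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # SESSION_TTL_MINUTES defaults to 60; stored here in seconds.
        "session_ttl_seconds": int(env.get("SESSION_TTL_MINUTES", "60")) * 60,
        # MAX_UPLOAD_MB defaults to 50; assumes 1 MB = 1024 * 1024 bytes.
        "max_upload_bytes": int(env.get("MAX_UPLOAD_MB", "50")) * 1024 * 1024,
        # HF_TOKEN has no default; None means it was not configured.
        "hf_token": env.get("HF_TOKEN"),
    }


def is_expired(last_used: float, ttl_seconds: int,
               now: Optional[float] = None) -> bool:
    """Return True once more than ttl_seconds have passed since last_used."""
    now = time.time() if now is None else now
    return now - last_used > ttl_seconds
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check: `load_settings({})` yields a 3600-second TTL and a 50 MB upload limit.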

🏗️ Architecture

FastAPI (api.py)
  ├── SessionStore — UUID sessions, TTL, per-session lock
  └── RAGSession
        ├── parse_file()       — PDF/DOCX/TXT/CSV
        ├── chunk_text()       — 512-char chunks, 64 overlap
        ├── all-MiniLM-L6-v2   — embeddings
        ├── FAISS              — in-memory vector search
        └── call_llm()         — HF Router → Llama 3.1 8B
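
The chunking step in the diagram (512-character chunks with a 64-character overlap) can be sketched as a sliding window. This is one plausible reading of those two numbers, not the actual `chunk_text()` from `api.py`: the signature and the handling of the final short chunk are assumptions.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> List[str]:
    """Split text into fixed-size character windows that overlap their neighbours."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so the tail
        # is not emitted a second time as a redundant shorter chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, each window starts 448 characters after the previous one, so the last 64 characters of a chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary intact in at least one chunk.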

💼 Enterprise Edition

Interested in private, on-premise deployment?

🔒 Private LLM Hosting
🎛️ Custom Model Fine-tuning
🛡️ Data Privacy Guarantees
🏷️ White-label Deployments

📧 partnership@kerdos.in | 🌐 kerdos.in/contact