sujal7102003 committed on
Commit a286f95 · verified · 1 Parent(s): f9dc39d

Upload folder using huggingface_hub

Files changed (1):
  1. README.md +54 -0
README.md CHANGED
@@ -0,0 +1,54 @@
+ ---
+ title: TinyLlama Chatbot API
+ emoji: 🦙
+ colorFrom: indigo
+ colorTo: pink
+ sdk: docker
+ sdk_version: "1.0"
+ app_file: main.py
+ pinned: false
+ ---
+
+ # 🚀 FastAPI QLoRA Chatbot
+
+ This project provides a **FastAPI backend** for serving predictions using the **TinyLlama** model, fine-tuned with QLoRA for instruction-based question answering.
+
+ It also includes a clean, responsive **Jinja2-based frontend** for querying the model interactively.
+
+ ---
+
+ ## 🔧 Features
+
+ - ✅ QLoRA-finetuned inference endpoint
+ - ✅ HTML frontend built with Jinja2
+ - ✅ FastAPI + Uvicorn backend
+ - ✅ Docker-ready for Hugging Face Spaces
+ - ✅ Hugging Face cache and model offloading for low-RAM environments
+
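Since the Dockerfile itself is not part of this diff, the sketch below shows what a minimal Spaces-compatible one might look like. Everything here (base image, file layout, the `main:app` entry point inferred from `app_file: main.py`) is an assumption, not the repo's actual configuration; Spaces does route traffic to port 7860 by default.

```dockerfile
# Hypothetical Dockerfile sketch; the real one is not shown in this diff.
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Spaces expects the app to listen on port 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```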
+ ---
+
+ ## 📦 Tech Stack
+
+ - FastAPI + Uvicorn
+ - Hugging Face Transformers + PEFT (QLoRA)
+ - PyTorch (FP16)
+ - Jinja2 Templates + HTML + JS (Vanilla)
+
+ ---
+
+ ## 📌 API Endpoints
+
+ | Method | Endpoint         | Description                   |
+ |--------|------------------|-------------------------------|
+ | `GET`  | `/`              | Serves Jinja2 frontend (UI)   |
+ | `POST` | `/predict/qlora` | Runs QLoRA inference on input |
+
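As a usage illustration, the `/predict/qlora` endpoint could be called from a plain-stdlib Python client like the sketch below. The JSON field name `prompt` and the local host/port are assumptions; the actual request schema is defined in `main.py`.

```python
import json
import urllib.request

# Hypothetical request body: the field name "prompt" is an assumption,
# not taken from this repo; check main.py for the actual schema.
payload = json.dumps({"prompt": "What is QLoRA?"}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000/predict/qlora",  # assumed host/port for a local Uvicorn run
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```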
+ ---
+
+ ## 🚀 How to Run
+
+ ### Locally
+
+ 1. **Install Python dependencies**:
+    ```bash
+    pip install -r requirements.txt
+    ```