title: TinyLlama Chatbot API
emoji: 🦙
colorFrom: indigo
colorTo: pink
sdk: docker
sdk_version: '1.0'
app_file: main.py
pinned: false

🚀 FastAPI QLoRA Chatbot

This project provides a FastAPI backend for serving predictions using the TinyLlama model, fine-tuned with QLoRA for instruction-based question answering.

It also includes a clean, responsive Jinja2-based frontend for querying the model interactively.

🔧 Features

✅ QLoRA-finetuned inference endpoint
✅ HTML frontend built using Jinja2
✅ FastAPI + Uvicorn backend
✅ Docker-ready for Hugging Face Spaces
✅ Hugging Face cache and model offloading for low-RAM environments

📦 Tech Stack

FastAPI + Uvicorn
Hugging Face Transformers + PEFT (QLoRA)
PyTorch (FP16)
Jinja2 Templates + HTML + JS (Vanilla)
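The cache and offloading feature listed above could look roughly like the sketch below. It is not taken from this project's `main.py`: the helper names `configure_cache` and `load_qlora_model`, the cache and offload directories, and the model and adapter identifiers are all hypothetical. It assumes `transformers`, `peft`, `torch`, and `accelerate` are installed (the last is needed for `device_map="auto"`).

```python
import os
from pathlib import Path


def configure_cache(cache_dir: str) -> str:
    """Point the Hugging Face cache at a writable directory.

    HF_HOME is read when the Hugging Face libraries are imported,
    so call this before importing transformers or peft.
    """
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(path)
    return str(path)


def load_qlora_model(base_model_id: str, adapter_path: str, offload_dir: str):
    """Load a base model in FP16 and attach a LoRA adapter (illustrative only)."""
    # Heavy imports are deferred so the cache location is set first.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.float16,   # FP16 weights, as in the tech stack
        device_map="auto",           # let accelerate place layers on GPU/CPU/disk
        offload_folder=offload_dir,  # spill layers to disk when RAM is short
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()
    return tokenizer, model
```

A call such as `configure_cache("/tmp/hf_cache")` followed by `load_qlora_model(...)` with your own base model ID and adapter path would be the typical order; the directory shown here is only a placeholder.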

📌 API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Serves the Jinja2 frontend (UI) |
| `POST` | `/predict/qlora` | Runs QLoRA inference on the input |
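As a rough illustration of calling the inference endpoint from Python, the sketch below builds a JSON `POST` request with the standard library only. The request schema is not documented in this README, so the `prompt` field name, the JSON response, and the helper names `build_predict_request` and `predict` are assumptions; check `main.py` for the actual field names.

```python
import json
import urllib.request


def build_predict_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for /predict/qlora.

    The {"prompt": ...} body is an assumed schema, not confirmed by the README.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict/qlora",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, prompt: str, timeout: float = 60.0):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_predict_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict("http://localhost:7860", "What is QLoRA?")` would send the request. Port 7860 is the usual default for Hugging Face Spaces and is assumed here; adjust it to whatever port Uvicorn is bound to.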

🚀 How to Run

Locally

Install Python dependencies:
```
pip install -r requirements.txt
```