Auto commit at 26-2025-08 2:29:46
d34fab1
---
title: Lily LLM API Server
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "1.0.0"
app_port: 7860
app_file: app.py
pinned: false
---
# 250826 v1.0.3
- cloudtype migration, supabase DB connection, hierarchical memory test
# 250825 v1.0.2 test
- Connected lily context memory to the supabase database
# 250824
- Unified refactoring of the hierarchical memory structure (long-term, mid-term, short-term)
# 250823
- Integrated the context manager, LoRA, RAG system, and the document summarization + context extension system
# 250822
- Tuned polyglot 1.3b LoRA parameters, improved response quality
# 250821
- Improved polyglot 5.8b response speed; explicitly specified the per-model tokenizer config settings JSON variables
# 250820
- Improved lily llm kanana 3b multimodal handling; improved polyglot 1.3b and 5.8b response quality
# 250819 v1.0.1
- Added context window and LoRA; re-downloaded the kanana model following the official documentation, then refactored app_v2 and profiles
# Lily LLM API Server
A FastAPI-based multimodal AI server.
## Features
- Text generation
- Image recognition
- RAG system
- Math formula processing
## API Endpoints
- `GET /health` - Check server status
- `POST /generate` - Text/image generation
- `POST /upload-document` - Upload a document
- `POST /rag-query` - RAG query
## Model
- Kanana-1.5-v-3b-instruct (Korean multimodal)
- Auto-download: gbrabbit/lily-math-model
# Lily LLM API - Hugging Face Spaces
## 🤖 Introduction
Lily LLM API is a high-performance AI API server with multi-model support and a RAG (Retrieval Augmented Generation) system.
### ✨ Key Features
- **🧠 Multimodal AI**: Text and image processing with the Kanana-1.5-v-3b-instruct model
- **📚 RAG system**: Document-based question answering and context retrieval
- **🔍 Vector search**: Fast similarity search based on FAISS
- **📄 Document processing**: Supports various document formats such as PDF, DOCX, and TXT
- **🖼️ Image OCR**: Math formula recognition with LaTeX-OCR
- **⚡ Asynchronous processing**: Celery-based background tasks
- **🌐 RESTful API**: High-performance web API based on FastAPI
### 🚀 Usage
#### 1. Text generation
```python
import requests

# Send the prompt as form data to the /generate endpoint
response = requests.post(
    "https://huggingface.co/spaces/gbrabbit/lily_fast_api/generate",
    data={"prompt": "Hello! How is the weather today?"},
)
print(response.json())
```
#### 2. Query with an image
```python
import requests

# Send the prompt together with an image as multipart form data
with open("image.jpg", "rb") as f:
    response = requests.post(
        "https://huggingface.co/spaces/gbrabbit/lily_fast_api/generate",
        data={"prompt": "What can you see in the image?"},
        files={"image1": f},
    )
print(response.json())
```
#### 3. RAG-based question answering
```python
import requests

# Upload a document
with open("document.pdf", "rb") as f:
    upload_response = requests.post(
        "https://huggingface.co/spaces/gbrabbit/lily_fast_api/upload-document",
        files={"file": f},
        data={"user_id": "your_user_id"},
    )
document_id = upload_response.json()["document_id"]

# RAG query against the uploaded document
response = requests.post(
    "https://huggingface.co/spaces/gbrabbit/lily_fast_api/rag-query",
    json={
        "query": "What is the main content of the document?",
        "user_id": "your_user_id",
        "document_id": document_id,
    },
)
print(response.json())
```
### 📋 API Endpoints
#### Basic endpoints
- `GET /health` - Check server status
- `GET /models` - List available models
- `POST /load-model` - Load a model
- `POST /generate` - Text/image generation
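As a rough illustration of the read-only endpoints above, the following sketch calls `GET /health` and `GET /models` with the standard library only. The base address is the one used in the usage examples; the helper names `endpoint_url` and `get_json` are hypothetical, and the assumption that both endpoints return JSON is not stated in this README.

```python
import json
import urllib.request

# Assumption: same base address as in the usage examples.
BASE_URL = "https://huggingface.co/spaces/gbrabbit/lily_fast_api"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(endpoint_url(path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With these helpers, `get_json("/health")` would fetch the server status and `get_json("/models")` the model list, provided the server is reachable at that address.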
#### RAG system
- `POST /upload-document` - Upload a document
- `POST /rag-query` - RAG-based query
- `GET /documents/{user_id}` - List a user's documents
- `DELETE /document/{document_id}` - Delete a document
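The last two endpoints take their argument as a path parameter. A minimal sketch of building those requests, assuming the same base address as the usage examples and hypothetical helper names, could look like this. Percent-encoding the ID guards against characters such as `/` or spaces changing the route.

```python
import urllib.request
from urllib.parse import quote

# Assumption: same base address as in the usage examples.
BASE_URL = "https://huggingface.co/spaces/gbrabbit/lily_fast_api"


def list_documents_request(user_id: str) -> urllib.request.Request:
    """Build GET /documents/{user_id} with the path parameter percent-encoded."""
    url = f"{BASE_URL}/documents/{quote(user_id, safe='')}"
    return urllib.request.Request(url, method="GET")


def delete_document_request(document_id: str) -> urllib.request.Request:
    """Build DELETE /document/{document_id} with the path parameter percent-encoded."""
    url = f"{BASE_URL}/document/{quote(str(document_id), safe='')}"
    return urllib.request.Request(url, method="DELETE")
```

Either request object can then be passed to `urllib.request.urlopen` to actually send it.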
#### Advanced features
- `POST /batch-process` - Batch document processing
- `GET /task-status/{task_id}` - Check task status
- `POST /cancel-task/{task_id}` - Cancel a task
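Background tasks are typically followed by polling the status endpoint until the task finishes. The sketch below shows one way to do that. It assumes the status payload is a JSON object with a `status` field holding Celery's built-in state names, which this README does not confirm; `wait_for_task` and the `fetch_status` callback are hypothetical names.

```python
import time
from typing import Callable, Dict

# Assumption: terminal states follow Celery's built-in task state names.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def wait_for_task(
    fetch_status: Callable[[str], Dict],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 60.0,
) -> Dict:
    """Poll until the task reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        # fetch_status would wrap GET /task-status/{task_id} and return its JSON.
        payload = fetch_status(task_id)
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        time.sleep(interval)
```

Passing the HTTP call in as a callback keeps the polling logic independent of the HTTP library and easy to exercise with a stub.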
### 🛠️ Tech Stack
- **Backend**: FastAPI, Python 3.11
- **AI Models**: Transformers, PyTorch
- **Vector DB**: FAISS, ChromaDB
- **Task Queue**: Celery, Redis
- **OCR**: LaTeX-OCR, EasyOCR
- **Document Processing**: LangChain
### 📊 Model Information
#### Kanana-1.5-v-3b-instruct
- **Size**: 3.6B parameters
- **Language**: Specialized for Korean
- **Capabilities**: Text generation, image understanding
- **Context**: Up to 4096 tokens
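The context limit interacts with the `MAX_NEW_TOKENS` setting. As a small worked example, and assuming (as is usual for decoder-only models, but not stated here) that the 4096-token window covers both the prompt and the generated tokens, the room left for the prompt can be computed as follows. The function name is hypothetical.

```python
def prompt_token_budget(context_window: int = 4096, max_new_tokens: int = 256) -> int:
    """Return how many prompt tokens fit once generation space is reserved."""
    if not 0 <= max_new_tokens <= context_window:
        raise ValueError("max_new_tokens must be between 0 and context_window")
    return context_window - max_new_tokens


# With the documented values: 4096 - 256 = 3840 tokens remain for the prompt.
print(prompt_token_budget())
```

Under this assumption, retrieved RAG context plus the user query would need to stay within 3840 tokens at the default setting.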
### 🔧 Configuration
The following settings can be adjusted through environment variables:
```bash
# Server settings
HOST=0.0.0.0
PORT=7860
# Model settings
DEFAULT_MODEL=kanana-1.5-v-3b-instruct
MAX_NEW_TOKENS=256
TEMPERATURE=0.7
# Cache settings
TRANSFORMERS_CACHE=/app/cache/transformers
HF_HOME=/app/cache/huggingface
```
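How the server reads these variables is not shown in this README. One plausible pattern, sketched here with a hypothetical `Settings` dataclass and `load_settings` helper, is to parse them once at startup and fall back to the values listed above when a variable is unset.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    default_model: str
    max_new_tokens: int
    temperature: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, using the documented values as defaults."""
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
        default_model=env.get("DEFAULT_MODEL", "kanana-1.5-v-3b-instruct"),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "256")),
        temperature=float(env.get("TEMPERATURE", "0.7")),
    )
```

Converting to `int` and `float` at load time means a malformed value such as `PORT=abc` fails immediately at startup rather than on the first request.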
### 📝 License
This project is distributed under the MIT License.
### 🤝 Contributing
Bug reports, feature suggestions, and pull requests are welcome!
### 📞 Support
If you have any questions, please reach out through GitHub Issues.