---
title: Open Finance LLM 8B
emoji: π
colorFrom: red
colorTo: red
sdk: docker
pinned: false
app_port: 7860
suggested_hardware: l4x1
---
# Open Finance LLM 8B

OpenAI-compatible API powered by `DragonLLM/Qwen-Open-Finance-R-8B`.

## Deployment
| Platform | Backend | Dockerfile | Use Case |
|---|---|---|---|
| Hugging Face Spaces | Transformers | `Dockerfile` | Development, L4 GPU |
| Koyeb | vLLM | `Dockerfile.koyeb` | Production, L40s GPU |
## Features
- OpenAI-compatible API
- Tool/function calling support
- Streaming responses
- Rate limiting (30 req/min, 500 req/hour)
- Statistics tracking via `/v1/stats`
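The rate limits above are enforced server-side, but a client can avoid hitting them proactively. A minimal client-side sliding-window sketch, hard-coded to the documented 30 req/min limit (the class and its API are illustrative, not part of this repository):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Client-side sketch of the documented 30 req/min limit."""

    def __init__(self, max_requests=30, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # send times of recent requests

    def allow(self, now=None):
        """Return True if a request may be sent now, recording it if so."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter()
# Simulate 31 requests one second apart: the first 30 fit in the
# 60-second window, the 31st is rejected.
results = [limiter.allow(now=float(i)) for i in range(31)]
```

The same pattern extends to the 500 req/hour limit by composing two limiters and sending only when both allow.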
## Quick Start
```bash
curl -X POST "https://your-endpoint/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DragonLLM/Qwen-Open-Finance-R-8B",
    "messages": [{"role": "user", "content": "What is compound interest?"}],
    "max_tokens": 500
  }'
```
```python
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="DragonLLM/Qwen-Open-Finance-R-8B",
    messages=[{"role": "user", "content": "What is compound interest?"}],
    max_tokens=500,
)
print(response.choices[0].message.content)
```
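Since the API supports tool/function calling, a request can also carry a `tools` list in the standard OpenAI schema. The sketch below only builds the request payload (no endpoint needed); the `get_stock_price` function and its schema are hypothetical examples, not part of this repository:

```python
# Hypothetical tool definition in the OpenAI function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",  # illustrative name, not a real tool here
            "description": "Look up the latest price for a ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "e.g. AAPL"}
                },
                "required": ["ticker"],
            },
        },
    }
]

# The same fields are passed as keyword arguments to
# client.chat.completions.create(...); with tool calling enabled
# server-side, the model may answer with `tool_calls` instead of text.
request_payload = {
    "model": "DragonLLM/Qwen-Open-Finance-R-8B",
    "messages": [{"role": "user", "content": "What is AAPL trading at?"}],
    "tools": tools,
    "tool_choice": "auto",
}
```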
## Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| `HF_TOKEN_LC2` | Yes | - | Hugging Face token |
| `MODEL` | No | `DragonLLM/Qwen-Open-Finance-R-8B` | Model name |
| `PORT` | No | `8000` (vLLM) / `7860` (Transformers) | Server port |
vLLM-specific (Koyeb):

- `ENABLE_AUTO_TOOL_CHOICE=true` - Enable tool calling
- `TOOL_CALL_PARSER=hermes` - Parser for Qwen models
- `MAX_MODEL_LEN=8192` - Max context length
- `GPU_MEMORY_UTILIZATION=0.90` - GPU memory fraction
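These variables appear to correspond to flags of vLLM's OpenAI-compatible server. A hedged sketch of an equivalent direct invocation of the stock `vllm/vllm-openai` image (the mapping to this repo's `Dockerfile.koyeb` entrypoint is an assumption):

```shell
# Sketch only: flag names are from vLLM's OpenAI-compatible server;
# how Dockerfile.koyeb wires the env vars to them is assumed, not verified.
docker run --gpus all -p 8000:8000 \
  -e HF_TOKEN="$HF_TOKEN_LC2" \
  vllm/vllm-openai:latest \
  --model DragonLLM/Qwen-Open-Finance-R-8B \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90
```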
## API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completion |
| `/v1/stats` | GET | Usage statistics |
| `/health` | GET | Health check |
## Technical Specs
- Model: `DragonLLM/Qwen-Open-Finance-R-8B` (8B parameters)
- vLLM Backend: `vllm-openai:latest` with hermes tool parser
- Transformers Backend: 4.45.0+ with PyTorch 2.5.0+ (CUDA 12.4)
- Minimum VRAM: 20GB (L4), recommended 48GB (L40s)
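As a rough sanity check on the VRAM figures: 8B parameters in 16-bit precision already account for about 16 GB of weights, before KV cache and activations, which is why 20 GB is the floor. A back-of-the-envelope estimate, not a measured figure:

```python
# Back-of-the-envelope weight memory for an 8B-parameter model in bf16/fp16.
params = 8e9          # 8 billion parameters
bytes_per_param = 2   # 16-bit precision
weight_gb = params * bytes_per_param / 1e9
# ~16 GB of weights alone; KV cache and activations push the
# practical minimum to the documented 20 GB.
```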
## Development

```bash
# Install dependencies and run the dev server with auto-reload
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080

# Run the test suite
pytest tests/ -v
```
## License
MIT License