---
title: Bielik App Service
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
# Bielik App Service

Multi-model LLM service for description enhancement, batch gap-filling, and A/B testing.

## Overview

This service provides an API for generating enhanced descriptions using multiple open-source LLMs. It supports:

- **Description Enhancement**: Generate marketing descriptions from structured data
- **Batch Infill**: Fill gaps (`[GAP:n]` or `___`) in ad texts with natural words
- **Multi-Model Comparison**: Compare outputs across different models for A/B testing

## Models
| Model | Size | Polish Support | Type |
|-------|------|----------------|------|
| Bielik-1.5B | 1.5B | Excellent | Local |
| Qwen2.5-3B | 3B | Good | Local |
| Gemma-2-2B | 2B | Medium | Local |
| PLLuM-12B | 12B | Excellent | API |
## API Endpoints

### Health & Info

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Welcome message |
| `GET` | `/health` | API health check and model status |
| `GET` | `/models` | List all available models |

### Model Management (Lazy Loading)

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/models/{name}/load` | Load a model into memory |
| `POST` | `/models/{name}/unload` | Unload a model from memory |

### Description Generation

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/enhance-description` | Generate a description with a single model |
| `POST` | `/compare` | Compare outputs from multiple models |

### Batch Infill (Gap-Filling)

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/infill` | Batch gap-filling with a single model |
| `POST` | `/compare-infill` | Compare gap-filling across multiple models |
---

## Lazy Loading

Models are **not loaded at startup** to conserve memory. Instead:

- Models are loaded **on first request** (lazy loading)
- Only **one local model** is loaded at a time
- Switching to a different local model **automatically unloads** the previous one
- API models (PLLuM) don't affect local model memory

### Example: Load/Unload Flow

```
1. Request with bielik-1.5b → Loads Bielik (first use)
2. Request with qwen2.5-3b → Unloads Bielik, loads Qwen
3. Request with pllum-12b → Qwen stays loaded (API model doesn't affect local)
4. POST /models/qwen2.5-3b/unload → Manually free memory
```
---

## Endpoint Details

### `GET /health`

Check API status and loaded models.

**Response:**

```json
{
  "status": "ok",
  "available_models": 4,
  "loaded_models": ["bielik-1.5b"],
  "active_local_model": "bielik-1.5b"
}
```
---

### `GET /models`

List all available models with their load status.

**Response:**

```json
[
  {
    "name": "bielik-1.5b",
    "model_id": "speakleash/Bielik-1.5B-v3.0-Instruct",
    "type": "local",
    "polish_support": "excellent",
    "size": "1.5B",
    "loaded": true,
    "active": true
  },
  {
    "name": "qwen2.5-3b",
    "model_id": "Qwen/Qwen2.5-3B-Instruct",
    "type": "local",
    "polish_support": "good",
    "size": "3B",
    "loaded": false,
    "active": false
  }
]
```
---

### `POST /models/{name}/load`

Explicitly load a model. For local models, the previously loaded local model is unloaded first.

**Response:**

```json
{
  "status": "loaded",
  "model": {
    "name": "bielik-1.5b",
    "loaded": true,
    "active": true
  }
}
```
---

### `POST /models/{name}/unload`

Explicitly unload a model to free memory.

**Response:**

```json
{
  "status": "unloaded",
  "model": "bielik-1.5b"
}
```
---

### `POST /enhance-description`

Generate an enhanced description using a single model.

**Request:**

```json
{
  "domain": "cars",
  "data": {
    "make": "BMW",
    "model": "320i",
    "year": 2020,
    "mileage": 45000,
    "features": ["nawigacja", "klimatyzacja"],
    "condition": "bardzo dobry"
  },
  "model": "bielik-1.5b"
}
```

**Response:**

```json
{
  "description": "Generated description text...",
  "model_used": "speakleash/Bielik-1.5B-v3.0-Instruct",
  "generation_time": 2.34,
  "user_email": "anonymous"
}
```
---

### `POST /compare`

Compare outputs from multiple models for the same input.

**Request:**

```json
{
  "domain": "cars",
  "data": {
    "make": "BMW",
    "model": "320i",
    "year": 2020,
    "mileage": 45000,
    "features": ["nawigacja", "klimatyzacja"],
    "condition": "bardzo dobry"
  },
  "models": ["bielik-1.5b", "qwen2.5-3b", "gemma-2-2b", "pllum-12b"]
}
```

**Response:**

```json
{
  "domain": "cars",
  "results": [
    {
      "model": "bielik-1.5b",
      "output": "Generated text from Bielik...",
      "time": 2.3,
      "type": "local",
      "error": null
    },
    {
      "model": "pllum-12b",
      "output": "Generated text from PLLuM...",
      "time": 1.1,
      "type": "inference_api",
      "error": null
    }
  ],
  "total_time": 5.67
}
```
---

### `POST /infill`

Batch gap-filling for ads using a single model. Accepts texts with `[GAP:n]` markers or `___` and returns filled text with per-gap choices and alternatives.

**Gap Notation:**

- `[GAP:1]`, `[GAP:2]`, ... → Explicit numbered gaps (preferred)
- `___` → Auto-numbered in scan order
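A sketch of how the two notations can be normalized on the client side: explicit `[GAP:n]` markers are kept, and `___` blanks are auto-numbered in scan order. Continuing the numbering after any existing markers is an assumption here, not documented service behavior:

```python
import re

def normalize_gaps(text: str) -> str:
    """Rewrite ___ blanks as numbered [GAP:n] markers, in scan order."""
    # Start counting after the highest explicit marker already present.
    counter = max((int(n) for n in re.findall(r"\[GAP:(\d+)\]", text)), default=0)
    def number_blank(_match):
        nonlocal counter
        counter += 1
        return f"[GAP:{counter}]"
    return re.sub(r"_{3,}", number_blank, text)

print(normalize_gaps("Auto ma ___ km przebiegu i ___ lakier"))
# Auto ma [GAP:1] km przebiegu i [GAP:2] lakier
```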
**Request:**

```json
{
  "domain": "cars",
  "items": [
    {
      "id": "ad1",
      "text_with_gaps": "Sprzedam [GAP:1] BMW w [GAP:2] stanie technicznym"
    },
    {
      "id": "ad2",
      "text_with_gaps": "Auto ma ___ km przebiegu i ___ lakier"
    }
  ],
  "model": "bielik-1.5b",
  "options": {
    "top_n_per_gap": 3,
    "language": "pl",
    "temperature": 0.6
  }
}
```

**Response:**

```json
{
  "model": "bielik-1.5b",
  "results": [
    {
      "id": "ad1",
      "status": "ok",
      "filled_text": "Sprzedam eleganckie BMW w doskonałym stanie technicznym",
      "gaps": [
        {
          "index": 1,
          "marker": "[GAP:1]",
          "choice": "eleganckie",
          "alternatives": ["piękne", "zadbane"]
        },
        {
          "index": 2,
          "marker": "[GAP:2]",
          "choice": "doskonałym",
          "alternatives": ["bardzo dobrym", "idealnym"]
        }
      ],
      "error": null
    }
  ],
  "total_time": 3.45,
  "processed_count": 2,
  "error_count": 0
}
```
**Options:**

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `gap_notation` | string | `"auto"` | `"auto"`, `"[GAP:n]"`, or `"___"` |
| `top_n_per_gap` | int | `3` | Alternatives per gap (1-5) |
| `language` | string | `"pl"` | Output language |
| `temperature` | float | `0.6` | Generation temperature (0-1) |
| `max_new_tokens` | int | `256` | Max tokens to generate |
---

### `POST /compare-infill`

Multi-model batch gap-filling comparison for A/B testing.

**Request:**

```json
{
  "domain": "cars",
  "items": [
    {
      "id": "ad1",
      "text_with_gaps": "Sprzedam [GAP:1] BMW w [GAP:2] stanie"
    }
  ],
  "models": ["bielik-1.5b", "qwen2.5-3b", "pllum-12b"],
  "options": {
    "top_n_per_gap": 3
  }
}
```

**Response:**

```json
{
  "domain": "cars",
  "models": [
    {
      "model": "bielik-1.5b",
      "type": "local",
      "results": [...],
      "time": 2.1,
      "error_count": 0
    },
    {
      "model": "qwen2.5-3b",
      "type": "local",
      "results": [...],
      "time": 1.8,
      "error_count": 0
    }
  ],
  "total_time": 5.2
}
```
---

## Domains

Currently supported domains:

| Domain | Schema Fields |
|--------|---------------|
| `cars` | `make`, `model`, `year`, `mileage`, `features[]`, `condition` |

---

## Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `HF_TOKEN` | HuggingFace API token for Inference API | Yes (for API models) |
| `LOCAL_MODEL_PATH` | Path to pre-downloaded local model | No (default: `/app/pretrain_model`) |
| `FRONTEND_URL` | Frontend URL for CORS | No |
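For example, a local shell session might be configured like this (all values are placeholders, not real credentials):

```shell
# Placeholder values; substitute your own before starting the service.
export HF_TOKEN="hf_xxxxxxxx"                  # required only for API models (PLLuM)
export LOCAL_MODEL_PATH="/app/pretrain_model"  # default path, shown for clarity
export FRONTEND_URL="http://localhost:3000"    # optional; enables CORS for the frontend
```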
## Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Run server
uvicorn app.main:app --reload --port 8000
```

## Docker

```bash
# Build and run (PowerShell script)
./start_container.ps1
```

The API is then available at `http://localhost:8000`, with interactive docs at `http://localhost:8000/docs`.
## Live Demo

Deployed on HuggingFace Spaces:

**URL:** `https://studzinsky-bielik-app-service.hf.space`

**Quick Test:**

```bash
# Health check
curl https://studzinsky-bielik-app-service.hf.space/health

# List models
curl https://studzinsky-bielik-app-service.hf.space/models
```