---
title: Bielik App Service
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# Bielik App Service

Multi-model LLM service for description enhancement, batch gap-filling, and A/B testing.

## Overview

This service provides an API for generating enhanced descriptions using multiple open-source LLMs. It supports:

- **Description Enhancement**: Generate marketing descriptions from structured data
- **Batch Infill**: Fill gaps (`[GAP:n]` or `___`) in ad texts with natural words
- **Multi-Model Comparison**: Compare outputs across different models for A/B testing

## Models

| Model | Size | Polish Support | Type |
|-------|------|----------------|------|
| Bielik-1.5B | 1.5B | Excellent | Local |
| Qwen2.5-3B | 3B | Good | Local |
| Gemma-2-2B | 2B | Medium | Local |
| PLLuM-12B | 12B | Excellent | API |

## API Endpoints

### Health & Info

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Welcome message |
| `GET` | `/health` | API health check and model status |
| `GET` | `/models` | List all available models |

### Model Management (Lazy Loading)

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/models/{name}/load` | Load a model into memory |
| `POST` | `/models/{name}/unload` | Unload a model from memory |

### Description Generation

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/enhance-description` | Generate a description with a single model |
| `POST` | `/compare` | Compare outputs from multiple models |

### Batch Infill (Gap-Filling)

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/infill` | Batch gap-filling with a single model |
| `POST` | `/compare-infill` | Compare gap-filling across multiple models |

---

## Lazy Loading

Models are **not loaded at startup** to conserve memory.
Instead:

- Models are loaded **on first request** (lazy loading)
- Only **one local model** is loaded at a time
- Switching to a different local model **automatically unloads** the previous one
- API models (PLLuM) don't affect local model memory

### Example: Load/Unload Flow

```
1. Request with bielik-1.5b → Loads Bielik (first use)
2. Request with qwen2.5-3b → Unloads Bielik, loads Qwen
3. Request with pllum-12b → Qwen stays loaded (API model doesn't affect local)
4. POST /models/qwen2.5-3b/unload → Manually free memory
```

---

## Endpoint Details

### `GET /health`

Check API status and loaded models.

**Response:**

```json
{
  "status": "ok",
  "available_models": 4,
  "loaded_models": ["bielik-1.5b"],
  "active_local_model": "bielik-1.5b"
}
```

---

### `GET /models`

List all available models with their load status.

**Response:**

```json
[
  {
    "name": "bielik-1.5b",
    "model_id": "speakleash/Bielik-1.5B-v3.0-Instruct",
    "type": "local",
    "polish_support": "excellent",
    "size": "1.5B",
    "loaded": true,
    "active": true
  },
  {
    "name": "qwen2.5-3b",
    "model_id": "Qwen/Qwen2.5-3B-Instruct",
    "type": "local",
    "polish_support": "good",
    "size": "3B",
    "loaded": false,
    "active": false
  }
]
```

---

### `POST /models/{name}/load`

Explicitly load a model. For local models, the previously loaded model is unloaded first.

**Response:**

```json
{
  "status": "loaded",
  "model": {
    "name": "bielik-1.5b",
    "loaded": true,
    "active": true
  }
}
```

---

### `POST /models/{name}/unload`

Explicitly unload a model to free memory.

**Response:**

```json
{
  "status": "unloaded",
  "model": "bielik-1.5b"
}
```

---

### `POST /enhance-description`

Generate an enhanced description using a single model.


**Request:**

```json
{
  "domain": "cars",
  "data": {
    "make": "BMW",
    "model": "320i",
    "year": 2020,
    "mileage": 45000,
    "features": ["nawigacja", "klimatyzacja"],
    "condition": "bardzo dobry"
  },
  "model": "bielik-1.5b"
}
```

**Response:**

```json
{
  "description": "Generated description text...",
  "model_used": "speakleash/Bielik-1.5B-v3.0-Instruct",
  "generation_time": 2.34,
  "user_email": "anonymous"
}
```

---

### `POST /compare`

Compare outputs from multiple models for the same input.

**Request:**

```json
{
  "domain": "cars",
  "data": {
    "make": "BMW",
    "model": "320i",
    "year": 2020,
    "mileage": 45000,
    "features": ["nawigacja", "klimatyzacja"],
    "condition": "bardzo dobry"
  },
  "models": ["bielik-1.5b", "qwen2.5-3b", "gemma-2-2b", "pllum-12b"]
}
```

**Response:**

```json
{
  "domain": "cars",
  "results": [
    {
      "model": "bielik-1.5b",
      "output": "Generated text from Bielik...",
      "time": 2.3,
      "type": "local",
      "error": null
    },
    {
      "model": "pllum-12b",
      "output": "Generated text from PLLuM...",
      "time": 1.1,
      "type": "inference_api",
      "error": null
    }
  ],
  "total_time": 5.67
}
```

---

### `POST /infill`

Batch gap-filling for ads using a single model. Accepts texts with `[GAP:n]` markers or `___` and returns filled text with per-gap choices and alternatives.

**Gap Notation:**

- `[GAP:1]`, `[GAP:2]`, ...
→ Explicit numbered gaps (preferred)
- `___` → Auto-numbered in scan order

**Request:**

```json
{
  "domain": "cars",
  "items": [
    {
      "id": "ad1",
      "text_with_gaps": "Sprzedam [GAP:1] BMW w [GAP:2] stanie technicznym"
    },
    {
      "id": "ad2",
      "text_with_gaps": "Auto ma ___ km przebiegu i ___ lakier"
    }
  ],
  "model": "bielik-1.5b",
  "options": {
    "top_n_per_gap": 3,
    "language": "pl",
    "temperature": 0.6
  }
}
```

**Response:**

```json
{
  "model": "bielik-1.5b",
  "results": [
    {
      "id": "ad1",
      "status": "ok",
      "filled_text": "Sprzedam eleganckie BMW w doskonałym stanie technicznym",
      "gaps": [
        {
          "index": 1,
          "marker": "[GAP:1]",
          "choice": "eleganckie",
          "alternatives": ["piękne", "zadbane"]
        },
        {
          "index": 2,
          "marker": "[GAP:2]",
          "choice": "doskonałym",
          "alternatives": ["bardzo dobrym", "idealnym"]
        }
      ],
      "error": null
    }
  ],
  "total_time": 3.45,
  "processed_count": 2,
  "error_count": 0
}
```

**Options:**

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `gap_notation` | string | `"auto"` | `"auto"`, `"[GAP:n]"`, or `"___"` |
| `top_n_per_gap` | int | `3` | Alternatives per gap (1-5) |
| `language` | string | `"pl"` | Output language |
| `temperature` | float | `0.6` | Generation temperature (0-1) |
| `max_new_tokens` | int | `256` | Max tokens to generate |

---

### `POST /compare-infill`

Multi-model batch gap-filling comparison for A/B testing.


**Request:**

```json
{
  "domain": "cars",
  "items": [
    {
      "id": "ad1",
      "text_with_gaps": "Sprzedam [GAP:1] BMW w [GAP:2] stanie"
    }
  ],
  "models": ["bielik-1.5b", "qwen2.5-3b", "pllum-12b"],
  "options": {
    "top_n_per_gap": 3
  }
}
```

**Response:**

```json
{
  "domain": "cars",
  "models": [
    {
      "model": "bielik-1.5b",
      "type": "local",
      "results": [...],
      "time": 2.1,
      "error_count": 0
    },
    {
      "model": "qwen2.5-3b",
      "type": "local",
      "results": [...],
      "time": 1.8,
      "error_count": 0
    }
  ],
  "total_time": 5.2
}
```

---

## Domains

Currently supported domains:

| Domain | Schema Fields |
|--------|---------------|
| `cars` | `make`, `model`, `year`, `mileage`, `features[]`, `condition` |

---

## Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `HF_TOKEN` | HuggingFace API token for Inference API | Yes (for API models) |
| `LOCAL_MODEL_PATH` | Path to pre-downloaded local model | No (default: `/app/pretrain_model`) |
| `FRONTEND_URL` | Frontend URL for CORS | No |

## Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Run server
uvicorn app.main:app --reload --port 8000
```

## Docker

```bash
# Build and run
./start_container.ps1
```

API available at `http://localhost:8000`

Docs at `http://localhost:8000/docs`

## Live Demo

Deployed on HuggingFace Spaces:

**URL:** `https://studzinsky-bielik-app-service.hf.space`

**Quick Test:**

```bash
# Health check
curl https://studzinsky-bielik-app-service.hf.space/health

# List models
curl https://studzinsky-bielik-app-service.hf.space/models
```
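
The `curl` checks above only cover the `GET` endpoints. As a rough sketch of how a `POST /infill` call could be prepared from Python, the snippet below builds the request body in the shape documented for that endpoint. The helper names `build_infill_request` and `send` are hypothetical and not part of this service, and the sketch assumes that `options` may be omitted so that the documented defaults apply.

```python
import json
import urllib.request

# Base URL taken from the Live Demo section.
BASE_URL = "https://studzinsky-bielik-app-service.hf.space"


def build_infill_request(items, model="bielik-1.5b", options=None,
                         domain="cars", base_url=BASE_URL):
    """Build (but do not send) a POST /infill request.

    `items` is a list of {"id": ..., "text_with_gaps": ...} dicts,
    following the request shape documented for POST /infill.
    Hypothetical helper: the service does not ship a Python client.
    """
    payload = {"domain": domain, "items": items, "model": model}
    if options:
        # Assumption: options are optional and fall back to server defaults.
        payload["options"] = options
    return urllib.request.Request(
        url=f"{base_url}/infill",
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_infill_request([{"id": "ad1", "text_with_gaps": "Sprzedam [GAP:1] BMW"}]))` should then return a dictionary shaped like the `/infill` response documented earlier. Because models are loaded on first request, the first call may take noticeably longer than later ones.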