---
title: AI Car Description Enhancer
emoji: 🚗✨
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
---
# AI Car Description Enhancer "Bielik"
Turbocharge your automotive listings! This app, powered by the Bielik Polish language model, transforms dry vehicle data into compelling, ready-to-publish marketing descriptions.
## Contents

- Features
- Prerequisites
- Project Structure
- Installation (Local Development)
- Usage (Local Development)
- Docker Usage
- Quick Start with PowerShell (`start_container.ps1`)
- API Endpoints
- Core Service (`app/models/huggingface_service.py`)
- Configuration
- Schemas (`app/schemas/schemas.py`)
- Contributing
- License
## LLM Car Description Enhancer (Polish)

This repository contains a FastAPI application that uses a Hugging Face Transformers Large Language Model (specifically `speakleash/Bielik-1.5B-v3.0-Instruct` or a similar model from the Bielik series) to generate enhanced marketing descriptions for cars, primarily in Polish.
The application is designed to be run locally for development or containerized using Docker for deployment. The LLM is baked into the Docker image for self-contained and efficient execution, which may require Hugging Face Hub authentication during the build process if the model is gated.
## Features

- Generates enhanced marketing descriptions for cars in Polish.
- Uses the `speakleash/Bielik-1.5B-v3.0-Instruct` model via the Hugging Face `transformers` library.
- Health check endpoint.
- Docker support for easy deployment, with the model included in the image.
- Includes a `start_container.sh` script for convenient container startup.
## Prerequisites

- Python 3.9 or higher
- `pip` (Python package installer)
- Docker (for containerized deployment; Docker BuildKit is recommended for build secrets)
- Git (for cloning the repository)
- A Hugging Face Hub account and an access token (with `read` permissions) if the chosen model is gated (see the Docker Usage section)
- For `start_container.sh`: a bash-compatible shell (Linux, macOS, or Git Bash on Windows)
## Project Structure

A typical layout for this project:

```
.
├── app/
│   ├── __init__.py
│   ├── main.py                    # FastAPI application, endpoints
│   ├── models/
│   │   ├── __init__.py
│   │   └── huggingface_service.py # Service for interacting with the LLM
│   └── schemas/
│       ├── __init__.py
│       └── schemas.py             # Pydantic schemas for request/response
├── .gitignore
├── Dockerfile
├── download_model.py              # Script to download the model during Docker build
├── my_hf_token.txt                # (Create locally) Stores your HF token
├── requirements.txt
├── start_container.sh             # Helper script to run the Docker container
└── README.md
```
## Installation (Local Development)

1. Clone the repository:

   ```bash
   git clone https://github.com/studzin-sky/llm-description-enhancer.git
   cd llm-description-enhancer
   ```

2. Create and activate a virtual environment (recommended to keep dependencies isolated):

   ```bash
   python -m venv venv
   ```

   - On macOS/Linux: `source venv/bin/activate`
   - On Windows (PowerShell): `.\venv\Scripts\Activate.ps1`
   - On Windows (Command Prompt): `venv\Scripts\activate.bat`

3. Install the required dependencies. Ensure your `requirements.txt` includes `fastapi`, `uvicorn[standard]`, `transformers[torch]`, `torch`, `accelerate`, and `huggingface_hub`:

   ```bash
   pip install -r requirements.txt
   ```

   Note: The first time you run the application locally (or if the model cache is empty), the Hugging Face model (~3.2 GB) will be downloaded, which may take some time. If the configured model (`speakleash/Bielik-1.5B-v3.0-Instruct`) is gated or requires authentication, log in with `huggingface-cli login` before running the application locally; afterwards your token is cached by the `huggingface_hub` library.
## Usage (Local Development)

Start the FastAPI server from the project root directory:

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

`--reload` enables auto-reloading for development; `--host 0.0.0.0` makes the server accessible on your network.

Access the application:

- Health check: http://127.0.0.1:8000/health
- API documentation (Swagger UI): http://127.0.0.1:8000/docs
- Enhance description: `POST` requests to http://127.0.0.1:8000/enhance-description
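As a quick smoke test of the running server, a minimal Python client might look like the sketch below. It uses only the standard library; the payload fields follow the `CarData` schema documented later in this README, and the field values are made-up examples.

```python
import json
import urllib.request

# Example payload matching the CarData request schema (values are illustrative).
payload = {
    "make": "Skoda",
    "model": "Octavia",
    "year": 2020,
    "mileage": 42000,
    "features": ["Tempomat", "Podgrzewane fotele"],
    "condition": "Dobry",
}

def build_request(url: str, data: dict) -> urllib.request.Request:
    """Build a POST request with a JSON body for the enhancer endpoint."""
    body = json.dumps(data).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("http://127.0.0.1:8000/enhance-description", payload)

# With the server running, send the request and print the generated description:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["description"])
```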
## Docker Usage

The included `Dockerfile` builds an image with the application and the pre-downloaded Hugging Face model, making it self-contained. Downloading gated models during the build requires a Hugging Face Hub token.

**Prepare a Hugging Face Hub token (for gated models).** The `speakleash/Bielik-1.5B-v3.0-Instruct` model may require authentication to download.

- Get a token:
  - Go to your Hugging Face account settings: https://huggingface.co/settings/tokens
  - Create a new token (e.g., named "docker-bielik-access") with `read` permissions.
  - Copy the generated token (it will start with `hf_`).
- Create a token file:
  - In your project's root directory (next to your `Dockerfile`), create a file named `my_hf_token.txt`.
  - Paste only the token string (e.g., `hf_YourActualTokenValueHere`) into this file. Do not add any other text or variable names.

**Build the Docker image.** From the project root directory, run:

```bash
DOCKER_BUILDKIT=1 docker build --secret id=huggingface_token,src=my_hf_token.txt -t llm-description-enhancer .
```

- `DOCKER_BUILDKIT=1` enables BuildKit, which is required for `--secret`.
- `--secret id=huggingface_token,src=my_hf_token.txt` securely provides the content of `my_hf_token.txt` to the build process; `id=huggingface_token` must match the ID used in the `RUN --mount` directive in your `Dockerfile`.
- This step will take a while, especially the first time, as it downloads the LLM using your token.
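For reference, a sketch of the kind of `Dockerfile` directive that consumes this secret. The exact lines in this project's `Dockerfile` may differ; `download_model.py` and the secret ID are taken from this README, while the environment variable name is an assumption:

```dockerfile
# Sketch: mount the BuildKit secret only for this RUN step. The token is
# exposed as a file under /run/secrets/ and is never stored in an image layer.
RUN --mount=type=secret,id=huggingface_token \
    HF_TOKEN=$(cat /run/secrets/huggingface_token) \
    python download_model.py
```

Because the secret is mounted rather than passed as a build argument, the token does not appear in `docker history` output.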
**Run the Docker container using the helper script (`start_container.sh`).** This script, included in the repository, simplifies starting the container: it typically stops and removes any pre-existing container with the same configured name and then starts a new one.

Ensure the script is executable. After cloning the repository, or if the execute permission isn't set, run (on Linux, macOS, or Git Bash on Windows):

```bash
chmod +x start_container.sh
```

Run the script from the project root directory:

```bash
./start_container.sh
```

Expected outcome (depends on your script's content). The script will likely:

- Output messages indicating it is managing the container.
- Start the container (possibly in detached mode).
- Inform you that the service is available at http://127.0.0.1:8000.
- Provide commands to view logs or stop the container if it runs in detached mode (e.g., `docker logs <container_name> -f` and `docker stop <container_name>`).

Alternatively, you can run the container manually:

```bash
docker run --rm -p 8000:8000 llm-description-enhancer
```

**Test the containerized application.** Once the container is running (via the script or manually), send requests to http://127.0.0.1:8000 as described in the API Endpoints section.
## Quick Start with PowerShell (`start_container.ps1`)

For Windows users, the provided PowerShell script automates the Docker build and run process. The script will:

- Build the Docker image using your Hugging Face token (from `my_hf_token.txt`).
- Stop and remove any existing container named `bielik_app_instance`.
- Start a new container and map port 8000.

Steps:

1. Ensure your Hugging Face token is saved in `my_hf_token.txt` in the project root (see above for details).
2. Open PowerShell in the project directory.
3. (Optional, but recommended) Temporarily allow running unsigned scripts for this session:

   ```powershell
   Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process
   ```

4. Run the script:

   ```powershell
   .\start_container.ps1
   ```

The script will build the image and start the container. Your FastAPI service will be available at http://127.0.0.1:8000.

View logs with:

```powershell
docker logs bielik_app_instance -f
```

Stop the container with:

```powershell
docker stop bielik_app_instance
```

If you encounter a security error about script signing, see the Microsoft documentation on execution policies.
## API Endpoints

### Health Check

- Endpoint: `/health`
- Method: `GET`
- Description: Returns the status of the application and model initialization.
- Example response:

  ```json
  {
    "status": "ok",
    "model_initialized": true,
    "model_path": "/app/pretrain_model"
  }
  ```

### Enhance Description

- Endpoint: `/enhance-description`
- Method: `POST`
- Description: Generates an enhanced marketing description for a car in Polish.
- Request body (`application/json`):

  ```json
  {
    "make": "Volkswagen",
    "model": "Golf",
    "year": 2022,
    "mileage": 15000,
    "features": ["Klimatyzacja automatyczna", "System nawigacji", "Czujniki parkowania"],
    "condition": "Bardzo dobry"
  }
  ```

- Response (`application/json`):

  ```json
  {
    "description": "Wygenerowany przez AI opis samochodu..."
  }
  ```

- Example cURL request (for Git Bash / bash-like shells):

  ```bash
  curl -X POST "http://127.0.0.1:8000/enhance-description" \
    -H "Content-Type: application/json" \
    -d '{
      "make": "Toyota",
      "model": "Corolla",
      "year": 2021,
      "mileage": 25000,
      "features": ["Kamera cofania", "Apple CarPlay", "Android Auto", "System bezkluczykowy"],
      "condition": "Bardzo dobry"
    }'
  ```
## Core Service (`app/models/huggingface_service.py`)

The `HuggingFaceTextGenerationService` class handles interaction with the Large Language Model.

Key methods:

- `async initialize()`: Loads the pre-trained model and tokenizer from the path specified during service instantiation (e.g., `/app/pretrain_model` in Docker, or from the Hugging Face cache locally).
- `async generate_text(chat_template_messages: list, max_new_tokens: int, ...)`: Generates text from a structured chat prompt, applying the model's chat template and parsing the output to return only the assistant's response.
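The `chat_template_messages` argument follows the standard role/content message format used by `transformers` chat templates. A minimal sketch of how such a message list might be assembled from the request data; the prompt wording here is illustrative only, not the project's actual prompts (which live in `app/main.py`):

```python
def build_messages(car: dict) -> list[dict]:
    """Assemble a system + user message pair for generate_text().

    The prompt text below is illustrative; the real prompts are in app/main.py.
    """
    system_prompt = (
        "Jesteś copywriterem motoryzacyjnym. Napisz zwięzły, atrakcyjny "
        "opis marketingowy samochodu po polsku."
    )
    user_prompt = (
        f"Marka: {car['make']}, model: {car['model']}, rok: {car['year']}, "
        f"przebieg: {car['mileage']} km, stan: {car['condition']}, "
        f"wyposażenie: {', '.join(car['features'])}."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
```

A list in this shape can be passed directly to a tokenizer's `apply_chat_template`, which is what lets the service format prompts correctly for the Bielik instruct model.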
## Configuration

- Model used: `speakleash/Bielik-1.5B-v3.0-Instruct`, baked into `/app/pretrain_model` in the Docker image. For local development, it is downloaded to the Hugging Face cache.
- Language: The primary focus is generating descriptions in Polish.
- Prompt engineering: The system and user prompts in `app/main.py` are crafted to guide the model toward concise, relevant marketing descriptions.
## Schemas (`app/schemas/schemas.py`)

Pydantic models are used for request and response validation.

### CarData

- Fields:
  - `make`: `str`
  - `model`: `str`
  - `year`: `int`
  - `mileage`: `int`
  - `features`: `list[str]`
  - `condition`: `str`

### EnhancedDescriptionResponse

- Fields:
  - `description`: `str`
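A minimal sketch of how these models might be declared, using only the field names and types listed above (the actual file may add validators, defaults, or field descriptions):

```python
from pydantic import BaseModel

class CarData(BaseModel):
    """Request body for POST /enhance-description."""
    make: str
    model: str
    year: int
    mileage: int
    features: list[str]
    condition: str

class EnhancedDescriptionResponse(BaseModel):
    """Response body containing the generated description."""
    description: str
```

FastAPI uses these models to validate incoming JSON automatically and to render the request/response schemas in the Swagger UI at `/docs`.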
## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any changes.

## License

This project is licensed under the MIT License.