Spaces: Running
Patryk Studzinski committed · Commit 9a9ec03
1 Parent(s): b525236
adding-github-files-to-spaces
Browse files
- .gitignore +52 -0
- Dockerfile +34 -0
- PROJECT_CONTEXT.md +107 -0
- README.md +292 -9
- answer.md +213 -0
- app/main.py +81 -0
- app/models/huggingface_service.py +111 -0
- app/schemas/schemas.py +12 -0
- download_model.py +69 -0
- requirements.txt +4 -0
- start_container.ps1 +23 -0
- start_container.sh +25 -0
.gitignore
ADDED
@@ -0,0 +1,52 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*.pyo
+*.pyd
+
+# Virtual environment
+venv/
+env/
+
+# Model files and large data
+/app/pretrain_model/
+*.bin
+*.safetensors
+*.gguf
+
+# Secrets
+my_hf_token.txt
+/run/secrets/
+
+# Logs and debug files
+*.log
+*.out
+*.err
+
+# IDE and editor settings
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# Docker
+*.env
+*.dockerignore
+docker-compose.override.yml
+
+# Python package files
+*.egg
+*.egg-info/
+dist/
+build/
+*.wheel
+
+# Cache files
+*.cache
+*.mypy_cache/
+*.pytest_cache/
+*.ipynb_checkpoints/
+
+# System files
+.DS_Store
+Thumbs.db
Dockerfile
ADDED
@@ -0,0 +1,34 @@
+
+FROM python:3.9-slim
+
+# Set the working directory
+WORKDIR /app
+
+# Define where the model will be stored in the image
+ENV MODEL_DIR=/app/pretrain_model
+ENV HF_HUB_DISABLE_SYMLINKS_WARNING=1
+
+# Copy the requirements file
+COPY requirements.txt .
+
+# Install dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy the download script into the image
+COPY download_model.py /app/download_model.py
+
+# Download the model using the script and the secret
+RUN --mount=type=secret,id=huggingface_token \
+    echo "--- Docker RUN: Starting model download script /app/download_model.py..." && \
+    python /app/download_model.py && \
+    echo "--- Docker RUN: Model download script finished." && \
+    rm /app/download_model.py  # Optional: clean up the script after use
+
+# Copy the rest of your application code AFTER model download
+COPY . .
+
+# Expose the port the app runs on
+EXPOSE 8000
+
+# Run the application
+CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
PROJECT_CONTEXT.md
ADDED
@@ -0,0 +1,107 @@
+# GPT4All Service - Project Context
+
+## Project Overview
+This is a **Polish Car Description Enhancement Service** built as a FastAPI microservice that uses a Hugging Face Large Language Model to generate enhanced marketing descriptions for cars in Polish.
+
+## Core Functionality
+The service takes basic car information (make, model, year, mileage, features, condition) and generates compelling, marketing-friendly descriptions in Polish using `speakleash/Bielik-1.5B-v3.0-Instruct`, a Polish language model from the Bielik series.
+
+## Project Structure
+
+```
+gpt4all-service/
+├── app/
+│   ├── main.py                     # FastAPI application with endpoints
+│   ├── models/
+│   │   └── huggingface_service.py  # Core LLM service wrapper
+│   └── schemas/
+│       └── schemas.py              # Pydantic data models
+├── Dockerfile                      # Docker build with the model baked in
+├── download_model.py               # Model download script for Docker
+├── requirements.txt                # Python dependencies
+├── start_container.ps1             # PowerShell startup script
+├── start_container.sh              # Bash startup script
+└── README.md                       # Comprehensive documentation
+```
+
+## Technical Architecture
+
+### 1. FastAPI Application (`app/main.py`)
+- **Framework**: FastAPI with CORS middleware
+- **Main Endpoint**: `POST /enhance-description` takes car data and returns an enhanced description
+- **Health Check**: `GET /health` reports service status and model initialization
+- **CORS**: Configured for a frontend on `http://localhost:5173` (likely a React/Vue dev server)
+
+### 2. LLM Service (`app/models/huggingface_service.py`)
+- **Purpose**: Wrapper around the Hugging Face Transformers pipeline
+- **Model**: `speakleash/Bielik-1.5B-v3.0-Instruct` (Polish language model)
+- **Features**:
+  - Async initialization and text generation
+  - Support for both GPU (CUDA) and CPU inference
+  - Chat template support for conversation-style prompts
+  - Configurable generation parameters (temperature, top_p, max_tokens)
+  - Smart response parsing to extract only the assistant's response
+
+### 3. Data Models (`app/schemas/schemas.py`)
+- **CarData**: Input model with make, model, year, mileage, features[], condition
+- **EnhancedDescriptionResponse**: Output model with the generated description
+
+### 4. Containerization
+- **Docker**: Self-contained image with the pre-downloaded model (~3.2 GB)
+- **Security**: Uses Docker BuildKit secrets for Hugging Face token handling
+- **Model Storage**: Downloaded to `/app/pretrain_model` during the build
+- **Runtime**: Python 3.9-slim base image
+
+## Key Technical Details
+
+### Model Configuration
+- **Model Path**: `/app/pretrain_model` (in the container) or configurable for local dev
+- **Device**: Currently set to CPU in `main.py`, but the service supports GPU
+- **Generation Params**: 150 max new tokens, temperature 0.75, top_p 0.9
+
+### Prompt Engineering
+The service uses a carefully crafted Polish system prompt that:
+- Instructs the model to create marketing descriptions in Polish
+- Limits output to 500 characters maximum
+- Tells the model to ignore off-topic content
+- Uses the chat template format with system/user roles
+
+### Dependencies
+- **fastapi**: Web framework
+- **uvicorn[standard]**: ASGI server
+- **transformers[torch]**: Hugging Face Transformers with PyTorch
+- **accelerate**: Hugging Face optimization library
+
+## Current State & Issues
+
+### Git Status
+- Modified `app/main.py` (likely recent changes)
+- Deleted `app/models/gpt4all.py` (indicates a migration from GPT4All to Hugging Face)
+
+### Linter Issues in `huggingface_service.py`
+1. Import issues: `pipeline` and `AutoTokenizer` imports need specific paths
+2. Type annotations: `device: str = None` should be `Optional[str] = None`
+3. Method parameters: similar optional-parameter typing issues
+
## Usage Scenarios
|
| 87 |
+
1. **Car Dealership Websites**: Auto-generate compelling descriptions from basic car specs
|
| 88 |
+
2. **Marketplace Applications**: Enhance user-provided car listings
|
| 89 |
+
3. **Inventory Management**: Bulk description generation for car databases
|
| 90 |
+
|
| 91 |
+
## Deployment Options
|
| 92 |
+
1. **Local Development**: Direct Python/uvicorn execution
|
| 93 |
+
2. **Docker Container**: Self-contained deployment with pre-downloaded model
|
| 94 |
+
3. **Production**: Containerized deployment with proper authentication
|
| 95 |
+
|
| 96 |
+
## Authentication Requirements
|
| 97 |
+
- Hugging Face Hub token required for model download (gated model)
|
| 98 |
+
- Token stored in `my_hf_token.txt` during Docker build
|
| 99 |
+
- Securely handled via Docker BuildKit secrets
|
| 100 |
+
|
| 101 |
+
## Performance Considerations
|
| 102 |
+
- Model size: ~3.2GB (significant memory footprint)
|
| 103 |
+
- CPU inference: Slower but more accessible
|
| 104 |
+
- GPU inference: Faster but requires CUDA setup
|
| 105 |
+
- Async design: Non-blocking text generation
|
| 106 |
+
|
| 107 |
+
This service represents a specialized AI application for the Polish automotive market, focusing on generating marketing content using state-of-the-art Polish language models.
|
README.md
CHANGED
@@ -1,12 +1,295 @@
+## Contents
+
+1. [Features](#features)
+2. [Prerequisites](#prerequisites)
+3. [Project Structure](#project-structure)
+4. [Installation (Local Development)](#installation-local-development)
+5. [Usage (Local Development)](#usage-local-development)
+6. [Docker Usage](#docker-usage)
+7. [Quick Start with PowerShell (`start_container.ps1`)](#quick-start-with-powershell-start_containerps1)
+8. [API Endpoints](#api-endpoints)
+   - [Health Check](#health-check)
+   - [Enhance Description](#enhance-description)
+9. [Core Service (`app/models/huggingface_service.py`)](#core-service-appmodelshuggingface_servicepy)
+10. [Configuration](#configuration)
+11. [Schemas (`app/schemas/schemas.py`)](#schemas-appschemasschemaspy)
+    - [CarData](#cardata)
+    - [EnhancedDescriptionResponse](#enhanceddescriptionresponse)
+12. [Contributing](#contributing)
+13. [License](#license)
+
 ---
-
-
-
-
-
-
-
-
+
+# LLM Car Description Enhancer (Polish)
+
+This repository contains a FastAPI application that uses a Hugging Face Transformers Large Language Model (specifically `speakleash/Bielik-1.5B-v3.0-Instruct`, or a similar model from the Bielik series) to generate enhanced marketing descriptions for cars, primarily in Polish.
+
+The application can be run locally for development or containerized with Docker for deployment. The LLM is baked into the Docker image for self-contained and efficient execution, which may require Hugging Face Hub authentication during the build if the model is gated.
+
+## Features
+
+- Generates enhanced marketing descriptions for cars in Polish.
+- Uses the `speakleash/Bielik-1.5B-v3.0-Instruct` model via the Hugging Face `transformers` library.
+- Health check endpoint.
+- Docker support for easy deployment, with the model included in the image.
+- Includes a `start_container.sh` script for convenient container startup.
+
+## Prerequisites
+
+- Python 3.9 or higher
+- `pip` (Python package installer)
+- Docker (for containerized deployment; Docker BuildKit is recommended for secrets support)
+- Git (for cloning the repository)
+- A Hugging Face Hub account and an access token (with `read` permissions) if the chosen model is gated (see the Docker Usage section).
+- For `start_container.sh`: a bash-compatible shell (Linux, macOS, or Git Bash on Windows).
+
+## Project Structure
+
+A typical layout for this project would be:
+
+```text
+.
+├── app/
+│   ├── __init__.py
+│   ├── main.py                     # FastAPI application, endpoints
+│   ├── models/
+│   │   ├── __init__.py
+│   │   └── huggingface_service.py  # Service for interacting with the LLM
+│   └── schemas/
+│       ├── __init__.py
+│       └── schemas.py              # Pydantic schemas for request/response
+├── .gitignore
+├── Dockerfile
+├── download_model.py               # Script to download the model during the Docker build
+├── my_hf_token.txt                 # (Created locally) Stores the HF token
+├── requirements.txt
+├── start_container.sh              # Helper script to run the Docker container
+└── README.md
+```
+
+## Installation (Local Development)
+
+1. **Clone the repository:**
+   ```bash
+   git clone https://github.com/studzin-sky/llm-description-enhancer.git
+   cd llm-description-enhancer
+   ```
+
+2. **Create and activate a virtual environment** (recommended, to keep dependencies isolated):
+   ```bash
+   python -m venv venv
+   ```
+   * On macOS/Linux:
+     ```bash
+     source venv/bin/activate
+     ```
+   * On Windows (PowerShell):
+     ```powershell
+     .\venv\Scripts\Activate.ps1
+     ```
+   * On Windows (Command Prompt):
+     ```bat
+     venv\Scripts\activate.bat
+     ```
+
+3. **Install the required dependencies:**
+   Ensure your `requirements.txt` includes `fastapi`, `uvicorn[standard]`, `transformers[torch]`, `torch`, `accelerate`, and `huggingface_hub`.
+   ```bash
+   pip install -r requirements.txt
+   ```
+   *Note: The first time you run the application locally (or if the model cache is empty), the Hugging Face model (~3.2 GB) will be downloaded, which can take some time. **If the configured model (`speakleash/Bielik-1.5B-v3.0-Instruct` by default) is gated or requires authentication, log in with `huggingface-cli login` before running the application locally.** After logging in, your token is cached by the `huggingface_hub` library.*
+
+## Usage (Local Development)
+
+1. **Start the FastAPI server:**
+   From the project root directory:
+   ```bash
+   uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
+   ```
+   * `--reload` enables auto-reloading for development.
+   * `--host 0.0.0.0` makes the server accessible on your network.
+
+2. **Access the application:**
+   * Health Check: [http://127.0.0.1:8000/health](http://127.0.0.1:8000/health)
+   * API Documentation (Swagger UI): [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)
+   * Enhance Description: `POST` requests to [http://127.0.0.1:8000/enhance-description](http://127.0.0.1:8000/enhance-description)
+
+## Docker Usage
+
+The included `Dockerfile` builds an image with the application and the pre-downloaded Hugging Face model, making it self-contained. Downloading gated models during the build requires a Hugging Face Hub token.
+
+1. **Prepare a Hugging Face Hub token (for gated models):**
+   The `speakleash/Bielik-1.5B-v3.0-Instruct` model may require authentication to download.
+   * **Get a token:**
+     1. Go to your Hugging Face account settings: [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
+     2. Create a new token (e.g., named "docker-bielik-access") with `read` permissions.
+     3. Copy the generated token (it will start with `hf_`).
+   * **Create the token file:**
+     1. In your project's root directory (next to the `Dockerfile`), create a file named `my_hf_token.txt`.
+     2. Paste **only the token string** (e.g., `hf_YourActualTokenValueHere`) into this file. Do not add any other text or variable names.
+
+2. **Build the Docker image:**
+   From the project root directory, run:
+   ```bash
+   DOCKER_BUILDKIT=1 docker build --secret id=huggingface_token,src=my_hf_token.txt -t llm-description-enhancer .
+   ```
+   * `DOCKER_BUILDKIT=1`: Enables BuildKit, which is required for `--secret`.
+   * `--secret id=huggingface_token,src=my_hf_token.txt`: Securely provides the content of `my_hf_token.txt` to the build. The `id=huggingface_token` must match the ID used in the `RUN --mount` directive in the `Dockerfile`.
+   * *(This step takes a while, especially the first time, as it downloads the LLM using your token.)*
+
+3. **Run the Docker container using the helper script (`start_container.sh`):**
+   The repository includes `start_container.sh` to simplify starting the container. It stops and removes any pre-existing container with the configured name, then starts a new one.
+
+   * **Ensure the script is executable** (on Linux, macOS, or Git Bash on Windows):
+     ```bash
+     chmod +x start_container.sh
+     ```
+
+   * **Run the script** from the project root directory:
+     ```bash
+     ./start_container.sh
+     ```
+
+   * **Expected outcome:**
+     The script will:
+     * Output messages indicating it is managing the container.
+     * Start the container in detached mode.
+     * Report that the service is available at `http://127.0.0.1:8000`.
+     * Print commands to view logs or stop the container (e.g., `docker logs <container_name> -f` and `docker stop <container_name>`).
+
+   *(Alternatively, run the container manually: `docker run --rm -p 8000:8000 llm-description-enhancer`)*
+
+4. **Test the containerized application:**
+   Once the container is running (via the script or manually), send requests to `http://127.0.0.1:8000` as described in the API Endpoints section.
+
+## Quick Start with PowerShell (`start_container.ps1`)
+
+Windows users can automate the Docker build and run process with the provided PowerShell script. It will:
+- Build the Docker image using your Hugging Face token (from `my_hf_token.txt`)
+- Stop and remove any existing container named `bielik_app_instance`
+- Start a new container and map port 8000
+
+**Steps:**
+
+1. Ensure your Hugging Face token is saved in `my_hf_token.txt` in the project root (see above for details).
+2. Open PowerShell in the project directory.
+3. (Optional, but recommended) Temporarily allow running unsigned scripts for this session:
+   ```powershell
+   Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process
+   ```
+4. Run the script:
+   ```powershell
+   .\start_container.ps1
+   ```
+
+The script builds the image and starts the container. The FastAPI service will be available at [http://127.0.0.1:8000](http://127.0.0.1:8000).
+
+You can view logs with:
+```powershell
+docker logs bielik_app_instance -f
+```
+To stop the container:
+```powershell
+docker stop bielik_app_instance
+```
+
+If you encounter a security error about script signing, see the [Microsoft documentation on execution policies](https://go.microsoft.com/fwlink/?LinkID=135170).
+
 ---
 
-
+## API Endpoints
+
+### Health Check
+
+- **Endpoint:** `/health`
+- **Method:** `GET`
+- **Description:** Returns the status of the application and model initialization.
+- **Example Response:**
+  ```json
+  {
+    "status": "ok",
+    "model_initialized": true,
+    "model_path": "/app/pretrain_model"
+  }
+  ```
+
+### Enhance Description
+
+- **Endpoint:** `/enhance-description`
+- **Method:** `POST`
+- **Description:** Generates an enhanced marketing description for a car in Polish.
+- **Request Body (`application/json`):**
+  ```json
+  {
+    "make": "Volkswagen",
+    "model": "Golf",
+    "year": 2022,
+    "mileage": 15000,
+    "features": ["Klimatyzacja automatyczna", "System nawigacji", "Czujniki parkowania"],
+    "condition": "Bardzo dobry"
+  }
+  ```
+- **Response (`application/json`):**
+  ```json
+  {
+    "description": "Wygenerowany przez AI opis samochodu..."
+  }
+  ```
+- **Example cURL request (for Git Bash / bash-like shells; a Python equivalent follows below):**
+  ```bash
+  curl -X POST "http://127.0.0.1:8000/enhance-description" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "make": "Toyota",
+    "model": "Corolla",
+    "year": 2021,
+    "mileage": 25000,
+    "features": ["Kamera cofania", "Apple CarPlay", "Android Auto", "System bezkluczykowy"],
+    "condition": "Bardzo dobry"
+  }'
+  ```
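An equivalent request from Python could look like this (a sketch assuming the `requests` library, which is not part of this project's `requirements.txt` and would need to be installed separately):

```python
import requests

payload = {
    "make": "Toyota",
    "model": "Corolla",
    "year": 2021,
    "mileage": 25000,
    "features": ["Kamera cofania", "Apple CarPlay", "Android Auto", "System bezkluczykowy"],
    "condition": "Bardzo dobry",
}

# A generous timeout: CPU inference with a 1.5B-parameter model can take a while
resp = requests.post("http://127.0.0.1:8000/enhance-description", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["description"])
```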
+
+## Core Service (`app/models/huggingface_service.py`)
+
+The `HuggingFaceTextGenerationService` class handles the interaction with the Large Language Model.
+
+- **Key Methods:**
+  - `async initialize()`: Loads the pre-trained model and tokenizer from the path specified during service instantiation (e.g., `/app/pretrain_model` in Docker, or from the Hugging Face cache locally).
+  - `async generate_text(chat_template_messages: list, max_new_tokens: int, ...)`: Generates text from a structured chat prompt, applying the appropriate chat template and parsing the model's output to return only the assistant's response.
+
+## Configuration
+
+- **Model Used:** `speakleash/Bielik-1.5B-v3.0-Instruct`. This is baked into `/app/pretrain_model` in the Docker image. For local development, it is downloaded to the Hugging Face cache.
+- **Language:** The primary focus is on generating descriptions in **Polish**.
+- **Prompt Engineering:** The system and user prompts in `app/main.py` are crafted to guide the model towards concise, relevant marketing descriptions.
+
+## Schemas (`app/schemas/schemas.py`)
+
+Pydantic models are used for request and response validation.
+
+### `CarData`
+
+- **Fields:**
+  - `make`: `str`
+  - `model`: `str`
+  - `year`: `int`
+  - `mileage`: `int`
+  - `features`: `list[str]`
+  - `condition`: `str`
+
+### `EnhancedDescriptionResponse`
+
+- **Fields:**
+  - `description`: `str`
+
+## Contributing
+
+Contributions are welcome! Please open an issue or submit a pull request for any changes.
+
+## License
+
+This project is licensed under the MIT License.
answer.md
ADDED
@@ -0,0 +1,213 @@
+Great choice! Creating a separate service for GPT4All is a clean and scalable approach. Let's break this down step by step, from setting up the project structure to deploying the service.
+
+---
+
+### **1. Project Structure**
+Here's a suggested folder structure for your GPT4All service:
+
+```
+gpt4all-service/
+├── app/
+│   ├── __init__.py
+│   ├── main.py          # FastAPI/Flask app entry point
+│   ├── models/          # GPT4All model loading and inference logic
+│   │   └── gpt4all.py
+│   └── schemas/         # Pydantic models for request/response validation
+│       └── schemas.py
+├── tests/               # Unit and integration tests
+│   └── test_api.py
+├── requirements.txt     # Python dependencies
+├── Dockerfile           # For containerization
+├── README.md            # Project documentation
+└── .env                 # Environment variables (optional)
+```
+
+---
+
+### **2. Setting Up the Project**
+1. **Create the Project Folder**:
+   ```bash
+   mkdir gpt4all-service
+   cd gpt4all-service
+   ```
+
+2. **Initialize a Virtual Environment**:
+   ```bash
+   python -m venv venv
+   source venv/bin/activate  # On Windows: venv\Scripts\activate
+   ```
+
+3. **Install Dependencies**:
+   Create a `requirements.txt` file:
+   ```plaintext
+   fastapi
+   uvicorn
+   gpt4all
+   pydantic
+   python-dotenv
+   ```
+
+   Install the dependencies:
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+---
+
+### **3. Build the GPT4All Service**
+#### **Step 1: Create the Model Loading Logic**
+- Create `app/models/gpt4all.py`:
+  ```python
+  from gpt4all import GPT4All
+
+  class GPT4AllService:
+      def __init__(self, model_path: str):
+          self.model = GPT4All(model_path)
+
+      def generate_description(self, prompt: str) -> str:
+          response = self.model.generate(prompt, max_tokens=300)
+          return response
+  ```
+
+#### **Step 2: Define Request/Response Schemas**
+- Create `app/schemas/schemas.py`:
+  ```python
+  from pydantic import BaseModel
+
+  class CarData(BaseModel):
+      make: str
+      model: str
+      year: int
+      mileage: int
+      features: list[str]
+      condition: str
+
+  class EnhancedDescriptionResponse(BaseModel):
+      description: str
+  ```
+
+#### **Step 3: Create the FastAPI App**
+- Create `app/main.py`:
+  ```python
+  from fastapi import FastAPI, HTTPException
+  from app.models.gpt4all import GPT4AllService
+  from app.schemas.schemas import CarData, EnhancedDescriptionResponse
+
+  app = FastAPI()
+
+  # Initialize GPT4All service
+  gpt4all_service = GPT4AllService("ggml-model-gpt4all-falcon-q4_0.bin")
+
+  @app.post("/enhance-description", response_model=EnhancedDescriptionResponse)
+  async def enhance_description(car_data: CarData):
+      try:
+          # Create a prompt from car data
+          prompt = f"""
+          Enhance this car description for an auction portal:
+          - Make: {car_data.make}
+          - Model: {car_data.model}
+          - Year: {car_data.year}
+          - Mileage: {car_data.mileage}
+          - Features: {', '.join(car_data.features)}
+          - Condition: {car_data.condition}
+          """
+          # Generate description
+          description = gpt4all_service.generate_description(prompt)
+          return {"description": description}
+      except Exception as e:
+          raise HTTPException(status_code=500, detail=str(e))
+  ```
+
+---
+
+### **4. Run the Service**
+1. **Start the Service**:
+   ```bash
+   uvicorn app.main:app --reload --port 8000
+   ```
+
+2. **Test the API**:
+   Use `curl` or Postman to send a POST request:
+   ```bash
+   curl -X POST "http://localhost:8000/enhance-description" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "make": "Toyota",
+     "model": "Camry",
+     "year": 2020,
+     "mileage": 45000,
+     "features": ["sunroof", "leather seats", "lane assist"],
+     "condition": "excellent"
+   }'
+   ```
+
+   **Expected Response**:
+   ```json
+   {
+     "description": "This 2020 Toyota Camry is a well-maintained vehicle with only 45,000 miles on the odometer. It comes equipped with a sunroof, luxurious leather seats, and advanced lane assist technology. Perfect for families or commuters, this car is in excellent condition and ready to hit the road. Don't miss out on this fantastic deal!"
+   }
+   ```
+
+---
+
+### **5. Add Logging and Error Handling**
+- Add logging to track requests and errors:
+  ```python
+  import logging
+
+  logging.basicConfig(level=logging.INFO)
+  logger = logging.getLogger(__name__)
+
+  @app.post("/enhance-description", response_model=EnhancedDescriptionResponse)
+  async def enhance_description(car_data: CarData):
+      try:
+          logger.info(f"Received request for car: {car_data.make} {car_data.model}")
+          prompt = f"Enhance this car description: {car_data}"
+          description = gpt4all_service.generate_description(prompt)
+          return {"description": description}
+      except Exception as e:
+          logger.error(f"Error generating description: {e}")
+          raise HTTPException(status_code=500, detail=str(e))
+  ```
+
+---
+
+### **6. Containerize the Service (Optional)**
+Create a `Dockerfile` to containerize the service:
+```dockerfile
+# Use an official Python runtime as a parent image
+FROM python:3.9-slim
+
+# Set the working directory
+WORKDIR /app
+
+# Copy the requirements file
+COPY requirements.txt .
+
+# Install dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy the application code
+COPY . .
+
+# Expose the port the app runs on
+EXPOSE 8000
+
+# Run the application
+CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
+```
+
+Build and run the Docker container:
+```bash
+docker build -t gpt4all-service .
+docker run -p 8000:8000 gpt4all-service
+```
+
+---
+
+### **7. Next Steps**
+- Add unit tests in the `tests/` folder (a minimal test sketch follows after this answer).
+- Add environment variables for configuration (e.g., model path, port).
+- Integrate with your Flask backend by calling this service via HTTP.
+
+Let me know if you need help with any specific part (e.g., testing, deployment, or advanced features)!
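Following up on the unit-test suggestion in the next steps above, a minimal sketch for `tests/test_api.py` (an illustration, not part of this commit; it assumes `pytest` and FastAPI's `TestClient`, and stubs the model call so no weights are needed; note that `app.main` constructs `GPT4AllService` at import time, so the model path must resolve or that construction must be patched as well):

```python
# tests/test_api.py - hypothetical test sketch
from fastapi.testclient import TestClient

import app.main as main

client = TestClient(main.app)

def test_enhance_description(monkeypatch):
    # Stub the LLM call so the test runs quickly and deterministically
    monkeypatch.setattr(
        main.gpt4all_service, "generate_description", lambda prompt: "A great car."
    )
    payload = {
        "make": "Toyota", "model": "Camry", "year": 2020,
        "mileage": 45000, "features": ["sunroof"], "condition": "excellent",
    }
    response = client.post("/enhance-description", json=payload)
    assert response.status_code == 200
    assert response.json() == {"description": "A great car."}
```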
app/main.py
ADDED
@@ -0,0 +1,81 @@
+from fastapi import FastAPI, HTTPException
+from app.models.huggingface_service import HuggingFaceTextGenerationService
+from fastapi.middleware.cors import CORSMiddleware
+from app.schemas.schemas import CarData, EnhancedDescriptionResponse
+
+app = FastAPI()
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["http://localhost:5173"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+MODEL_PATH_IN_CONTAINER = "/app/pretrain_model"
+hf_service = HuggingFaceTextGenerationService(
+    model_name_or_path=MODEL_PATH_IN_CONTAINER,
+    device="cpu"
+)
+
+
+@app.on_event("startup")
+async def startup_event():
+    print("Starting up and initializing HuggingFace service...")
+    try:
+        await hf_service.initialize()
+        print(f"HuggingFace service initialized successfully from {MODEL_PATH_IN_CONTAINER}.")
+    except HTTPException as e:
+        print(f"Failed to initialize HuggingFace service: {e.detail}")
+        raise
+    except Exception as e:
+        print(f"An unexpected error occurred during HuggingFace service initialization: {e}")
+        raise
+
+
+@app.get("/health")
+async def health_check():
+    return {"status": "ok", "model_initialized": hf_service.pipeline is not None}
+
+@app.post("/enhance-description", response_model=EnhancedDescriptionResponse)
+async def enhance_description(car_data: CarData):
+    # Polish system prompt: answer only with a concise (max 500 chars) marketing description
+    chat_messages = [
+        {
+            "role": "system",
+            "content": (
+                "Jesteś pomocnym ulepszaczem opisów. "  # trailing ". " added so the concatenated strings do not run together
+                "Opisy trzeba tworzyć w języku polskim i muszą być atrakcyjne marketingowo. "
+                "Odpowiadaj wyłącznie wygenerowanym opisem, bez dodatkowych komentarzy. "
+                "Staraj się, aby opis był zwięzły i kompletny, maksymalnie 500 znaków. "
+                "Jeżeli część prompta będzie nie na temat, ignoruj tę część."
+            )
+        },
+        {
+            "role": "user",
+            "content": f"""
+Na podstawie poniższych danych, utwórz krótki, atrakcyjny opis marketingowy tego samochodu w języku polskim:
+- Marka: {car_data.make}
+- Model: {car_data.model}
+- Rok produkcji: {car_data.year}
+- Przebieg: {car_data.mileage} km
+- Wyposażenie: {', '.join(car_data.features)}
+- Stan: {car_data.condition}
+"""
+        }
+    ]
+
+    try:
+        description = await hf_service.generate_text(
+            prompt_text=None,
+            chat_template_messages=chat_messages,
+            max_new_tokens=150,
+            temperature=0.75,
+            top_p=0.9,
+        )
+        return {"description": description.strip()}
+    except HTTPException:
+        raise
+    except Exception as e:
+        print(f"Unexpected error in /enhance-description: {e}")
+        raise HTTPException(status_code=500, detail=f"An unexpected error occurred: {str(e)}")
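Since `device="cpu"` is hard-coded here while the service also supports CUDA, one possible tweak (an editorial sketch, not in this commit; `APP_DEVICE` is a hypothetical variable name) is to read the device from the environment:

```python
import os

# Falls back to CPU unless APP_DEVICE=cuda is set and a GPU is available
hf_service = HuggingFaceTextGenerationService(
    model_name_or_path=MODEL_PATH_IN_CONTAINER,
    device=os.environ.get("APP_DEVICE", "cpu"),
)
```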
app/models/huggingface_service.py
ADDED
@@ -0,0 +1,111 @@
+from transformers import pipeline, AutoTokenizer
+import torch
+from fastapi import HTTPException
+import asyncio
+
+class HuggingFaceTextGenerationService:
+    def __init__(self, model_name_or_path: str, device: str = None, task: str = "text-generation"):
+        self.model_name_or_path = model_name_or_path
+        self.task = task
+        self.pipeline = None
+        self.tokenizer = None
+
+        # Resolve the pipeline device index: 0 = first GPU, -1 = CPU
+        if device is None:
+            self.device_index = 0 if torch.cuda.is_available() else -1
+        elif device == "cuda" and torch.cuda.is_available():
+            self.device_index = 0
+        elif device == "cpu":
+            self.device_index = -1
+        else:
+            self.device_index = -1
+
+        if self.device_index == 0:
+            print("CUDA (GPU) is available. Using GPU.")
+        else:
+            print(f"Device set to use {'cpu' if self.device_index == -1 else f'cuda:{self.device_index}'}")
+
+
+    async def initialize(self):
+        try:
+            print(f"Initializing Hugging Face pipeline for model: {self.model_name_or_path} on device index: {self.device_index}")
+            self.tokenizer = await asyncio.to_thread(
+                AutoTokenizer.from_pretrained, self.model_name_or_path, trust_remote_code=True
+            )
+            self.pipeline = await asyncio.to_thread(
+                pipeline,
+                self.task,
+                model=self.model_name_or_path,
+                tokenizer=self.tokenizer,
+                device=self.device_index,
+                torch_dtype=torch.bfloat16 if self.device_index != -1 and torch.cuda.is_available() and torch.cuda.is_bf16_supported() else torch.float32,
+                trust_remote_code=True,
+            )
+            print(f"Pipeline for model {self.model_name_or_path} initialized successfully.")
+        except Exception as e:
+            print(f"Error initializing HuggingFace pipeline: {e}")
+            raise HTTPException(status_code=503, detail=f"LLM (HuggingFace) model could not be loaded: {str(e)}")
+
+    async def generate_text(self, prompt_text: str = None, chat_template_messages: list = None, max_new_tokens: int = 250, temperature: float = 0.7, top_p: float = 0.9, do_sample: bool = True, **kwargs) -> str:
+        if not self.pipeline or not self.tokenizer:
+            raise Exception("Pipeline is not initialized. Call initialize() first.")
+
+        formatted_prompt_input = ""
+        if chat_template_messages:
+            try:
+                formatted_prompt_input = self.tokenizer.apply_chat_template(
+                    chat_template_messages,
+                    tokenize=False,
+                    add_generation_prompt=True
+                )
+            except Exception as e:
+                print(f"Could not apply chat template, falling back to raw prompt if available. Error: {e}")
+                if prompt_text:
+                    formatted_prompt_input = prompt_text
+                else:
+                    raise ValueError("Cannot generate text without a valid prompt or chat_template_messages.")
+        elif prompt_text:
+            formatted_prompt_input = prompt_text
+        else:
+            raise ValueError("Either prompt_text or chat_template_messages must be provided.")
+
+        try:
+            generated_outputs = await asyncio.to_thread(
+                self.pipeline,
+                formatted_prompt_input,
+                max_new_tokens=max_new_tokens,
+                do_sample=do_sample,
+                temperature=temperature,
+                top_p=top_p,
+                eos_token_id=self.tokenizer.eos_token_id,
+                pad_token_id=self.tokenizer.eos_token_id if self.tokenizer.pad_token_id is None else self.tokenizer.pad_token_id,  # common setting
+                **kwargs
+            )
+
+            if generated_outputs and isinstance(generated_outputs, list) and "generated_text" in generated_outputs[0]:
+                full_generated_sequence = generated_outputs[0]["generated_text"]
+
+                # Strip the prompt: the pipeline returns prompt + completion as one string
+                assistant_response = ""
+                if full_generated_sequence.startswith(formatted_prompt_input):
+                    assistant_response = full_generated_sequence[len(formatted_prompt_input):]
+                else:
+                    assistant_marker = "<|im_start|>assistant\n"
+                    last_marker_pos = full_generated_sequence.rfind(assistant_marker)
+                    if last_marker_pos != -1:
+                        assistant_response = full_generated_sequence[last_marker_pos + len(assistant_marker):]
+                        print("Warning: Used fallback parsing for assistant response.")
+                    else:
+                        print("Error: Could not isolate assistant response from the full generated sequence.")
+                        assistant_response = full_generated_sequence
+
+                if assistant_response.endswith("<|im_end|>"):
+                    assistant_response = assistant_response[:-len("<|im_end|>")]
+
+                return assistant_response.strip()
+            else:
+                print(f"Unexpected output format from pipeline: {generated_outputs}")
+                return "Error: Could not parse generated text from pipeline output."
+
+        except Exception as e:
+            print(f"Error during text generation with {self.model_name_or_path}: {e}")
+            raise HTTPException(status_code=500, detail=f"Error generating text: {str(e)}")
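A quick standalone smoke test of this service outside FastAPI could look like the following (a sketch, not in this commit; it assumes the model is reachable at the given Hugging Face repo id or a local path, and will download ~3.2 GB on first run):

```python
import asyncio

from app.models.huggingface_service import HuggingFaceTextGenerationService

async def main():
    # CPU inference keeps the sketch runnable without CUDA
    service = HuggingFaceTextGenerationService(
        "speakleash/Bielik-1.5B-v3.0-Instruct", device="cpu"
    )
    await service.initialize()
    text = await service.generate_text(
        chat_template_messages=[
            {"role": "user", "content": "Napisz jedno zdanie o Toyocie Corolli."}
        ],
        max_new_tokens=50,
    )
    print(text)

asyncio.run(main())
```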
app/schemas/schemas.py
ADDED
@@ -0,0 +1,12 @@
+from pydantic import BaseModel
+
+class CarData(BaseModel):
+    make: str
+    model: str
+    year: int
+    mileage: int
+    features: list[str]
+    condition: str
+
+class EnhancedDescriptionResponse(BaseModel):
+    description: str
download_model.py
ADDED
@@ -0,0 +1,69 @@
+# This script is intended to be run in a Docker container with the Hugging Face token mounted as a secret.
+from huggingface_hub import snapshot_download
+from huggingface_hub.errors import HfHubHTTPError
+import os
+import sys
+import traceback
+
+def main():
+    token_path = '/run/secrets/huggingface_token'
+    model_dir_path = os.environ.get('MODEL_DIR')
+    repo_id_to_download = 'speakleash/Bielik-1.5B-v3.0-Instruct'
+
+    print(f'--- Python SCRIPT DEBUG: Target model directory: {model_dir_path}')
+    if not model_dir_path:
+        print('--- Python SCRIPT CRITICAL ERROR: MODEL_DIR environment variable not set!')
+        sys.exit(1)
+
+    token_value = None
+    try:
+        with open(token_path, 'r') as f:
+            token_value = f.read().strip()
+        print(f'--- Python SCRIPT DEBUG: Token file {token_path} read successfully.')
+        if token_value:
+            masked_token = f"{token_value[:4]}****{token_value[-4:] if len(token_value) > 4 else '(token too short)'}"
+            print(f'--- Python SCRIPT DEBUG: Token content (masked): {masked_token}')
+            if not token_value.startswith('hf_'):
+                print('--- Python SCRIPT WARNING: Token does not appear to start with hf_! Check token file content.')
+        else:
+            print('--- Python SCRIPT CRITICAL ERROR: Token file was empty or only whitespace!')
+            sys.exit(1)
+    except FileNotFoundError:
+        print(f'--- Python SCRIPT CRITICAL ERROR: Token secret file {token_path} not found! Ensure --mount is correct.')
+        sys.exit(1)
+    except Exception as e:
+        print(f'--- Python SCRIPT CRITICAL ERROR: Could not read token from {token_path}: {e}')
+        traceback.print_exc()
+        sys.exit(1)
+
+    try:
+        print(f'--- Python SCRIPT INFO: Calling snapshot_download for {repo_id_to_download}...')
+        snapshot_download(
+            repo_id=repo_id_to_download,
+            local_dir=model_dir_path,
+            token=token_value,
+            local_dir_use_symlinks=False,
+            resume_download=True
+            # Removed ignore_patterns for now to ensure no interference
+        )
+        print(f'--- Python SCRIPT INFO: snapshot_download completed for {repo_id_to_download}.')
+    except HfHubHTTPError as http_e:
+        print(f'--- Python SCRIPT ERROR: HfHubHTTPError during snapshot_download: {http_e}')
+        if http_e.response is not None:
+            print(f'--- Python SCRIPT ERROR: Response status: {http_e.response.status_code}')
+            print(f'--- Python SCRIPT ERROR: Response headers: {http_e.response.headers}')
+            try:
+                response_content = http_e.response.content.decode()
+            except UnicodeDecodeError:
+                response_content = str(http_e.response.content)
+            print(f'--- Python SCRIPT ERROR: Response content: {response_content}')
+        if http_e.request_id:
+            print(f'--- Python SCRIPT ERROR: Request ID: {http_e.request_id}')
+        sys.exit(1)
+    except Exception as e:
+        print(f'--- Python SCRIPT ERROR: Other Exception during snapshot_download: {e}')
+        traceback.print_exc()
+        sys.exit(1)
+
+if __name__ == "__main__":
+    main()
requirements.txt
ADDED
@@ -0,0 +1,4 @@
+fastapi
+uvicorn[standard]
+transformers[torch]
+accelerate
start_container.ps1
ADDED
@@ -0,0 +1,23 @@
+# PowerShell script to build and run the Docker container for your FastAPI service
+
+# Set variables
+$imageName = "bielik-fastapi-service"
+$containerName = "bielik_app_instance"
+$tokenFile = "my_hf_token.txt"
+
+Write-Host "Building Docker image..."
+docker build --secret id=huggingface_token,src=$tokenFile -t $imageName .
+
+Write-Host "Stopping and removing any existing container named $containerName..."
+docker stop $containerName | Out-Null 2>&1
+
+docker rm $containerName | Out-Null 2>&1
+
+Write-Host "Running new container..."
+docker run -d --name $containerName -p 8000:8000 $imageName
+
+Write-Host ""
+Write-Host "$containerName should be starting up."
+Write-Host "You can view logs with: docker logs $containerName -f"
+Write-Host "To stop the container, run: docker stop $containerName"
+Write-Host "The service will be available at http://127.0.0.1:8000"
start_container.sh
ADDED
@@ -0,0 +1,25 @@
+#!/bin/bash
+
+IMAGE_NAME="bielik-fastapi-service"
+CONTAINER_NAME="bielik_app_instance"
+TOKEN_FILE="my_hf_token.txt"
+
+# Build the Docker image with Hugging Face token as a secret
+echo "Building Docker image..."
+DOCKER_BUILDKIT=1 docker build --secret id=huggingface_token,src=$TOKEN_FILE -t $IMAGE_NAME .
+
+echo "Attempting to stop and remove existing container named $CONTAINER_NAME (if any)..."
+docker stop $CONTAINER_NAME > /dev/null 2>&1 || true  # Stop if running, ignore error if not
+docker rm $CONTAINER_NAME > /dev/null 2>&1 || true    # Remove if exists, ignore error if not
+
+echo "Starting new $IMAGE_NAME container as $CONTAINER_NAME..."
+docker run -d --name $CONTAINER_NAME -p 8000:8000 $IMAGE_NAME
+# -d            : Runs the container in detached mode (in the background)
+# --name        : Assigns a specific name to your running container instance
+# -p 8000:8000  : Maps port 8000 on your host to port 8000 in the container
+
+echo ""
+echo "$CONTAINER_NAME should be starting up."
+echo "You can view logs with: docker logs $CONTAINER_NAME -f"
+echo "To stop the container, run: docker stop $CONTAINER_NAME"
+echo "The service will be available at http://127.0.0.1:8000"