---
license: apache-2.0
pipeline_tag: image-classification
library_name: transformers
datasets:
- dchen0/font_crops_v4
---

# Font Classifier DINOv2 (Server-Side Preprocessing)

A fine-tuned DINOv2 model for font classification with **built-in preprocessing**.

🎯 **Key Feature: No client-side preprocessing required!**

## Performance

- **Accuracy**: ~86% on test set
- **Preprocessing**: Automatic server-side pad-to-square + normalization

## Usage

### Simple API Usage (Recommended)

Clients can send **raw images directly** to inference endpoints:

```python
import requests
import base64

# Load your image
with open("test_image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Send to inference endpoint
response = requests.post(
    "https://your-endpoint.com",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"inputs": image_data}
)

results = response.json()
print(f"Predicted font: {results[0]['label']} ({results[0]['score']:.2%})")
```

### Standard HuggingFace Usage

```python
from transformers import pipeline

# The model automatically handles preprocessing
classifier = pipeline("image-classification", model="dchen0/font-classifier-v4")
results = classifier("your_image.png")
print(f"Predicted font: {results[0]['label']}")
```

### Direct Model Usage

```python
from PIL import Image
import torch
from transformers import AutoImageProcessor
from font_classifier_with_preprocessing import FontClassifierWithPreprocessing

# Load model and processor
model = FontClassifierWithPreprocessing.from_pretrained("dchen0/font-classifier-v4")
processor = AutoImageProcessor.from_pretrained("dchen0/font-classifier-v4")

# Process image (model handles pad_to_square automatically)
image = Image.open("test.png")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
```

## Model Architecture

- **Base Model**: facebook/dinov2-base-imagenet1k-1-layer
- **Fine-tuning**: LoRA on Google Fonts dataset
- **Labels**: 394 font families
- **Preprocessing**: Built-in pad-to-square + ImageNet normalization

## Server-Side Preprocessing

This model automatically applies the following preprocessing in its forward pass:

1. **Pad to square** preserving aspect ratio
2. **Resize** to 224×224
3. **Normalize** with ImageNet statistics

**No client-side preprocessing required** - just send raw images!

## Deployment

### HuggingFace Inference Endpoints

1. Deploy this model to an Inference Endpoint
2. Send raw images directly - preprocessing happens automatically
3. Achieve ~86% accuracy out of the box

### Custom Deployment

The model includes preprocessing in the forward pass, so any deployment (TorchServe, TensorFlow Serving, etc.) will automatically apply the correct preprocessing.

## Files

- `font_classifier_with_preprocessing.py`: Custom model class with built-in preprocessing
- Standard HuggingFace model files

## Technical Details

The model inherits from `Dinov2ForImageClassification` but overrides the forward pass to include:

```python
def forward(self, pixel_values=None, labels=None, **kwargs):
    # Automatic preprocessing happens here
    processed_pixel_values = self.preprocess_images(pixel_values)
    return super().forward(pixel_values=processed_pixel_values, labels=labels, **kwargs)
```

This ensures that whether clients send raw images or pre-processed tensors, the model receives correctly formatted input.
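The `preprocess_images` helper referenced above is not included in this card. As an illustration only (a hypothetical sketch, not the model's actual implementation), a pad-to-square + resize + ImageNet-normalize step over a `(B, 3, H, W)` float batch in `[0, 1]` could look like:

```python
import torch
import torch.nn.functional as F

# ImageNet statistics, reshaped for broadcasting over (B, 3, H, W)
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess_images(pixel_values: torch.Tensor, size: int = 224) -> torch.Tensor:
    """Pad a (B, 3, H, W) batch to square, resize to `size`, and normalize.

    Hypothetical sketch: assumes float inputs in [0, 1] and white padding.
    """
    _, _, h, w = pixel_values.shape
    side = max(h, w)
    # Pad right/bottom so the image becomes square (aspect ratio preserved)
    pad_h, pad_w = side - h, side - w
    x = F.pad(pixel_values, (0, pad_w, 0, pad_h), value=1.0)
    # Resize to the model's expected resolution
    x = F.interpolate(x, size=(size, size), mode="bilinear", align_corners=False)
    # Normalize with ImageNet statistics
    return (x - IMAGENET_MEAN) / IMAGENET_STD
```

Padding before resizing is what preserves the glyphs' aspect ratio: a plain resize of a wide text crop would stretch letterforms, which is exactly the distortion font classification is sensitive to.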