---
license: apache-2.0
tags:
- image-classification
- computer-vision
- checkbox-detection
- efficientnet
datasets:
- wendys-llc/chkbx
metrics:
- accuracy
- f1
- precision
- recall
base_model: google/efficientnet-b0
model-index:
- name: checkbox-classifier-efficientnet
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      type: wendys-llc/chkbx
      name: Checkbox Detection Dataset
      split: validation
    metrics:
    - type: accuracy
      value: 0.97
      name: Validation Accuracy
library_name: transformers
pipeline_tag: image-classification
---
# Checkbox State Classifier - EfficientNet-B0
A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). It reaches ~95% accuracy on the validation set.
## Model Description
This model is fine-tuned from [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0) on the [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx) dataset. It's designed to classify UI checkboxes in screenshots and interface images.
### Key Features
- **No `trust_remote_code` required** - Uses native transformers support
- **Fast inference** - EfficientNet-B0 is optimized for speed
- **High accuracy** - ~95% on validation set
- **Simple API** - Works with transformers pipeline out of the box
## Usage
### Quick Start with Pipeline (Recommended)
```python
from transformers import pipeline
from PIL import Image
# Load the model
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
# Classify an image
image = Image.open("checkbox.jpg")
results = classifier(image)
# Print results
for result in results:
    print(f"{result['label']}: {result['score']:.2%}")
# Get just the top prediction
top_result = classifier(image, top_k=1)[0]
print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
```
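When the prediction feeds an automated workflow, it can help to handle low-confidence results separately. The following is a minimal sketch, not part of the model's API: `decide` and its `threshold` default of 0.8 are hypothetical, chosen only for illustration, and the `sample` list is hand-written to mimic the list-of-dicts shape that the pipeline returns for one image.

```python
def decide(results, threshold=0.8):
    """Return the top label, or "uncertain" if its score is below threshold.

    `results` is a list of {"label": str, "score": float} dicts, the shape
    returned by the image-classification pipeline for a single image.
    The 0.8 default is an arbitrary illustrative value, not a tuned one.
    """
    if not results:
        raise ValueError("results is empty")
    top = max(results, key=lambda r: r["score"])
    return top["label"] if top["score"] >= threshold else "uncertain"


# Hand-written sample output (not real model predictions)
sample = [
    {"label": "checked", "score": 0.93},
    {"label": "unchecked", "score": 0.07},
]
print(decide(sample))                  # checked
print(decide(sample, threshold=0.95))  # uncertain
```

A suitable threshold depends on the cost of a wrong answer in the target application and would need to be chosen on held-out data.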
### Using AutoModel and AutoImageProcessor
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image
# Load model and processor
processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
# Prepare image (convert to RGB so RGBA or grayscale screenshots also work)
image = Image.open("checkbox.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
# Get predicted class
predicted_class_idx = logits.argmax(-1).item()
predicted_label = model.config.id2label[predicted_class_idx]
# Get confidence scores
probabilities = torch.nn.functional.softmax(logits, dim=-1)
confidence = probabilities.max().item()
print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
```
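To make the confidence value above less opaque, here is a small dependency-free sketch of what the softmax step computes. The logits are made up for illustration and do not come from the model; `softmax` is a plain-Python stand-in for `torch.nn.functional.softmax`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for the two classes, for illustration only
example_logits = [2.0, -1.0]
probs = softmax(example_logits)
confidence = max(probs)

# With two classes, the top probability equals sigmoid(gap between logits)
gap = abs(example_logits[0] - example_logits[1])
assert math.isclose(confidence, 1 / (1 + math.exp(-gap)))
print(f"confidence: {confidence:.2%}")
```

For a two-class model the confidence therefore depends only on the difference between the two logits, so it can never fall below 50%.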
### Batch Processing
```python
from transformers import pipeline
from PIL import Image
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
# Process multiple images
images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
results = classifier(images)
for i, result in enumerate(results):
    top_pred = result[0]  # Get top prediction
    print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
```
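For larger batches, a per-label tally is often more useful than one printed line per image. The sketch below assumes the nested list shape the pipeline returns for a list of images (one list of label/score dicts per image); `tally_top_labels` is a hypothetical helper and `sample_batch` is hand-written, not real model output.

```python
from collections import Counter


def tally_top_labels(batch_results):
    """Count the top-scoring label of each image in a batch."""
    return Counter(
        max(per_image, key=lambda r: r["score"])["label"]
        for per_image in batch_results
    )


# Hand-written stand-in for classifier(images); not real predictions
sample_batch = [
    [{"label": "checked", "score": 0.98}, {"label": "unchecked", "score": 0.02}],
    [{"label": "unchecked", "score": 0.91}, {"label": "checked", "score": 0.09}],
    [{"label": "checked", "score": 0.77}, {"label": "unchecked", "score": 0.23}],
]
counts = tally_top_labels(sample_batch)
print(dict(counts))  # {'checked': 2, 'unchecked': 1}
```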
## Model Details
### Architecture
- **Base Model**: google/efficientnet-b0
- **Model Type**: EfficientNet for Image Classification
- **Number of Labels**: 2 (checked, unchecked)
- **Input Size**: 224x224 RGB images
- **Framework**: PyTorch via Transformers
### Training Details
- **Dataset**: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
  - ~4,800 training samples
  - ~1,200 validation samples
- **Training Configuration**:
  - Epochs: 15 (with early stopping)
  - Batch Size: 64 (on A100)
  - Learning Rate: Default AdamW
  - Mixed Precision: FP16
  - Hardware: NVIDIA A100 GPU
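
As a rough sense of scale, the figures above imply the following step counts. This is back-of-the-envelope arithmetic only: it assumes a single GPU, no gradient accumulation, and that the approximate sample counts are exact. Early stopping means the actual run may have used fewer steps.

```python
import math

train_samples = 4800  # approximate figure from the training details
val_samples = 1200    # approximate figure from the training details
batch_size = 64
max_epochs = 15

# Optimizer steps per epoch, counting a final partial batch if there is one
steps_per_epoch = math.ceil(train_samples / batch_size)

# Upper bound on total steps; early stopping can end training sooner
max_steps = steps_per_epoch * max_epochs

# Share of all samples held out for validation
val_fraction = val_samples / (train_samples + val_samples)

print(steps_per_epoch, max_steps, f"{val_fraction:.0%}")  # 75 1125 20%
```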
## Acknowledgments
- Base model: [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0)
- Dataset: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
- Framework: [HuggingFace Transformers](https://github.com/huggingface/transformers)
## License
This model is licensed under the Apache 2.0 License. See the [full license text](https://www.apache.org/licenses/LICENSE-2.0) for details.