useful_charts_table_text_images_vs_useless_images_classifier
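This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the codewithaman/useful_charts_table_text_images_vs_useless_images dataset. It achieves the following results on the evaluation set (these match the step-12900 row of the training results table, which has the lowest validation loss logged):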
- Loss: 0.0851
- Accuracy: 0.9853
To use the model
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Define the device to run the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Inference on device: {device}")

# Load the image processor and model
# (ViTImageProcessor replaces the deprecated ViTFeatureExtractor)
model_name_or_path = 'codewithaman/useful_charts_table_text_images_vs_useless_images_classifier'
image_processor = ViTImageProcessor.from_pretrained(model_name_or_path)
model = ViTForImageClassification.from_pretrained(model_name_or_path).to(device)

# Load a local image, convert it to RGB, and resize it to 224x224
def load_image_from_path(image_path):
    image = Image.open(image_path)
    return image.convert("RGB").resize((224, 224))

# Define the inference function
def classify_image(image):
    # Prepare the image for the model (normalization, tensor conversion)
    inputs = image_processor(images=image, return_tensors="pt").to(device)

    # Make the prediction without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)

    # Map the highest-scoring logit to its label name
    predicted_class = outputs.logits.argmax(-1).item()
    label = model.config.id2label[predicted_class]
    return label

# Example usage
image_path = "path/to/your/image.jpg"  # Replace with your local image path
image = load_image_from_path(image_path)
predicted_label = classify_image(image)
print(f"Predicted label: {predicted_label}")
Model description
This model is a Vision Transformer (ViT) image classifier fine-tuned to separate two classes: useful images (charts, tables, and text) and useless images. It uses google/vit-base-patch16-224-in21k as its backbone, so it starts from weights pre-trained on ImageNet-21k. Those pre-trained features help it distinguish structured, informative visuals from images with no informational content.
The model takes an image as input and returns one of the two class labels. Its image preprocessor resizes and normalizes each image into the tensor format ViT expects. The model then splits the image into 16x16-pixel patches and applies self-attention across them to relate different regions of the image. Fine-tuning optimized classification accuracy on this two-class task.
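As a rough illustration of what the encoder attends over, the sketch below counts the tokens produced for one image. It assumes the standard ViT layout implied by the backbone name (16x16 patches, 224x224 input, plus one classification token); the function name is hypothetical.

```python
def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of tokens the ViT encoder sees for one square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 224 // 16 = 14
    num_patches = patches_per_side ** 2          # 14 * 14 = 196
    return num_patches + 1                       # +1 for the [CLS] token


print(vit_sequence_length())  # 197
```

Self-attention is computed between every pair of these tokens, which is how the model relates distant regions of a chart or table.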
Intended uses & limitations
Intended Uses
- Image Classification for Educational Content: Useful for identifying visually rich, informative charts and tables, which can assist in content moderation or educational material curation.
- Content Filtering: Can be used to filter out irrelevant or "useless" images in large datasets where only informational images are desired.
- Dataset Cleaning: Helpful in creating cleaner datasets by selecting images with specific content types, particularly in educational or training datasets.
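For content filtering, one possible approach is to keep an image only when the model predicts the useful class with high confidence. The sketch below works on raw logits so it runs without downloading the model. The label names, the 0.9 threshold, and the helper names are assumptions for illustration; the real label names come from `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def keep_image(logits, id2label, useful_label, threshold=0.9):
    """Keep an image only if the top class is the useful one and confident enough."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best] == useful_label and probs[best] >= threshold


# Hypothetical label mapping; check model.config.id2label for the real one.
id2label = {0: "useful", 1: "useless"}

print(keep_image([4.0, -4.0], id2label, "useful"))  # confident useful prediction
print(keep_image([0.5, 0.0], id2label, "useful"))   # useful, but below the threshold
```

In practice the logits would come from `outputs.logits[0].tolist()` in the usage code, and the threshold would be tuned on a held-out sample.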
Limitations
- Generalizability: This model is specifically fine-tuned on images labeled as either useful charts with text or useless images. It may not generalize well to other types of image classification tasks.
- Resolution and Size Constraints: The model expects 224x224-pixel inputs, so every image is resized to that size. Very large images or images with extreme aspect ratios may lose fine detail (such as small text) or become distorted, which can reduce accuracy.
- Content-specific Accuracy: Since this model is trained on a specific dataset, it may misclassify images that do not closely resemble the training data (e.g., abstract or artistic images).
- Sensitive Information: This model does not have filters for detecting sensitive or inappropriate content; manual filtering may be required if sensitive content is expected.
Training and evaluation data
The model was trained on the codewithaman/useful_charts_table_text_images_vs_useless_images dataset from the Hugging Face Hub. The dataset contains two main classes:
- Useful Charts and Tables with Text: Images that contain structured, informative visuals like charts, graphs, and tables, often with textual information relevant for educational or informative purposes.
- Useless Images: Images that lack informative content or visual structure useful for educational or analytical purposes.
The training pipeline resizes and normalizes images so that they match the ViT model’s input requirements. Evaluation was carried out on a validation subset, with validation loss and accuracy recorded every 100 training steps.
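The normalization step can be sketched for a single channel value as follows. This assumes the preprocessing configuration commonly shipped with the base checkpoint (rescale by 1/255, then mean 0.5 and standard deviation 0.5 per channel); the authoritative values are in the model's preprocessor configuration, and the function name is hypothetical.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Map an 8-bit channel value (0-255) to the normalized range used as model input."""
    scaled = value / 255.0        # rescale to [0, 1]
    return (scaled - mean) / std  # assumed mean/std give a range of [-1, 1]


print(normalize_pixel(0))    # -1.0
print(normalize_pixel(255))  # 1.0
```

The image processor applies the same arithmetic to every channel of every pixel after resizing.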
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
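To make the `linear` scheduler concrete, here is a minimal sketch of how the learning rate decays from 0.0002 to zero over training. It assumes zero warmup steps, since no warmup is listed above. The function name and the total step count in the example are illustrative, not taken from the training run.

```python
def linear_lr(step, total_steps, base_lr=2e-4, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative total; the real run logged evaluations up to step 19700.
total = 20_000
for step in (0, 5_000, 10_000, 20_000):
    print(step, linear_lr(step, total))
```

Under this schedule the rate is halved by the midpoint of training and reaches zero at the final step.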
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.8814 | 0.0203 | 100 | 0.9432 | 0.7601 |
| 0.4374 | 0.0405 | 200 | 0.4927 | 0.8864 |
| 0.0042 | 0.0608 | 300 | 0.3534 | 0.9267 |
| 0.0093 | 0.0811 | 400 | 0.2335 | 0.9414 |
| 0.125 | 0.1013 | 500 | 0.3630 | 0.9286 |
| 0.4924 | 0.1216 | 600 | 0.2374 | 0.9469 |
| 0.0052 | 0.1419 | 700 | 0.2015 | 0.9487 |
| 0.3738 | 0.1621 | 800 | 0.4200 | 0.8864 |
| 0.4533 | 0.1824 | 900 | 0.2573 | 0.9286 |
| 0.027 | 0.2027 | 1000 | 0.3408 | 0.9121 |
| 0.6685 | 0.2229 | 1100 | 0.3140 | 0.8260 |
| 0.0703 | 0.2432 | 1200 | 0.2425 | 0.9322 |
| 0.9411 | 0.2635 | 1300 | 0.7809 | 0.8223 |
| 0.4378 | 0.2837 | 1400 | 0.6968 | 0.8223 |
| 0.7127 | 0.3040 | 1500 | 0.3294 | 0.8242 |
| 0.9465 | 0.3243 | 1600 | 0.4913 | 0.8223 |
| 0.3834 | 0.3445 | 1700 | 0.2594 | 0.9048 |
| 0.6691 | 0.3648 | 1800 | 0.3537 | 0.8993 |
| 0.3002 | 0.3851 | 1900 | 0.2502 | 0.9286 |
| 0.0473 | 0.4054 | 2000 | 0.2312 | 0.9322 |
| 0.634 | 0.4256 | 2100 | 0.2406 | 0.9359 |
| 0.4471 | 0.4459 | 2200 | 0.2983 | 0.9377 |
| 0.3229 | 0.4662 | 2300 | 0.3601 | 0.9212 |
| 0.4769 | 0.4864 | 2400 | 0.2990 | 0.9011 |
| 0.0135 | 0.5067 | 2500 | 0.3134 | 0.9029 |
| 0.3025 | 0.5270 | 2600 | 0.1748 | 0.9505 |
| 0.0114 | 0.5472 | 2700 | 0.2898 | 0.9212 |
| 0.1636 | 0.5675 | 2800 | 0.2281 | 0.9396 |
| 0.7427 | 0.5878 | 2900 | 0.2334 | 0.9341 |
| 0.0083 | 0.6080 | 3000 | 0.2466 | 0.9359 |
| 0.0041 | 0.6283 | 3100 | 0.2737 | 0.9432 |
| 1.7268 | 0.6486 | 3200 | 0.2626 | 0.9396 |
| 0.0115 | 0.6688 | 3300 | 0.2621 | 0.9304 |
| 0.6196 | 0.6891 | 3400 | 0.3546 | 0.9267 |
| 0.0141 | 0.7094 | 3500 | 0.2064 | 0.9505 |
| 0.006 | 0.7296 | 3600 | 0.2204 | 0.9487 |
| 0.0226 | 0.7499 | 3700 | 0.2544 | 0.9451 |
| 0.0084 | 0.7702 | 3800 | 0.1698 | 0.9542 |
| 0.0035 | 0.7904 | 3900 | 0.2541 | 0.9304 |
| 0.0137 | 0.8107 | 4000 | 0.1235 | 0.9670 |
| 0.9026 | 0.8310 | 4100 | 0.3319 | 0.9249 |
| 0.4531 | 0.8512 | 4200 | 0.2221 | 0.9414 |
| 0.0039 | 0.8715 | 4300 | 0.1823 | 0.9560 |
| 1.3298 | 0.8918 | 4400 | 0.2125 | 0.9542 |
| 0.4403 | 0.9120 | 4500 | 0.4900 | 0.8938 |
| 0.0025 | 0.9323 | 4600 | 0.3010 | 0.9249 |
| 0.0056 | 0.9526 | 4700 | 0.2978 | 0.9267 |
| 0.3642 | 0.9728 | 4800 | 0.2162 | 0.9451 |
| 0.5704 | 0.9931 | 4900 | 0.2459 | 0.9414 |
| 0.1761 | 1.0134 | 5000 | 0.1674 | 0.9652 |
| 0.0023 | 1.0336 | 5100 | 0.1855 | 0.9542 |
| 0.1477 | 1.0539 | 5200 | 0.1516 | 0.9652 |
| 0.0034 | 1.0742 | 5300 | 0.8117 | 0.7326 |
| 0.4936 | 1.0944 | 5400 | 0.2102 | 0.9377 |
| 0.0158 | 1.1147 | 5500 | 0.1886 | 0.9524 |
| 0.0041 | 1.1350 | 5600 | 0.2544 | 0.9286 |
| 0.7993 | 1.1552 | 5700 | 0.2523 | 0.9304 |
| 0.6292 | 1.1755 | 5800 | 0.1681 | 0.9451 |
| 0.0048 | 1.1958 | 5900 | 0.2746 | 0.9377 |
| 0.4908 | 1.2161 | 6000 | 0.3194 | 0.9359 |
| 0.4156 | 1.2363 | 6100 | 0.1320 | 0.9744 |
| 0.0056 | 1.2566 | 6200 | 0.3195 | 0.8993 |
| 0.0013 | 1.2769 | 6300 | 0.1581 | 0.9615 |
| 0.0027 | 1.2971 | 6400 | 0.2660 | 0.9414 |
| 0.1753 | 1.3174 | 6500 | 0.1858 | 0.9560 |
| 0.0013 | 1.3377 | 6600 | 0.2018 | 0.9615 |
| 0.0033 | 1.3579 | 6700 | 0.1475 | 0.9707 |
| 0.0037 | 1.3782 | 6800 | 0.1417 | 0.9689 |
| 1.2775 | 1.3985 | 6900 | 0.1101 | 0.9670 |
| 0.0051 | 1.4187 | 7000 | 0.1292 | 0.9707 |
| 0.4954 | 1.4390 | 7100 | 0.2473 | 0.9469 |
| 0.1533 | 1.4593 | 7200 | 0.1181 | 0.9707 |
| 0.0022 | 1.4795 | 7300 | 0.1512 | 0.9707 |
| 0.005 | 1.4998 | 7400 | 0.1329 | 0.9670 |
| 0.4396 | 1.5201 | 7500 | 0.1219 | 0.9725 |
| 0.0044 | 1.5403 | 7600 | 0.1665 | 0.9670 |
| 0.7054 | 1.5606 | 7700 | 0.1652 | 0.9670 |
| 0.4057 | 1.5809 | 7800 | 0.1683 | 0.9542 |
| 0.011 | 1.6011 | 7900 | 0.3927 | 0.9286 |
| 0.7 | 1.6214 | 8000 | 0.0999 | 0.9762 |
| 0.0026 | 1.6417 | 8100 | 0.1249 | 0.9744 |
| 0.002 | 1.6619 | 8200 | 0.1386 | 0.9615 |
| 0.0041 | 1.6822 | 8300 | 0.1175 | 0.9670 |
| 0.0034 | 1.7025 | 8400 | 0.1160 | 0.9725 |
| 0.0041 | 1.7227 | 8500 | 0.2097 | 0.9542 |
| 0.3303 | 1.7430 | 8600 | 0.1527 | 0.9597 |
| 0.006 | 1.7633 | 8700 | 0.1389 | 0.9670 |
| 0.0012 | 1.7835 | 8800 | 0.1799 | 0.9597 |
| 0.0027 | 1.8038 | 8900 | 0.1717 | 0.9615 |
| 0.4926 | 1.8241 | 9000 | 0.1517 | 0.9670 |
| 0.0023 | 1.8443 | 9100 | 0.1272 | 0.9744 |
| 0.5028 | 1.8646 | 9200 | 0.1444 | 0.9725 |
| 0.0051 | 1.8849 | 9300 | 0.1276 | 0.9744 |
| 0.0019 | 1.9051 | 9400 | 0.1550 | 0.9689 |
| 0.0052 | 1.9254 | 9500 | 0.1958 | 0.9634 |
| 0.0099 | 1.9457 | 9600 | 0.1359 | 0.9689 |
| 0.3494 | 1.9660 | 9700 | 0.1969 | 0.9542 |
| 0.0035 | 1.9862 | 9800 | 0.1671 | 0.9579 |
| 0.0025 | 2.0065 | 9900 | 0.1435 | 0.9707 |
| 0.0006 | 2.0268 | 10000 | 0.1187 | 0.9799 |
| 0.0035 | 2.0470 | 10100 | 0.1303 | 0.9780 |
| 0.7492 | 2.0673 | 10200 | 0.1294 | 0.9762 |
| 0.0154 | 2.0876 | 10300 | 0.1108 | 0.9762 |
| 0.0007 | 2.1078 | 10400 | 0.2675 | 0.9487 |
| 0.0008 | 2.1281 | 10500 | 0.1334 | 0.9689 |
| 0.003 | 2.1484 | 10600 | 0.1583 | 0.9670 |
| 0.4043 | 2.1686 | 10700 | 0.1198 | 0.9780 |
| 0.0016 | 2.1889 | 10800 | 0.1130 | 0.9799 |
| 0.0033 | 2.2092 | 10900 | 0.1102 | 0.9762 |
| 1.0287 | 2.2294 | 11000 | 0.1053 | 0.9762 |
| 0.3159 | 2.2497 | 11100 | 0.1004 | 0.9780 |
| 0.0464 | 2.2700 | 11200 | 0.1181 | 0.9762 |
| 0.002 | 2.2902 | 11300 | 0.2652 | 0.9560 |
| 0.0758 | 2.3105 | 11400 | 0.1413 | 0.9725 |
| 0.0027 | 2.3308 | 11500 | 0.2025 | 0.9451 |
| 0.0011 | 2.3510 | 11600 | 0.1372 | 0.9725 |
| 0.0009 | 2.3713 | 11700 | 0.1458 | 0.9725 |
| 0.4178 | 2.3916 | 11800 | 0.1403 | 0.9725 |
| 0.0028 | 2.4118 | 11900 | 0.1406 | 0.9725 |
| 0.0009 | 2.4321 | 12000 | 0.1295 | 0.9725 |
| 0.002 | 2.4524 | 12100 | 0.1685 | 0.9670 |
| 0.0022 | 2.4726 | 12200 | 0.1151 | 0.9744 |
| 0.0008 | 2.4929 | 12300 | 0.1635 | 0.9689 |
| 0.0035 | 2.5132 | 12400 | 0.1283 | 0.9744 |
| 0.7689 | 2.5334 | 12500 | 0.1551 | 0.9689 |
| 0.0126 | 2.5537 | 12600 | 0.1144 | 0.9762 |
| 0.0028 | 2.5740 | 12700 | 0.0919 | 0.9835 |
| 0.0053 | 2.5942 | 12800 | 0.1132 | 0.9762 |
| 0.0018 | 2.6145 | 12900 | 0.0851 | 0.9853 |
| 0.0014 | 2.6348 | 13000 | 0.1095 | 0.9780 |
| 0.0017 | 2.6550 | 13100 | 0.0878 | 0.9817 |
| 0.0014 | 2.6753 | 13200 | 0.1322 | 0.9762 |
| 0.0015 | 2.6956 | 13300 | 0.1059 | 0.9799 |
| 0.0036 | 2.7158 | 13400 | 0.0927 | 0.9817 |
| 0.0051 | 2.7361 | 13500 | 0.1009 | 0.9799 |
| 0.0028 | 2.7564 | 13600 | 0.1680 | 0.9670 |
| 0.6951 | 2.7767 | 13700 | 0.2497 | 0.9487 |
| 0.0096 | 2.7969 | 13800 | 0.1138 | 0.9780 |
| 0.5063 | 2.8172 | 13900 | 0.1151 | 0.9744 |
| 0.0026 | 2.8375 | 14000 | 0.1179 | 0.9762 |
| 0.0041 | 2.8577 | 14100 | 0.1266 | 0.9744 |
| 0.0019 | 2.8780 | 14200 | 0.0998 | 0.9780 |
| 0.0038 | 2.8983 | 14300 | 0.1290 | 0.9652 |
| 0.0131 | 2.9185 | 14400 | 0.1998 | 0.9414 |
| 0.0037 | 2.9388 | 14500 | 0.1214 | 0.9634 |
| 0.2382 | 2.9591 | 14600 | 0.1097 | 0.9780 |
| 0.0021 | 2.9793 | 14700 | 0.1152 | 0.9780 |
| 0.002 | 2.9996 | 14800 | 0.1001 | 0.9799 |
| 0.0027 | 3.0199 | 14900 | 0.1291 | 0.9780 |
| 0.971 | 3.0401 | 15000 | 0.1617 | 0.9689 |
| 0.0024 | 3.0604 | 15100 | 0.1245 | 0.9707 |
| 0.0172 | 3.0807 | 15200 | 0.1246 | 0.9725 |
| 0.0016 | 3.1009 | 15300 | 0.1628 | 0.9634 |
| 0.0016 | 3.1212 | 15400 | 0.1621 | 0.9634 |
| 0.0005 | 3.1415 | 15500 | 0.1104 | 0.9762 |
| 0.3195 | 3.1617 | 15600 | 0.1447 | 0.9725 |
| 2.3502 | 3.1820 | 15700 | 0.1827 | 0.9652 |
| 0.4252 | 3.2023 | 15800 | 0.1077 | 0.9762 |
| 0.0042 | 3.2225 | 15900 | 0.1431 | 0.9707 |
| 1.0207 | 3.2428 | 16000 | 0.1287 | 0.9744 |
| 0.5064 | 3.2631 | 16100 | 0.1663 | 0.9689 |
| 0.0018 | 3.2833 | 16200 | 0.1327 | 0.9725 |
| 0.0006 | 3.3036 | 16300 | 0.1163 | 0.9762 |
| 0.0039 | 3.3239 | 16400 | 0.1413 | 0.9725 |
| 0.5045 | 3.3441 | 16500 | 0.1572 | 0.9689 |
| 0.0069 | 3.3644 | 16600 | 0.1553 | 0.9670 |
| 0.0058 | 3.3847 | 16700 | 0.1022 | 0.9780 |
| 0.006 | 3.4049 | 16800 | 0.0993 | 0.9780 |
| 0.002 | 3.4252 | 16900 | 0.0954 | 0.9799 |
| 0.0082 | 3.4455 | 17000 | 0.0976 | 0.9762 |
| 0.0029 | 3.4657 | 17100 | 0.0978 | 0.9780 |
| 0.0008 | 3.4860 | 17200 | 0.0973 | 0.9799 |
| 0.0014 | 3.5063 | 17300 | 0.0979 | 0.9799 |
| 0.0008 | 3.5266 | 17400 | 0.1151 | 0.9744 |
| 0.0023 | 3.5468 | 17500 | 0.1093 | 0.9780 |
| 0.0012 | 3.5671 | 17600 | 0.0996 | 0.9799 |
| 0.0016 | 3.5874 | 17700 | 0.0980 | 0.9817 |
| 0.0015 | 3.6076 | 17800 | 0.1052 | 0.9799 |
| 0.0018 | 3.6279 | 17900 | 0.1054 | 0.9799 |
| 0.003 | 3.6482 | 18000 | 0.1052 | 0.9780 |
| 0.002 | 3.6684 | 18100 | 0.1063 | 0.9799 |
| 0.0011 | 3.6887 | 18200 | 0.1195 | 0.9762 |
| 0.4766 | 3.7090 | 18300 | 0.0873 | 0.9835 |
| 0.0026 | 3.7292 | 18400 | 0.0876 | 0.9835 |
| 0.0006 | 3.7495 | 18500 | 0.0942 | 0.9835 |
| 0.0014 | 3.7698 | 18600 | 0.0944 | 0.9835 |
| 0.0013 | 3.7900 | 18700 | 0.0972 | 0.9817 |
| 0.0016 | 3.8103 | 18800 | 0.1044 | 0.9817 |
| 0.0009 | 3.8306 | 18900 | 0.1039 | 0.9799 |
| 0.0008 | 3.8508 | 19000 | 0.0976 | 0.9817 |
| 0.0005 | 3.8711 | 19100 | 0.0969 | 0.9835 |
| 0.0009 | 3.8914 | 19200 | 0.0964 | 0.9835 |
| 0.0005 | 3.9116 | 19300 | 0.1020 | 0.9799 |
| 0.5488 | 3.9319 | 19400 | 0.0986 | 0.9817 |
| 0.0014 | 3.9522 | 19500 | 0.0963 | 0.9835 |
| 0.001 | 3.9724 | 19600 | 0.1037 | 0.9799 |
| 0.0009 | 3.9927 | 19700 | 0.1045 | 0.9799 |
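Because validation loss fluctuates between evaluations, it can help to pick the checkpoint by validation loss rather than take the last step. The sketch below does this for a handful of rows copied from the table above; the variable and function names are illustrative.

```python
# (step, validation_loss, accuracy) rows copied from the table above.
rows = [
    (4000, 0.1235, 0.9670),
    (8000, 0.0999, 0.9762),
    (12700, 0.0919, 0.9835),
    (12900, 0.0851, 0.9853),
    (13100, 0.0878, 0.9817),
    (18300, 0.0873, 0.9835),
    (19700, 0.1045, 0.9799),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best_step, best_loss, best_acc = best_checkpoint(rows)
print(best_step, best_loss, best_acc)  # 12900 0.0851 0.9853
```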
Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cpu
- Datasets 2.11.0
- Tokenizers 0.20.3