Softy-lines
/

Pixel-Digit-Classifier


Softy-lines committed on May 8, 2024

Commit

dccace1


1 Parent(s): a5ae4fb

Update README.md

Files changed (1)

README.md +104 -3


@@ -1,3 +1,104 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+metrics:
+- accuracy
+library_name: adapter-transformers
+pipeline_tag: image-to-text
+---
+# Model Card for Pixelated Captcha Digit Detection
+## Model Details
+- **License:** Apache-2.0
+- **Developed by:** Saidi Souhaieb
+- **Finetuned from model:** YOLOv8
+## Uses
+This model detects digits in pixelated captcha images, drawing bounding boxes around them and returning the coordinates of each detection.
+## How to Get Started with the Model
+```python
+import os
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+import torchvision.transforms as transforms
+from PIL import Image
+class CNN(nn.Module):
+    def __init__(self):
+        super(CNN, self).__init__()
+        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
+        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
+        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
+        self.pool = nn.MaxPool2d(2, 2)
+        self.fc1 = nn.Linear(64 * 4 * 4, 500)
+        self.fc2 = nn.Linear(500, 10)  # 10 classes for example
+    def forward(self, x):
+        x = self.pool(F.relu(self.conv1(x)))
+        x = self.pool(F.relu(self.conv2(x)))
+        x = self.pool(F.relu(self.conv3(x)))
+        x = x.view(-1, 64 * 4 * 4)
+        x = F.relu(self.fc1(x))
+        x = self.fc2(x)
+        return x
+# Preprocessing: resize to the 32x32 input the CNN expects, then scale pixels to [-1, 1]
+transform = transforms.Compose([
+    transforms.Resize((32, 32)),  # Adjust the size accordingly
+    transforms.ToTensor(),
+    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
+])
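+model = CNN()
+# map_location='cpu' lets the checkpoint load on machines without a GPU
+model.load_state_dict(torch.load('models/99acc_model.pth', map_location='cpu'))
+model.eval()  # Switch to inference mode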
+def predict_number(folder_path):
+    """
+    Predict the digit shown in each image in folder_path
+    """
+    predict_numbers = []
+    for file in os.listdir(folder_path):
+        # Load the image from the given folder and preprocess it
+        input_image = Image.open(os.path.join(folder_path, file)).convert('RGB')
+        input_tensor = transform(input_image)
+        input_batch = input_tensor.unsqueeze(0)  # Add a batch dimension
+        # Perform inference without tracking gradients
+        with torch.no_grad():
+            output = model(input_batch)
+        # The predicted class is the index of the largest output score
+        _, predicted = torch.max(output, 1)
+        print("Predicted class label:", predicted.item(), "file", file)
+        predict_numbers.append(predicted.item())
+    return predict_numbers
+```
+## Training Details
+### Training Data
+Pixel Digit Captcha Data []
+## Model Card Authors
+[Saidi Souhaieb]
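
A note on the `CNN` definition in this README: the `fc1` input size of `64 * 4 * 4` appears to follow from the 32x32 resize in `transform`. Each of the three conv layers (3x3 kernel, stride 1, padding 1) keeps the spatial size unchanged, and each 2x2 max-pool halves it. Below is a minimal sketch of that arithmetic in plain Python. The helper names `conv_out`, `pool_out`, and `flattened_features` are hypothetical, introduced only for illustration, and are not part of the repository.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Side length after a square Conv2d layer (defaults match the README's CNN)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Side length after a square MaxPool2d layer with no padding."""
    return (size - kernel) // stride + 1


def flattened_features(input_size=32, channels=(16, 32, 64)):
    """Number of features entering fc1 after each conv + pool stage."""
    size = input_size
    for _ in channels:
        # Every stage is conv (size preserved) followed by pool (size halved)
        size = pool_out(conv_out(size))
    return channels[-1] * size * size


print(flattened_features(32))  # 1024, i.e. 64 * 4 * 4
```

If the resize were changed to 64x64, for example, the same arithmetic gives 64 * 8 * 8 = 4096, so `fc1` and the `view` call would need to change to match, and a checkpoint trained with the 32x32 layout would presumably no longer load.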