Softy-lines / Pixel-Digit-Classifier

	---
	license: apache-2.0
	language:
	- en
	metrics:
	- accuracy
	library_name: adapter-transformers
	pipeline_tag: image-to-text
	---
	# Model Card for Pixelated Captcha Digit Detection

	## Model Details

	- License: Apache-2.0
	- Developed by: Saidi Souhaieb
	- Finetuned from model: YOLOv8

	## Uses

	This model detects digits in pixelated captcha images. It draws a bounding box around each digit and returns the coordinates of every detection.
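
One way the extracted coordinates might be used downstream is to order the detections from left to right before reading the digits. The sketch below assumes each detection is a hypothetical `(x1, y1, x2, y2, digit)` tuple in pixel coordinates. That format and the function name `read_digits` are illustrative assumptions, not the model's documented output.

```python
from typing import List, Tuple

# Hypothetical detection format (an assumption, not the model's real output):
# (x1, y1, x2, y2, digit), with the box corners given in pixel coordinates.
Detection = Tuple[float, float, float, float, int]


def read_digits(detections: List[Detection]) -> str:
    """Return the digits ordered by the horizontal center of each box."""
    ordered = sorted(detections, key=lambda d: (d[0] + d[2]) / 2.0)
    return "".join(str(d[4]) for d in ordered)


example = [
    (40.0, 5.0, 60.0, 30.0, 7),
    (5.0, 4.0, 25.0, 31.0, 3),
    (70.0, 6.0, 90.0, 29.0, 1),
]
print(read_digits(example))  # prints "371"
```

Sorting by box center rather than by the left edge is a small design choice that tolerates boxes of uneven width.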

	## How to Get Started with the Model

	```python
	import os

	import torch
	import torch.nn as nn
	import torch.nn.functional as F
	import torchvision.transforms as transforms
	from PIL import Image


	class CNN(nn.Module):
	    def __init__(self):
	        super().__init__()
	        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
	        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
	        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
	        self.pool = nn.MaxPool2d(2, 2)
	        self.fc1 = nn.Linear(64 * 4 * 4, 500)
	        self.fc2 = nn.Linear(500, 10)  # 10 classes, one per digit

	    def forward(self, x):
	        # Each conv + pool stage halves the spatial size: 32 -> 16 -> 8 -> 4
	        x = self.pool(F.relu(self.conv1(x)))
	        x = self.pool(F.relu(self.conv2(x)))
	        x = self.pool(F.relu(self.conv3(x)))
	        x = x.view(-1, 64 * 4 * 4)  # Flatten for the fully connected layers
	        x = F.relu(self.fc1(x))
	        x = self.fc2(x)
	        return x


	# Preprocessing: fc1 expects 32x32 inputs, so change both together
	transform = transforms.Compose([
	    transforms.Resize((32, 32)),
	    transforms.ToTensor(),
	    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
	])

	model = CNN()
	# map_location="cpu" lets the checkpoint load on machines without a GPU
	model.load_state_dict(torch.load('models/99acc_model.pth', map_location="cpu"))
	model.eval()  # Switch to inference mode


	def predict_number(folder_path):
	    """
	    Predict the digit shown in each image in the folder
	    """
	    predict_numbers = []
	    # sorted() gives a deterministic (lexicographic) file order
	    for file in sorted(os.listdir(folder_path)):
	        # Load and preprocess the input image
	        input_image = Image.open(os.path.join(folder_path, file)).convert('RGB')
	        input_tensor = transform(input_image)
	        input_batch = input_tensor.unsqueeze(0)  # Add a batch dimension

	        # Perform inference
	        with torch.no_grad():
	            output = model(input_batch)

	        # Get the predicted class label
	        _, predicted = torch.max(output, 1)

	        # Print the predicted class label
	        print("Predicted class label:", predicted.item(), "file", file)
	        predict_numbers.append(predicted.item())

	    return predict_numbers
	```
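
The `64 * 4 * 4` input size of `fc1` follows from the 32x32 resize and the three convolution and pooling stages. As a minimal sketch, the arithmetic can be checked with the standard output-size formula for convolution and pooling layers. The helper names `conv_out` and `flattened_features` are illustrative and not part of the model code.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size: int = 32, channels=(16, 32, 64)) -> int:
    """Number of features entering the first fully connected layer."""
    size = input_size
    for _ in channels:
        # 3x3 convolution with stride 1 and padding 1 keeps the spatial size
        size = conv_out(size, kernel=3, stride=1, padding=1)
        # 2x2 max pooling with stride 2 halves it
        size = conv_out(size, kernel=2, stride=2)
    return channels[-1] * size * size


print(flattened_features())  # prints 1024, i.e. 64 * 4 * 4
```

If the `Resize` target changes, `fc1` has to change with it. For example, `flattened_features(64)` gives 4096, which corresponds to `64 * 8 * 8`.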

	## Training Details

	### Training Data

	Pixel Digit Captcha Data []

	## Model Card Authors

	Saidi Souhaieb