specification

by Nischay103 - opened May 17, 2024

Discussion

Nischay103

Owner May 17, 2024

The model is specifically designed to recognize and decipher captcha images similar to the provided example.

actively working to expand its capabilities to encompass a broader range of captcha styles and complexities.

Nischay103 changed discussion status to closed May 17, 2024

Nischay103

Owner May 23, 2024

def transform_image(image_path):
transform = T.Compose([T.ToTensor()]) # T --> torchvision.transforms object
device = 'cuda' if torch.cuda.is_available() else 'cpu'
image = cv2.imread(image_path,cv2.IMREAD_GRAYSCALE)
image = Image.fromarray(image)
image = transform(image)
image = Variable(image).to(device)
image = image.unsqueeze(1)
return image

def get_label(model_prediction,max_captcha_len=6): #currently trained on max_len = 6
_cls = string.ascii_lowercase + string.ascii_uppercase + '0123456789' + '$'
lab = ''
for idx in range(max_captcha_len):
get_char = _cls[np.argmax(model_prediction.squeeze().cpu().tolist()[ _cls_dim * idx : _cls_dim * (idx + 1)])] #_cls_dim = len(_cls)
lab += get_char
return lab

Nischay103

Owner May 23, 2024

•

edited May 23, 2024

model architecture : ResNet
Accuracy : 99.67

Nischay103

Owner May 23, 2024

predictor = torch.jit.load('captcha_model_v1.0_traced.pt')
input_path = 'captcha-cnn/test_captcha/00oLkY.jpg'
input = transform_image(input_path)
with torch.no_grad():
output = predictor(input)
o_label = get_label(output)
print(o_label)

Nischay103

Owner May 30, 2024

added more models as well as huggingface space to try out.
one can try to finetune parseq model to create generalized version of captcha prediction.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment