---
license: apache-2.0
datasets:
- squirelmail/dataset-BotDetect-CAPTCHA-Generator
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: keras
tags:
- ocr
- captcha
- crnn
- ctc
- tensorflow
- keras
- 50x250
- uppercase
- digits
---
# AI Model for Solving BotDetect-CAPTCHA-Generator Gov ID CAPTCHAs
CRNN+CTC Checkpoints
=======================
This directory contains **Keras 3** `save_weights`-style checkpoints produced while training a CRNN + CTC model for 5-character uppercase/digit CAPTCHAs (image size `H=50`, `W=250`, grayscale).
* * *
Contents
-----------
* `captcha_best.weights.h5`: checkpoint with the best validation loss (auto-updated during training).
* `captcha_epNNN.weights.h5`: per-epoch snapshots (e.g., `captcha_ep001.weights.h5` through `captcha_ep022.weights.h5`).
All files are _weights only_; they must be loaded into the same model architecture used in training (the tester builds that architecture for you).
* * *
Model Result: `captcha_ep022.weights.h5` => 90.91% Accuracy
-----------
```
(venv) root@prod-exploit-sa-all-01:/home/infra# date && python3 cek_model_v6.py --weights captcha_ep022.weights.h5 --data-root ./dataset_1000_rand --sample 24000 && date
Thu Oct 30 01:12:49 WITA 2025
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761571.108235 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
I0000 00:00:1761761571.304280 2264160 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761575.452128 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
Found weights: captcha_ep022.weights.h5 | size: 27757.0 KB | mtime: Thu Oct 30 01:02:51 2025
E0000 00:00:1761761576.513960 2264160 cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
TF GPUs: []
OK: weights loaded.
Base output shape: (None, 31, 37)
Testing on 24000 samples from ./dataset_1000_rand ...
W0000 00:00:1761761611.159498 2264160 cpu_allocator_impl.cc:84] Allocation of 1200000000 exceeds 10% of free system memory.
00 GT: 976VF | Pred: 976VF
01 GT: 7W20H | Pred: 7W20H
02 GT: UUU24 | Pred: UUU24
03 GT: 1EMVZ | Pred: 1EMVZ
04 GT: WY4RD | Pred: WY4RD
05 GT: 0GNKE | Pred: 0GNKE
06 GT: 7Y5TY | Pred: 7Y5TY
07 GT: OC8C1 | Pred: OC8C1
08 GT: 5ZIDQ | Pred: 5ZIDQ
09 GT: LP8IP | Pred: LP8IP
10 GT: AKQ7G | Pred: AKQ7G
11 GT: X23QD | Pred: X23QD
Exact match: 90.91% | Mean CER: 0.0194
Total images tested: 24000
Thu Oct 30 01:18:07 WITA 2025
```
* * *
Requirements
---------------
Install from the pinned list in the repo root:

    # (recommended) fresh virtualenv
    python3 -m venv venv
    source venv/bin/activate
    # install exact deps
    pip install -r captcha_requirements.txt

**Important:** Keras/TensorFlow versions should match what was used during training. If you trained with TF/Keras nightly or dev builds, test in the same environment to avoid weight-loading shape/key mismatches.
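
To compare an environment against the one used for training, it can help to print the installed versions first. The following is a minimal sketch using only the standard library; the package names in `PACKAGES` are assumptions and may not match what `captcha_requirements.txt` actually pins.

```python
from importlib import metadata

# Assumed package names; adjust to whatever captcha_requirements.txt pins.
PACKAGES = ("tensorflow", "keras", "numpy")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both the training and the testing environment and diffing the output is a quick way to spot a version drift before a weight-loading error appears.
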
* * *
How to Test
--------------
The tester script re-creates the training graph (CRNN+CTC), loads the selected checkpoint, and runs inference with the _base_ (CTC-free) submodel.
### 1) Single image

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png

Optional ground truth override:

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png \
      --gt K9NO2

### 2) Batch from a dataset

    python3 check_model.py \
      --weights /home/infra/models/captcha_ep002.weights.h5 \
      --data-root /datasets/dataset_500 \
      --samples 64

Expected directory layout for `--data-root`:

    /datasets/dataset_500/
    ├── style0/
    │   ├── A1B2C.png
    │   └── ...
    ├── style1/
    │   └── ...
    ├── ...
    └── style59/

**Image format:** grayscale PNG, resized to `50x250` in the script.
**Labels:** derived from filename (regex `^[A-Z0-9]{5}$`).
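
The filename-to-label rule above can be sketched as follows. This is an illustration, not the tester's actual code: the helper names `label_from_path` and `collect_samples` are hypothetical, and the `style*/` glob is an assumption based on the directory layout shown earlier.

```python
import re
from pathlib import Path

# Label pattern quoted from this README: exactly five uppercase letters/digits.
LABEL_RE = re.compile(r"^[A-Z0-9]{5}$")


def label_from_path(path):
    """Return the 5-char label encoded in the file name, or None if invalid."""
    stem = Path(path).stem  # "K9NO2.png" -> "K9NO2"
    return stem if LABEL_RE.fullmatch(stem) else None


def collect_samples(data_root, ext="png"):
    """Collect (path, label) pairs from style*/ subfolders, skipping bad names."""
    samples = []
    for path in sorted(Path(data_root).glob(f"style*/*.{ext}")):
        label = label_from_path(path)
        if label is not None:
            samples.append((path, label))
    return samples
```

Files whose names do not match the pattern are skipped rather than raising, so stray files in a dataset folder do not end up as training or test labels.
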
* * *
Model Details (for reference)
--------------------------------
* Backbone: 3× (Conv2D + BN + MaxPool), then reshape to time steps.
* RNN head: 2× BiLSTM(128), `return_sequences=True`.
* Classifier: Dense(`num_classes = 36 + 1`) with softmax; `+1` is the CTC blank.
* Time steps: width is downsampled by 8, so `floor(250/8) = 31` time steps.
The tester script internally builds both: `model_with_ctc` (training graph) and `base_model` (inference). It loads weights into the training graph and then uses `base_model` for predictions.
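
The shape arithmetic behind these numbers can be checked with a few lines of plain Python. The sketch assumes each of the three pooling stages is a 2×2 max-pool without padding (the README only states an overall downsampling factor of 8), so each stage is a floor division by 2.

```python
def pooled_size(size, num_pools=3, pool=2):
    """Apply `num_pools` non-overlapping max-pools (floor division each time)."""
    for _ in range(num_pools):
        size //= pool
    return size


HEIGHT, WIDTH = 50, 250
NUM_CLASSES = 36 + 1  # 0-9 + A-Z, plus the CTC blank

time_steps = pooled_size(WIDTH)     # 250 -> 125 -> 62 -> 31
feature_rows = pooled_size(HEIGHT)  # 50 -> 25 -> 12 -> 6 (assumed pooling on height too)

# Per-image output of the base model: one class distribution per time step.
output_shape = (time_steps, NUM_CLASSES)
print(output_shape)  # (31, 37)
```

The result agrees with the `Base output shape: (None, 31, 37)` line in the test log above, where `None` is the batch dimension.
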
* * *
CLI Options
---------------

    --weights <path>    : required, *.weights.h5 (same architecture)
    --image <path>      : test a single image
    --gt <text>         : ground truth for --image (default: file name)
    --data-root <dir>   : style0..style59 folders for batch testing
    --samples N         : max number of images for batch test (default 64)
    --height H          : input height (default 50)
    --width W           : input width (default 250)
    --ext png|jpg       : image extension for batch (default png)
    --show K            : print K sample predictions (default 12)

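
For readers who want to script around the tester, the options listed above could be declared with `argparse` roughly as below. This is a hypothetical reconstruction from the option list only; the real `check_model.py` may define its parser differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options and defaults."""
    p = argparse.ArgumentParser(description="Test a CRNN+CTC CAPTCHA checkpoint")
    p.add_argument("--weights", required=True, help="*.weights.h5 checkpoint")
    p.add_argument("--image", help="test a single image")
    p.add_argument("--gt", help="ground truth for --image (default: file name)")
    p.add_argument("--data-root", help="folder containing style0..style59")
    p.add_argument("--samples", type=int, default=64, help="max images in batch test")
    p.add_argument("--height", type=int, default=50, help="input height")
    p.add_argument("--width", type=int, default=250, help="input width")
    p.add_argument("--ext", choices=["png", "jpg"], default="png")
    p.add_argument("--show", type=int, default=12, help="sample predictions to print")
    return p


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--weights", "captcha_best.weights.h5", "--data-root", "/datasets/dataset_500"]
)
print(args.samples, args.height, args.width, args.ext, args.show)  # 64 50 250 png 12
```
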
* * *
Output
---------
* Per-sample preview lines: `GT: ABC12 | Pred: ABC12`
* Aggregate metrics:
  * **Exact match** (% of predictions exactly equal to GT)
  * **Mean CER** (character error rate)
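
Both metrics can be reproduced from a list of (ground truth, prediction) pairs. The sketch below assumes CER is the Levenshtein edit distance divided by the ground-truth length, averaged over samples; the README does not spell out the exact definition used by the tester, so treat this as one common convention rather than the script's implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def evaluate(pairs):
    """Return (exact-match %, mean CER) for a list of (gt, pred) pairs."""
    exact = sum(gt == pred for gt, pred in pairs)
    cer = sum(edit_distance(gt, pred) / max(len(gt), 1) for gt, pred in pairs)
    return 100.0 * exact / len(pairs), cer / len(pairs)


# Illustrative pairs only (the second prediction is made up, not model output).
exact_pct, mean_cer = evaluate([("976VF", "976VF"), ("7W20H", "7W2OH")])
print(f"Exact match: {exact_pct:.2f}% | Mean CER: {mean_cer:.4f}")
```

Because a single wrong character makes the whole prediction a miss, exact match is the stricter of the two numbers; mean CER shows how close the misses are.
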
* * *
Troubleshooting
------------------
* **"A total of 1 objects could not be loaded... `<Dense name=predictions>`"**
  Mismatch between Keras/TF versions or model definition. Use the same environment and architecture as training.
* **GPU not used**
  Ensure a CUDA-enabled TF build and matching drivers. For server-side issues, test with:

      import tensorflow as tf
      print(tf.config.list_physical_devices('GPU'))

* **NaN loss during training**
  Check: label regex filtering, correct `input_length=31`, use `int32` for CTC inputs, disable LSTM dropouts when using cuDNN (set to `0.0`).
* * *
Notes
--------
* CTC blank ID = `36` (since charset is 36 chars: 0-9 + A-Z).
* All checkpoints here are _weights only_; to export a full model, save the base model as `.keras` after loading weights in the same environment:

      model_with_ctc, base_model = build_models(...)
      model_with_ctc.load_weights("captcha_epXXX.weights.h5")
      base_model.save("captcha_epXXX_base.keras")
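
To illustrate the blank ID convention noted above, here is a minimal greedy (best-path) CTC decoder in plain Python. It is a generic sketch rather than the tester's decoding code, and it assumes class indices follow the order digits `0-9` then letters `A-Z`; the actual index-to-character mapping used in training is not documented here.

```python
import string

# Assumed index order: "0".."9" are 0-9, "A".."Z" are 10-35, blank is 36.
CHARSET = string.digits + string.ascii_uppercase
BLANK_ID = len(CHARSET)  # 36


def greedy_ctc_decode(probs):
    """Best-path decode: argmax per time step, collapse repeats, drop blanks.

    `probs` is a sequence of per-time-step class scores, e.g. 31 rows of 37.
    """
    best = [max(range(len(step)), key=step.__getitem__) for step in probs]
    chars = []
    prev = BLANK_ID
    for idx in best:
        # Emit a character only when it differs from the previous step
        # and is not the blank; a blank between repeats keeps both copies.
        if idx != prev and idx != BLANK_ID:
            chars.append(CHARSET[idx])
        prev = idx
    return "".join(chars)
```

Given the `(31, 37)` output of the base model for one image, passing its rows to `greedy_ctc_decode` would yield the predicted string under these assumptions.
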