CaptchaOCR / src /config.py
mohakkapoor4
Refactor .gitignore to specify checkpoint file types and exclude all but the best model. Update inference.py to use enhanced CAPTCHA generation and adjust dimensions. Increase training epochs in train.py for better model performance. Update training metrics and data generation logic in data.py for improved dataset handling and augmentation. Update config.py for dataset path consistency.
322be7d
import os
import string
from dataclasses import dataclass


@dataclass
class Config:
    # Dataset location; override with the DATA_ROOT environment variable.
    # os.path.join keeps the default portable across operating systems.
    data_root: str = os.getenv("DATA_ROOT", os.path.join("Dataset", "captchas"))
    # Character set the CAPTCHAs are drawn from: a-z, A-Z, 0-9 (62 symbols)
    chars: str = string.ascii_letters + string.digits
    CAPTCHA_LEN_LOWER_LIMIT: int = 5
    CAPTCHA_LEN_UPPER_LIMIT: int = 7
    RESULT_DIR: str = "Results"

    # Image dimensions - increased for better character detail
    H: int = 60  # Increased from 48 for more vertical detail
    W_max: int = 256  # Increased from 224 for more time steps (T = W_max / total_stride = 64)
    grayscale: bool = True

    # Model architecture
    total_stride: int = 4  # CNN width downsampling factor

    # Training hyperparameters
    batch_size: int = 32  # Local testing
    batch_size_t4: int = 128  # Colab T4 recommendation
    num_workers: int = 4
    amp: bool = True

    # Learning rate and optimization
    lr: float = 3e-4
    weight_decay: float = 1e-4

    # Training duration
    epochs: int = 40  # For 100k dataset
    epochs_test: int = 10  # For 1k test dataset


cfg = Config()
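
The comment on `W_max` ties the image width to the number of time steps (T=64) through `total_stride`. The sketch below checks that arithmetic and, assuming the model is trained with a CTC-style loss (the config does not say so), confirms that 64 steps are enough for the longest CAPTCHA. The constants are copied from the defaults above. The helpers `time_steps` and `min_ctc_steps`, and the choice of index 0 for the blank symbol, are hypothetical and not part of the repository.

```python
import string

# Values copied from the Config defaults; kept local so the sketch is self-contained.
CHARS = string.ascii_letters + string.digits
W_MAX = 256
TOTAL_STRIDE = 4
CAPTCHA_LEN_UPPER_LIMIT = 7


def time_steps(width: int, stride: int) -> int:
    """Sequence positions left after the CNN downsamples the width by `stride`."""
    return width // stride


def min_ctc_steps(label: str) -> int:
    """Minimum input length CTC needs for `label`.

    One step per character, plus one blank between each pair of
    adjacent identical characters.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


# Assumed vocabulary layout: index 0 is reserved for the CTC blank.
char_to_idx = {c: i + 1 for i, c in enumerate(CHARS)}
num_classes = len(CHARS) + 1

T = time_steps(W_MAX, TOTAL_STRIDE)
# Worst case for CTC: the longest label made of a single repeated character.
worst_case = min_ctc_steps("a" * CAPTCHA_LEN_UPPER_LIMIT)

assert T >= worst_case, "not enough time steps for the longest label"
print(f"T={T}, num_classes={num_classes}, worst-case CTC length={worst_case}")
```

Under these assumptions the margin is wide, so `W_max` or `total_stride` could change considerably before the longest label stops fitting.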