Pujan-Dev committed
Commit f9003ec · 1 Parent(s): 658d299
Dockerfile ADDED
@@ -0,0 +1,20 @@
+ # Use the official Python image as a base image
+ FROM python:3.11-slim
+
+ # Set the working directory in the container
+ WORKDIR /app
+
+ # Copy the requirements file into the container
+ COPY requirements.txt ./
+
+ # Install the dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the application code into the container
+ COPY . .
+
+ # Expose the port the app runs on
+ EXPOSE 7860
+
+ # Command to run the application
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
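
Assuming this Dockerfile sits at the repo root next to `requirements.txt` and `app.py`, a typical build-and-run sequence would be (the `plate-ocr` image tag is illustrative):

```shell
# Build the image from the repo root
docker build -t plate-ocr .

# Run it, mapping the exposed port to the host
docker run --rm -p 7860:7860 plate-ocr
```

The FastAPI app is then reachable at `http://localhost:7860`.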
README.md ADDED
@@ -0,0 +1,174 @@
+ # Number Plate OCR Pipeline
+
+ A complete pipeline for detecting and recognizing vehicle number plates, supporting both **Nepali** and **English** characters with multi-line plate support.
+
+ ## Features
+
+ - 🎯 **YOLO-based plate detection** - Automatically locates number plates in images
+ - 📝 **Multi-line plate support** - Handles single and multi-line plate formats
+ - 🇳🇵 **Nepali character recognition** - Supports Devanagari script (क, ख, ग, etc.)
+ - 🔤 **English character recognition** - A-Z, 0-9
+ - 🔢 **Embossed plate support** - Works with both painted and embossed plates
+ - 📊 **Confidence scoring** - Each character comes with a confidence score
+
+ ## Project Structure
+
+ ```
+ pipeline/
+ ├── __init__.py                 # Package initialization
+ ├── main.py                     # Main pipeline entry point
+ ├── config/
+ │   ├── __init__.py
+ │   └── config.py               # Configuration settings
+ ├── model/
+ │   ├── __init__.py
+ │   ├── ocr.py                  # OCR model definition
+ │   └── plate_detector.py       # YOLO plate detector
+ ├── utils/
+ │   ├── __init__.py
+ │   └── helper.py               # Helper functions
+ └── final_models/
+     └── ocr_model_em_np_eng.pth # Pre-trained OCR model
+ ```
+
+ ## Requirements
+
+ ```bash
+ pip install torch torchvision opencv-python numpy scikit-learn matplotlib ultralytics
+ ```
+
+ ## Usage
+
+ ### Command Line
+
+ ```bash
+ # Basic usage
+ python pipeline/main.py image.jpg
+
+ # Skip YOLO detection (process entire image)
+ python pipeline/main.py plate_image.jpg --no-yolo
+
+ # Save extracted characters and results
+ python pipeline/main.py image.jpg --save --output results.json
+
+ # Quiet mode (no progress messages)
+ python pipeline/main.py image.jpg -q
+ ```
+
+ ### Python API
+
+ ```python
+ from pipeline import NumberPlateOCR
+
+ # Initialize with YOLO detection
+ ocr = NumberPlateOCR(use_yolo=True)
+
+ # Process an image
+ result = ocr.process_image('car_image.jpg', show_visualization=True)
+
+ # Get the plate number
+ for plate in result['plates']:
+     print(f"Plate: {plate['singleline_text']}")
+     print(f"Confidence: {plate['confidence_stats']['mean']:.1%}")
+ ```
+
+ ### Process a pre-cropped plate image
+
+ ```python
+ import cv2
+ from pipeline import NumberPlateOCR
+
+ ocr = NumberPlateOCR(use_yolo=False)  # No need for YOLO
+
+ plate_img = cv2.imread('cropped_plate.jpg')
+ result = ocr.process_from_plate_image(plate_img)
+
+ print(result['singleline_text'])
+ ```
+
+ ## Configuration
+
+ Edit `config/config.py` to customize:
+
+ ```python
+ # OCR settings
+ OCR_CONFIG = {
+     "input_size": (128, 128),
+     "num_classes": 71,
+ }
+
+ # YOLO settings
+ YOLO_CONFIG = {
+     "confidence_threshold": 0.5,
+     "iou_threshold": 0.45,
+ }
+
+ # Character detection
+ CONTOUR_CONFIG = {
+     "min_area": 100,
+     "center_threshold": 15,  # For duplicate removal
+ }
+
+ # Inference
+ INFERENCE_CONFIG = {
+     "min_confidence": 0.10,  # Minimum OCR confidence
+ }
+ ```
+
+ ## Supported Characters
+
+ | Type | Characters |
+ |------|------------|
+ | Digits (English) | 0-9 |
+ | Digits (Nepali) | ०-९ |
+ | Letters (English) | A-Z |
+ | Nepali Text | क, को, ख, ग, च, ज, झ, ञ, डि, त, ना, प, प्र, ब, बा, भे, म, मे, य, लु, सी, सु, से, ह |
+ | Special | Nepali Flag |
+
+ ## Output Format
+
+ ```json
+ {
+   "image_path": "car.jpg",
+   "num_plates": 1,
+   "plates": [
+     {
+       "plate_index": 0,
+       "lines": ["बा २", "प ४५६७"],
+       "singleline_text": "बा २ प ४५६७",
+       "multiline_text": "बा २\nप ४५६७",
+       "num_lines": 2,
+       "total_chars": 8,
+       "confidence_stats": {
+         "mean": 0.85,
+         "min": 0.65,
+         "max": 0.98
+       }
+     }
+   ]
+ }
+ ```
+
+ ## Model Files
+
+ - `final_models/ocr_model_em_np_eng.pth` - OCR model (ResNet18-based)
+ - `../best/best.pt` - YOLO model for plate detection
+
+ ## Tips
+
+ 1. **Better accuracy**: Ensure good lighting and clear plate images
+ 2. **Multi-line plates**: The pipeline automatically detects line breaks
+ 3. **Low confidence**: Characters below 10% confidence are filtered out
+ 4. **Border removal**: Contours touching the plate edges (usually borders) are filtered out
+
+ ## Troubleshooting
+
+ **"YOLO model not found"**
+ - Ensure `best/best.pt` exists in the project root
+
+ **"OCR model not found"**
+ - Run: `cp best/ocr_model_em_np_eng.pth pipeline/final_models/`
+
+ **Low accuracy**
+ - Adjust `PREPROCESS_CONFIG["binary_threshold"]` in config
+ - Try different `CONTOUR_CONFIG["min_area"]` values
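
For scripting against a result saved with `--output results.json`, the Output Format documented above can be consumed with the standard `json` module. A minimal sketch, using the README's sample values (field names as documented, values illustrative):

```python
import json

# Sample result following the documented schema (values are illustrative)
raw = json.dumps({
    "image_path": "car.jpg",
    "num_plates": 1,
    "plates": [{
        "plate_index": 0,
        "lines": ["बा २", "प ४५६७"],
        "singleline_text": "बा २ प ४५६७",
        "num_lines": 2,
        "total_chars": 8,
        "confidence_stats": {"mean": 0.85, "min": 0.65, "max": 0.98},
    }],
})

result = json.loads(raw)
for plate in result["plates"]:
    # One summary row per detected plate
    print(f'{plate["singleline_text"]}  (mean conf {plate["confidence_stats"]["mean"]:.0%})')
```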
__init__.py ADDED
@@ -0,0 +1,47 @@
+ """
+ OCR Number Plate Pipeline
+ =========================
+
+ A complete pipeline for detecting and recognizing number plates,
+ supporting both Nepali and English characters.
+
+ Components:
+     - PlateDetector: YOLO-based number plate detection
+     - CharacterRecognizer: ResNet18-based OCR model
+     - NumberPlateOCR: Complete pipeline combining detection and recognition
+
+ Usage:
+     from pipeline import NumberPlateOCR
+
+     # Initialize pipeline
+     ocr = NumberPlateOCR(use_yolo=True)
+
+     # Process image
+     result = ocr.process_image('plate.jpg')
+     print(result['plates'][0]['singleline_text'])
+ """
+
+ from .main import NumberPlateOCR
+ from .model.ocr import CharacterRecognizer, OCRModel
+ from .model.plate_detector import PlateDetector, PlateDetectorLite, get_detector
+ from .config.config import (
+     OCR_CONFIG, YOLO_CONFIG, CONTOUR_CONFIG,
+     get_device, setup_directories
+ )
+
+ __version__ = "1.0.0"
+ __author__ = "OCR Team"
+
+ __all__ = [
+     'NumberPlateOCR',
+     'CharacterRecognizer',
+     'OCRModel',
+     'PlateDetector',
+     'PlateDetectorLite',
+     'get_detector',
+     'OCR_CONFIG',
+     'YOLO_CONFIG',
+     'CONTOUR_CONFIG',
+     'get_device',
+     'setup_directories'
+ ]
__pycache__/app.cpython-311.pyc ADDED
Binary file (4.19 kB).
 
__pycache__/main.cpython-311.pyc ADDED
Binary file (28 kB).
 
app.py ADDED
@@ -0,0 +1,99 @@
+ import fastapi
+ import uvicorn
+ from fastapi import FastAPI, UploadFile, File, HTTPException
+ from typing import Dict
+ import cv2
+ import numpy as np
+ from main import NumberPlateOCR
+
+ app = FastAPI()
+
+ # Initialize the OCR pipeline
+ pipeline = NumberPlateOCR(use_yolo=True, verbose=False)
+
+ @app.get("/")
+ def read_root():
+     return {"Hello": "World"}
+
+ @app.post("/ocr-only/")
+ async def ocr_only(file: UploadFile = File(...)) -> Dict:
+     """
+     Perform OCR on an uploaded image without YOLO detection.
+
+     Args:
+         file: Uploaded image file.
+
+     Returns:
+         JSON response with OCR results.
+     """
+     try:
+         # Read image from upload
+         contents = await file.read()
+         np_img = np.frombuffer(contents, np.uint8)
+         image = cv2.imdecode(np_img, cv2.IMREAD_COLOR)
+
+         if image is None:
+             raise HTTPException(status_code=400, detail="Invalid image file")
+
+         # Process the image directly (skip YOLO)
+         results = pipeline.process_from_plate_image(image, show_visualization=False)
+
+         return {
+             "lines": results["lines"],
+             "multiline_text": results["multiline_text"],
+             "singleline_text": results["singleline_text"],
+             "num_lines": results["num_lines"],
+             "total_chars": results["total_chars"],
+             "confidence_stats": results["confidence_stats"]
+         }
+
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+
+
+ @app.post("/yolo-ocr/")
+ async def yolo_ocr(file: UploadFile = File(...)) -> Dict:
+     """
+     Perform YOLO-based plate detection and OCR on an uploaded image.
+
+     Args:
+         file: Uploaded image file.
+
+     Returns:
+         JSON response with OCR results.
+     """
+     try:
+         # Read image from upload
+         contents = await file.read()
+         np_img = np.frombuffer(contents, np.uint8)
+         image_path = "uploaded_image.jpg"
+
+         # Save the uploaded image temporarily
+         with open(image_path, "wb") as f:
+             f.write(contents)
+
+         # Process the image using the pipeline
+         results = pipeline.process_image(image_path, save_contours=False, show_visualization=False)
+
+         # Format the response
+         response = {
+             "image_path": results["image_path"],
+             "num_plates": results["num_plates"],
+             "plates": []
+         }
+
+         for plate in results["plates"]:
+             response["plates"].append({
+                 "lines": plate["lines"],
+                 "multiline_text": plate["multiline_text"],
+                 "singleline_text": plate["singleline_text"],
+                 "num_lines": plate["num_lines"],
+                 "total_chars": plate["total_chars"],
+                 "confidence_stats": plate["confidence_stats"]
+             })
+
+         return response
+
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
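
With the server running (e.g. via the Dockerfile's `uvicorn` command on port 7860), both endpoints above accept a multipart file upload named `file`; the image file names below are illustrative:

```shell
# OCR on a pre-cropped plate image (no YOLO detection)
curl -F "file=@cropped_plate.jpg" http://localhost:7860/ocr-only/

# Full YOLO detection + OCR on a vehicle photo
curl -F "file=@car.jpg" http://localhost:7860/yolo-ocr/
```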
config/__init__.py ADDED
@@ -0,0 +1,18 @@
+ """Configuration module for OCR pipeline."""
+ from .config import (
+     BASE_DIR, PROJECT_ROOT,
+     YOLO_MODEL_PATH, OCR_MODEL_PATH, LABEL_MAP_PATH,
+     OUTPUT_DIR, CONTOURS_DIR, CONTOURS_BW_DIR, RESULTS_DIR,
+     OCR_CONFIG, YOLO_CONFIG, PREPROCESS_CONFIG,
+     CONTOUR_CONFIG, LINE_CONFIG, INFERENCE_CONFIG, VIZ_CONFIG,
+     get_device, setup_directories
+ )
+
+ __all__ = [
+     'BASE_DIR', 'PROJECT_ROOT',
+     'YOLO_MODEL_PATH', 'OCR_MODEL_PATH', 'LABEL_MAP_PATH',
+     'OUTPUT_DIR', 'CONTOURS_DIR', 'CONTOURS_BW_DIR', 'RESULTS_DIR',
+     'OCR_CONFIG', 'YOLO_CONFIG', 'PREPROCESS_CONFIG',
+     'CONTOUR_CONFIG', 'LINE_CONFIG', 'INFERENCE_CONFIG', 'VIZ_CONFIG',
+     'get_device', 'setup_directories'
+ ]
config/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (907 Bytes).
 
config/__pycache__/config.cpython-311.pyc ADDED
Binary file (3.3 kB).
 
config/config.py ADDED
@@ -0,0 +1,102 @@
+ """
+ Configuration file for the OCR Number Plate Pipeline
+ """
+ import os
+ from pathlib import Path
+
+ # ============== PATHS ==============
+ BASE_DIR = Path(__file__).parent.parent
+ PROJECT_ROOT = BASE_DIR.parent
+
+ # Model paths
+ YOLO_MODEL_PATH = PROJECT_ROOT / "final" / "final_models" / "best12345.pt"
+ OCR_MODEL_PATH = PROJECT_ROOT / "final" / "final_models" / "ocr_model.pth"
+ LABEL_MAP_PATH = PROJECT_ROOT / "label_map.json"
+
+ # Output paths
+ OUTPUT_DIR = PROJECT_ROOT / "output"
+ CONTOURS_DIR = OUTPUT_DIR / "contours"
+ CONTOURS_BW_DIR = OUTPUT_DIR / "contours_bw"
+ RESULTS_DIR = OUTPUT_DIR / "results"
+
+ # ============== OCR MODEL CONFIG ==============
+ OCR_CONFIG = {
+     "input_size": (128, 128),
+     "num_classes": 71,  # 0-9, A-Z, Nepali chars, Nepali Flag
+     "backbone": "resnet18",
+     "pretrained": False,
+ }
+
+ # ============== YOLO CONFIG ==============
+ YOLO_CONFIG = {
+     "confidence_threshold": 0.5,
+     "iou_threshold": 0.45,
+     "img_size": 640,
+     "device": "auto",  # "auto", "cuda", "cpu"
+ }
+
+ # ============== PREPROCESSING CONFIG ==============
+ PREPROCESS_CONFIG = {
+     "clahe_clip_limit": 2.0,
+     "clahe_grid_size": (8, 8),
+     "gaussian_blur_kernel": (3, 3),
+     "binary_threshold": 180,
+     "otsu_threshold": True,
+ }
+
+ # ============== CONTOUR DETECTION CONFIG ==============
+ CONTOUR_CONFIG = {
+     "min_area": 60,
+     "min_width": 3,
+     "min_height": 2,
+     "min_aspect_ratio": 0.08,
+     "max_aspect_ratio": 4.0,
+     "max_width_ratio": 0.55,
+     "max_height_ratio": 0.95,
+     "min_height_rel_median": 0.45,
+     "max_height_rel_median": 2.2,
+     "min_width_rel_median": 0.35,
+     "max_width_rel_median": 2.4,
+     "min_area_rel_median": 0.20,
+     "max_area_rel_median": 3.8,
+     "padding": 1,
+     "center_threshold": 15,  # For removing overlapping contours
+     "skip_borders": 0,  # Legacy key (no blind skipping)
+     "remove_edge_artifacts": True,
+     "edge_margin": 2,
+ }
+
+ # ============== LINE GROUPING CONFIG ==============
+ LINE_CONFIG = {
+     "y_threshold_ratio": 0.5,  # Ratio of average height for line grouping
+     "default_y_threshold": 20,
+ }
+
+ # ============== OCR INFERENCE CONFIG ==============
+ INFERENCE_CONFIG = {
+     "min_confidence": 0.10,  # 10% minimum confidence
+     "batch_size": 16,
+ }
+
+ # ============== VISUALIZATION CONFIG ==============
+ VIZ_CONFIG = {
+     "show_plots": True,
+     "save_results": True,
+     "figure_size": (12, 6),
+     "font_size": 9,
+     "max_cols": 10,
+ }
+
+ # ============== DEVICE CONFIGURATION ==============
+ def get_device():
+     """Auto-detect the best available device."""
+     import torch
+     if YOLO_CONFIG["device"] == "auto":
+         return torch.device("cuda" if torch.cuda.is_available() else "cpu")
+     return torch.device(YOLO_CONFIG["device"])
+
+ # ============== CREATE DIRECTORIES ==============
+ def setup_directories():
+     """Create necessary output directories."""
+     for dir_path in [OUTPUT_DIR, CONTOURS_DIR, CONTOURS_BW_DIR, RESULTS_DIR]:
+         dir_path.mkdir(parents=True, exist_ok=True)
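
`LINE_CONFIG["y_threshold_ratio"]` above controls how character boxes are split into lines: boxes whose vertical centers fall within that fraction of the average character height are treated as one line. The project's actual helper lives in `utils/helper.py`; the idea can be sketched as follows (function name and box format here are illustrative, not the real API):

```python
def group_by_line(boxes, y_threshold_ratio=0.5):
    """Group character boxes (dicts with x, y, h) into text lines.

    A box joins an existing line when its vertical center is within
    y_threshold_ratio * average box height of that line's first box.
    """
    if not boxes:
        return []
    avg_h = sum(b["h"] for b in boxes) / len(boxes)
    threshold = y_threshold_ratio * avg_h
    lines = []
    for b in sorted(boxes, key=lambda b: b["y"]):
        cy = b["y"] + b["h"] / 2
        for line in lines:
            ref = line[0]
            if abs(cy - (ref["y"] + ref["h"] / 2)) <= threshold:
                line.append(b)
                break
        else:
            lines.append([b])
    # Reading order: lines top-to-bottom, characters left-to-right
    return [sorted(line, key=lambda b: b["x"]) for line in lines]

# Two boxes near the top row plus one lower box -> two lines
demo = group_by_line([
    {"x": 0, "y": 0, "h": 10},
    {"x": 20, "y": 2, "h": 10},
    {"x": 5, "y": 30, "h": 10},
])
print([[b["x"] for b in line] for line in demo])  # → [[0, 20], [5]]
```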
demo.py ADDED
@@ -0,0 +1,164 @@
+ #!/usr/bin/env python3
+ """
+ Demo script for the Number Plate OCR Pipeline.
+
+ This script demonstrates:
+ 1. Full pipeline with YOLO detection
+ 2. Processing pre-cropped plate images
+ 3. Batch processing multiple images
+ """
+
+ import cv2
+ import os
+ import sys
+ from pathlib import Path
+
+ # Add pipeline to path
+ sys.path.insert(0, str(Path(__file__).parent))
+
+ from main import NumberPlateOCR
+
+
+ def demo_full_pipeline(image_path: str):
+     """Demo: Full pipeline with YOLO plate detection."""
+     print("\n" + "="*60)
+     print("DEMO 1: Full Pipeline with YOLO Detection")
+     print("="*60)
+
+     # Initialize with YOLO
+     pipeline = NumberPlateOCR(use_yolo=True, verbose=True)
+
+     # Process image
+     result = pipeline.process_image(
+         image_path,
+         save_contours=True,
+         show_visualization=True
+     )
+
+     return result
+
+
+ def demo_cropped_plate(plate_image_path: str):
+     """Demo: Process a pre-cropped plate image."""
+     print("\n" + "="*60)
+     print("DEMO 2: Pre-cropped Plate Image (No YOLO)")
+     print("="*60)
+
+     # Initialize without YOLO
+     pipeline = NumberPlateOCR(use_yolo=False, verbose=True)
+
+     # Load cropped plate
+     plate_img = cv2.imread(plate_image_path)
+
+     if plate_img is None:
+         print(f"Error: Could not load image: {plate_image_path}")
+         return None
+
+     # Process plate directly
+     result = pipeline.process_from_plate_image(
+         plate_img,
+         show_visualization=True
+     )
+
+     print(f"\nRecognized text: {result['singleline_text']}")
+     return result
+
+
+ def demo_batch_processing(image_dir: str):
+     """Demo: Batch process multiple images."""
+     print("\n" + "="*60)
+     print("DEMO 3: Batch Processing")
+     print("="*60)
+
+     # Get all images
+     image_extensions = {'.jpg', '.jpeg', '.png', '.bmp'}
+     images = [f for f in Path(image_dir).iterdir()
+               if f.suffix.lower() in image_extensions]
+
+     if not images:
+         print(f"No images found in: {image_dir}")
+         return []
+
+     print(f"Found {len(images)} images")
+
+     # Initialize pipeline once
+     pipeline = NumberPlateOCR(use_yolo=True, verbose=False)
+
+     results = []
+     for img_path in images:
+         print(f"\nProcessing: {img_path.name}")
+         try:
+             result = pipeline.process_image(
+                 str(img_path),
+                 show_visualization=False
+             )
+
+             for plate in result['plates']:
+                 print(f"  → {plate['singleline_text']}")
+
+             results.append(result)
+         except Exception as e:
+             print(f"  Error: {e}")
+
+     return results
+
+
+ def demo_api_usage():
+     """Demo: Show various API usage patterns."""
+     print("\n" + "="*60)
+     print("DEMO 4: API Usage Examples")
+     print("="*60)
+
+     print("""
+     # Example 1: Basic usage
+     from pipeline import NumberPlateOCR
+     ocr = NumberPlateOCR()
+     result = ocr.process_image('car.jpg')
+     print(result['plates'][0]['singleline_text'])
+
+     # Example 2: Just the OCR model
+     from pipeline import CharacterRecognizer
+     rec = CharacterRecognizer(
+         model_path='pipeline/final_models/ocr_model_em_np_eng.pth',
+         label_map_path='label_map.json'
+     )
+     char, conf, img = rec.predict(char_image)
+     print(f"{char}: {conf:.1%}")
+
+     # Example 3: Just the plate detector
+     from pipeline import PlateDetector
+     detector = PlateDetector()
+     detections = detector.detect('car.jpg')
+     for det in detections:
+         print(f"Plate at {det['bbox']} with {det['confidence']:.1%} confidence")
+
+     # Example 4: Top-k predictions
+     predictions = rec.get_top_k_predictions(char_image, k=5)
+     for char, conf in predictions:
+         print(f"  {char}: {conf:.1%}")
+     """)
+
+
+ if __name__ == "__main__":
+     print("="*60)
+     print("Number Plate OCR Pipeline - Demo")
+     print("="*60)
+
+     # Check for test images
+     project_root = Path(__file__).parent.parent
+
+     # Look for test images
+     test_images = list(project_root.glob("*.png")) + list(project_root.glob("*.jpg"))
+
+     if test_images:
+         test_image = str(test_images[0])
+         print(f"\nUsing test image: {test_image}")
+
+         # Run demo 1
+         demo_full_pipeline(test_image)
+     else:
+         print("\nNo test images found. Please provide an image path.")
+         print("Usage: python demo.py <image_path>")
+
+     # Show API examples
+     demo_api_usage()
final_models/best12345.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:99e5a27144d4768b13e63b553253dab37db83cf5b8e08576d3a97bcb40047601
+ size 5362053
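
The `.pt` file above is committed as a Git LFS pointer, not the weights themselves. After cloning, fetching the actual ~5 MB model requires LFS:

```shell
# One-time LFS setup, then fetch the real model weights
git lfs install
git lfs pull
```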
main.py ADDED
@@ -0,0 +1,542 @@
+ #!/usr/bin/env python3
+ """
+ ============== COMPLETE OCR PIPELINE (Multi-Line Support) ==============
+
+ This pipeline combines:
+ 1. YOLO-based number plate detection
+ 2. Character segmentation using contour detection
+ 3. OCR using a ResNet18-based model
+ 4. Multi-line plate support (for Nepali plates)
+
+ Usage:
+     python main.py <image_path>
+     python main.py <image_path> --no-yolo   # Skip YOLO detection
+     python main.py <image_path> --save      # Save results
+ """
+
+ import cv2
+ import numpy as np
+ import matplotlib.pyplot as plt
+ import argparse
+ import os
+ from pathlib import Path
+ from typing import List, Dict, Optional, Tuple
+ import json
+
+ # Local imports
+ from config.config import (
+     CONTOUR_CONFIG, INFERENCE_CONFIG, VIZ_CONFIG,
+     OCR_MODEL_PATH, LABEL_MAP_PATH, YOLO_MODEL_PATH,
+     setup_directories, get_device, RESULTS_DIR, CONTOURS_BW_DIR
+ )
+ from model.ocr import CharacterRecognizer
+ from model.plate_detector import get_detector
+ from utils.helper import (
+     detect_contours, filter_contours_by_size, extract_roi,
+     convert_to_binary, remove_overlapping_centers,
+     group_contours_by_line, format_plate_number,
+     draw_detections, calculate_confidence_stats, save_contour_images
+ )
+
+
+ class NumberPlateOCR:
+     """
+     Complete Number Plate OCR Pipeline.
+
+     Supports:
+     - YOLO-based plate detection (optional)
+     - Multi-line plate recognition
+     - Nepali and English characters
+     - Embossed number plates
+     """
+
+     def __init__(self, use_yolo: bool = True, verbose: bool = True):
+         """
+         Initialize the OCR pipeline.
+
+         Args:
+             use_yolo: Whether to use YOLO for plate detection
+             verbose: Print progress messages
+         """
+         self.verbose = verbose
+         self.device = get_device()
+
+         # Setup directories
+         setup_directories()
+
+         # Initialize OCR model
+         self._log("Loading OCR model...")
+         self.ocr = CharacterRecognizer(
+             model_path=str(OCR_MODEL_PATH),
+             label_map_path=str(LABEL_MAP_PATH),
+             device=self.device
+         )
+
+         # Initialize plate detector
+         self.use_yolo = use_yolo
+         if use_yolo:
+             self._log("Loading YOLO plate detector...")
+             self.detector = get_detector(use_yolo=True, model_path=str(YOLO_MODEL_PATH))
+         else:
+             self.detector = None
+
+         self._log("✓ Pipeline initialized successfully!")
+
+     @staticmethod
+     def _is_nepali_token(token: str) -> bool:
+         """Check if token is Nepali (Devanagari) or Nepali-specific label."""
+         if not token:
+             return False
+         if token == "Nepali Flag":
+             return True
+         return any('\u0900' <= ch <= '\u097F' for ch in token)
+
+     @staticmethod
+     def _is_english_token(token: str) -> bool:
+         """Check if token is plain English alphanumeric."""
+         if not token:
+             return False
+         return all(('0' <= ch <= '9') or ('A' <= ch <= 'Z') or ('a' <= ch <= 'z') for ch in token)
+
+     @staticmethod
+     def _english_digit_to_nepali(token: str) -> str:
+         """Convert English digits to Nepali digits (keeps non-digits unchanged)."""
+         digit_map = str.maketrans("0123456789", "०१२३४५६७८९")
+         return token.translate(digit_map)
+
+     def _apply_nepali_dominant_correction(self, line_results: List[Dict]):
+         """
+         If a line is predominantly Nepali, replace English predictions using
+         next Nepali top-k prediction from OCR model.
+         """
+         if not line_results:
+             return
+
+         nepali_count = sum(1 for r in line_results if self._is_nepali_token(r['char']))
+         english_count = sum(1 for r in line_results if self._is_english_token(r['char']))
+
+         if nepali_count <= english_count:
+             return
+
+         for r in line_results:
+             curr_char = r['char']
+             if not self._is_english_token(curr_char):
+                 continue
+
+             replacement_char = None
+             replacement_conf = None
+
+             top_k = self.ocr.get_top_k_predictions(r['_roi_bw'], k=5)
+             for candidate_char, candidate_conf in top_k[1:]:
+                 if self._is_nepali_token(candidate_char):
+                     replacement_char = candidate_char
+                     replacement_conf = candidate_conf
+                     break
+
+             if replacement_char is None and any(ch.isdigit() for ch in curr_char):
+                 replacement_char = self._english_digit_to_nepali(curr_char)
+                 replacement_conf = r['conf']
+
+             if replacement_char is not None:
+                 r['char'] = replacement_char
+                 r['conf'] = float(replacement_conf)
+
+     def _log(self, message: str):
+         """Print log message if verbose."""
+         if self.verbose:
+             print(message)
+
+     def process_image(self, image_path: str,
+                       save_contours: bool = False,
+                       show_visualization: bool = True) -> Dict:
+         """
+         Process an image and extract plate number.
+
+         Args:
+             image_path: Path to input image
+             save_contours: Whether to save extracted character images
+             show_visualization: Whether to display matplotlib visualizations
+
+         Returns:
+             Dict with recognition results
+         """
+         # Load image
+         self._log(f"\n{'='*60}")
+         self._log(f"Processing: {image_path}")
+         self._log(f"{'='*60}")
+
+         orig_image = cv2.imread(image_path)
+         gray_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
+
+         if orig_image is None:
+             raise ValueError(f"Could not load image: {image_path}")
+
+         # Step 1: Detect plates (optional YOLO step)
+         if self.use_yolo and self.detector:
+             self._log("\n📍 Step 1: Detecting number plates with YOLO...")
+             plates = self._detect_plates(orig_image)
+
+             if not plates:
+                 self._log("⚠ No plates detected by YOLO, processing full image...")
+                 plates = [{'plate_image': orig_image, 'bbox': None, 'confidence': 1.0}]
+         else:
+             self._log("\n📍 Step 1: Using full image (YOLO disabled)...")
+             plates = [{'plate_image': orig_image, 'bbox': None, 'confidence': 1.0}]
+
+         # Process each detected plate
+         all_results = []
+         for plate_idx, plate_data in enumerate(plates):
+             self._log(f"\n📋 Processing Plate {plate_idx + 1}/{len(plates)}")
+
+             plate_img = plate_data['plate_image']
+             plate_gray = cv2.cvtColor(plate_img, cv2.COLOR_BGR2GRAY) if len(plate_img.shape) == 3 else plate_img
+
+             # Step 2: Extract character contours
+             self._log("📍 Step 2: Detecting character contours...")
+             contours = self._extract_contours(plate_gray, plate_img)
+
+             if not contours:
+                 self._log("⚠ No characters detected in plate")
+                 continue
+
+             # Save contours if requested
+             if save_contours:
+                 self._log(f"  Saving contour images to {CONTOURS_BW_DIR}")
+                 save_contour_images(contours, plate_img, str(CONTOURS_BW_DIR))
+
+             # Step 3: Group by lines
+             self._log("📍 Step 3: Grouping characters by lines...")
+             lines = group_contours_by_line(contours)
+             self._log(f"  Detected {len(lines)} line(s)")
+             for i, line in enumerate(lines):
+                 self._log(f"    Line {i+1}: {len(line)} characters")
+
+             # Step 4: Run OCR
+             self._log("📍 Step 4: Running OCR on characters...")
+             ocr_results = self._run_ocr(lines, plate_img)
+
+             # Step 5: Format results
+             formatted = format_plate_number(lines, ocr_results)
+             confidence_stats = calculate_confidence_stats(ocr_results)
+
+             result = {
+                 'plate_index': plate_idx,
+                 'plate_bbox': plate_data['bbox'],
+                 'plate_confidence': plate_data.get('confidence', 1.0),
+                 'plate_image': plate_img,
+                 'lines': formatted['lines'],
+                 'multiline_text': formatted['multiline'],
+                 'singleline_text': formatted['singleline'],
+                 'num_lines': formatted['num_lines'],
+                 'total_chars': formatted['total_chars'],
+                 'details': formatted['details'],
+                 'confidence_stats': confidence_stats,
+                 'raw_ocr_results': ocr_results
+             }
+             all_results.append(result)
+
+             # Visualize
+             if show_visualization:
+                 self._visualize_plate(plate_img, lines, ocr_results, plate_idx)
+
+         # Print final summary
+         self._print_results(all_results)
+
+         return {
+             'image_path': image_path,
+             'num_plates': len(all_results),
+             'plates': all_results
+         }
+
+     def _detect_plates(self, image: np.ndarray) -> List[Dict]:
+         """Detect plates using YOLO."""
+         detections = self.detector.detect(image)
+
+         self._log(f"  Found {len(detections)} plate(s)")
+         for i, det in enumerate(detections):
+             self._log(f"    Plate {i+1}: confidence={det['confidence']:.2%}")
+
+         return detections
+
+     def _extract_contours(self, gray_image: np.ndarray,
+                           color_image: np.ndarray) -> List[Dict]:
+         """Extract and filter character contours."""
+
+         # Detect contours
+         contours, hierarchy, thresh = detect_contours(gray_image)
+         self._log(f"  Total contours found: {len(contours)}")
+
+         # Filter by size
+         filtered = filter_contours_by_size(contours, gray_image.shape)
+         self._log(f"  After size filter: {len(filtered)}")
+
+         # Sort by x position
+         sorted_contours = sorted(filtered, key=lambda c: (c['x'], c['y']))
+
+         # Remove only true edge artifacts (do not blindly drop first contours)
+         remove_edge_artifacts = CONTOUR_CONFIG.get("remove_edge_artifacts", True)
+         edge_margin = CONTOUR_CONFIG.get("edge_margin", 2)
+         if remove_edge_artifacts and len(sorted_contours) > 4:
+             image_h, image_w = gray_image.shape[:2]
+             non_edge_contours = [
+                 c for c in sorted_contours
+                 if (
+                     c['x'] > edge_margin and
+                     c['y'] > edge_margin and
+                     (c['x'] + c['w']) < (image_w - edge_margin) and
+                     (c['y'] + c['h']) < (image_h - edge_margin)
+                 )
+             ]
+
+             # Keep edge filtering only if it does not remove too many candidates
+             if len(non_edge_contours) >= max(3, int(0.6 * len(sorted_contours))):
+                 sorted_contours = non_edge_contours
+                 self._log(f"  After edge-artifact filter: {len(sorted_contours)}")
+
+         # Extract ROI for each contour
+         for c in sorted_contours:
+             roi = extract_roi(color_image, c)
+             c['roi_bw'] = convert_to_binary(roi)
+
+         # Remove overlapping centers (like inner hole of '0')
+         final_contours = remove_overlapping_centers(sorted_contours, verbose=self.verbose)
+         removed = len(sorted_contours) - len(final_contours)
+         if removed > 0:
+             self._log(f"  Removed {removed} overlapping contours")
+
+         return final_contours
+
+     def _run_ocr(self, lines: List[List[Dict]],
+                  plate_image: np.ndarray) -> List[List[Dict]]:
+         """Run OCR on grouped character lines."""
+
+         min_confidence = INFERENCE_CONFIG["min_confidence"]
+         results_by_line = []
+
+         for line_idx, line in enumerate(lines):
+             line_results = []
+
+             for c in line:
+                 char, conf, processed_img = self.ocr.predict(c['roi_bw'])
+
+                 if conf > min_confidence:
+                     line_results.append({
+                         'char': char,
+                         'conf': conf,
+                         'x': c['x'],
+                         'y': c['y'],
+                         'w': c['w'],
+                         'h': c['h'],
+                         'processed_img': processed_img,
+                         '_roi_bw': c['roi_bw']
+                     })
+
+             self._apply_nepali_dominant_correction(line_results)
+
+             for r in line_results:
+                 r.pop('_roi_bw', None)
+
+             results_by_line.append(line_results)
+
+         total_chars = sum(len(line) for line in results_by_line)
+         self._log(f"  Characters with confidence > {min_confidence*100:.0f}%: {total_chars}")
+
+         return results_by_line
+
+     def _visualize_plate(self, plate_image: np.ndarray,
+                          lines: List[List[Dict]],
+                          ocr_results: List[List[Dict]],
+                          plate_idx: int):
+         """Visualize OCR results."""
+
+         if not VIZ_CONFIG["show_plots"]:
+             return
+
+         # Show original plate
+         plt.figure(figsize=VIZ_CONFIG["figure_size"])
+         plt.imshow(cv2.cvtColor(plate_image, cv2.COLOR_BGR2RGB))
+         plt.title(f'Plate {plate_idx + 1} - {len(lines)} Line(s) Detected')
+         plt.axis('off')
+         plt.show()
+
+         # Show OCR results for each line
+         for line_idx, line_results in enumerate(ocr_results):
+             n = len(line_results)
+             if n > 0:
+                 cols = min(VIZ_CONFIG["max_cols"], n)
+                 rows = (n + cols - 1) // cols
+
+                 fig, axes = plt.subplots(rows, cols, figsize=(cols*1.5, rows*2))
+                 axes = np.array(axes).reshape(-1) if n > 1 else [axes]
+
+                 for i, r in enumerate(line_results):
+                     axes[i].imshow(r['processed_img'], cmap='gray')
+                     axes[i].set_title(f'"{r["char"]}" ({r["conf"]:.0%})',
+                                       fontsize=VIZ_CONFIG["font_size"])
376
+ axes[i].axis('off')
377
+
378
+ # Hide empty subplots
379
+ for i in range(n, len(axes)):
380
+ axes[i].axis('off')
381
+
382
+ line_text = "".join([r['char'] for r in line_results])
383
+ plt.suptitle(f'Line {line_idx+1}: "{line_text}"', fontsize=12)
384
+ plt.tight_layout()
385
+ plt.show()
386
+
387
+ def _print_results(self, results: List[Dict]):
388
+ """Print formatted results."""
389
+
390
+ print("\n" + "="*60)
391
+ print("📋 PLATE NUMBER RECOGNITION RESULTS")
392
+ print("="*60)
393
+
394
+ for result in results:
395
+ plate_idx = result['plate_index'] + 1
396
+
397
+ print(f"\n🏷️ PLATE {plate_idx}:")
398
+ print("-"*40)
399
+
400
+ for line_detail in result['details']:
401
+ print(f"\n 📌 Line {line_detail['line_num']}:")
402
+ for i, char_info in enumerate(line_detail['characters']):
403
+ print(f" {i+1}. '{char_info['char']}' ({char_info['conf']:.1%})")
404
+ print(f" → Result: {line_detail['text']}")
405
+
406
+ # Final result
407
+ print("\n" + "-"*40)
408
+ if result['num_lines'] > 1:
409
+ print(" Multi-line format:")
410
+ for i, line in enumerate(result['lines']):
411
+ print(f" Line {i+1}: {line}")
412
+ print(f"\n Single-line: {result['singleline_text']}")
413
+ else:
414
+ text = result['lines'][0] if result['lines'] else 'No characters detected'
415
+ print(f" Result: {text}")
416
+
417
+ # Confidence stats
418
+ stats = result['confidence_stats']
419
+ print(f"\n Confidence: avg={stats['mean']:.1%}, min={stats['min']:.1%}, max={stats['max']:.1%}")
420
+
421
+ print("\n" + "="*60)
422
+
423
+ def process_from_plate_image(self, plate_image: np.ndarray,
424
+ show_visualization: bool = True) -> Dict:
425
+ """
426
+ Process a pre-cropped plate image (skip YOLO detection).
427
+
428
+ Args:
429
+ plate_image: Cropped plate image (BGR)
430
+ show_visualization: Whether to show plots
431
+
432
+ Returns:
433
+ Recognition result dict
434
+ """
435
+ plate_gray = cv2.cvtColor(plate_image, cv2.COLOR_BGR2GRAY) if len(plate_image.shape) == 3 else plate_image
436
+
437
+ # Extract contours
438
+ contours = self._extract_contours(plate_gray, plate_image)
439
+
440
+ if not contours:
441
+ return {'lines': [], 'singleline_text': '', 'total_chars': 0}
442
+
443
+ # Group by lines
444
+ lines = group_contours_by_line(contours)
445
+
446
+ # Run OCR
447
+ ocr_results = self._run_ocr(lines, plate_image)
448
+
449
+ # Format results
450
+ formatted = format_plate_number(lines, ocr_results)
451
+
452
+ if show_visualization:
453
+ self._visualize_plate(plate_image, lines, ocr_results, 0)
454
+
455
+ return {
456
+ 'lines': formatted['lines'],
457
+ 'multiline_text': formatted['multiline'],
458
+ 'singleline_text': formatted['singleline'],
459
+ 'num_lines': formatted['num_lines'],
460
+ 'total_chars': formatted['total_chars'],
461
+ 'details': formatted['details'],
462
+ 'confidence_stats': calculate_confidence_stats(ocr_results)
463
+ }
464
+
465
+
466
+ def main():
467
+ """Main entry point."""
468
+ parser = argparse.ArgumentParser(
469
+ description="Number Plate OCR Pipeline",
470
+ formatter_class=argparse.RawDescriptionHelpFormatter,
471
+ epilog="""
472
+ Examples:
473
+ python main.py image.jpg
474
+ python main.py image.jpg --no-yolo
475
+ python main.py image.jpg --save --no-viz
476
+ python main.py image.jpg --output results.json
477
+ """
478
+ )
479
+
480
+ parser.add_argument('image', type=str, help='Path to input image')
481
+ parser.add_argument('--no-yolo', action='store_true',
482
+ help='Skip YOLO plate detection')
483
+ parser.add_argument('--save', action='store_true',
484
+ help='Save extracted character images')
485
+ parser.add_argument('--no-viz', action='store_true',
486
+ help='Disable visualization')
487
+ parser.add_argument('--output', '-o', type=str,
488
+ help='Save results to JSON file')
489
+ parser.add_argument('--quiet', '-q', action='store_true',
490
+ help='Suppress progress messages')
491
+
492
+ args = parser.parse_args()
493
+
494
+ # Validate input
495
+ if not os.path.exists(args.image):
496
+ print(f"Error: Image not found: {args.image}")
497
+ return 1
498
+
499
+ # Initialize pipeline
500
+ pipeline = NumberPlateOCR(
501
+ use_yolo=not args.no_yolo,
502
+ verbose=not args.quiet
503
+ )
504
+
505
+ # Process image
506
+ results = pipeline.process_image(
507
+ args.image,
508
+ save_contours=args.save,
509
+ show_visualization=not args.no_viz
510
+ )
511
+
512
+ # Save results if requested
513
+ if args.output:
514
+ # Remove non-serializable items
515
+ save_results = {
516
+ 'image_path': results['image_path'],
517
+ 'num_plates': results['num_plates'],
518
+ 'plates': []
519
+ }
520
+
521
+ for plate in results['plates']:
522
+ save_plate = {
523
+ 'plate_index': plate['plate_index'],
524
+ 'plate_bbox': plate['plate_bbox'],
525
+ 'lines': plate['lines'],
526
+ 'multiline_text': plate['multiline_text'],
527
+ 'singleline_text': plate['singleline_text'],
528
+ 'num_lines': plate['num_lines'],
529
+ 'total_chars': plate['total_chars'],
530
+ 'confidence_stats': plate['confidence_stats']
531
+ }
532
+ save_results['plates'].append(save_plate)
533
+
534
+ with open(args.output, 'w', encoding='utf-8') as f:
535
+ json.dump(save_results, f, indent=2, ensure_ascii=False)
536
+ print(f"\n✓ Results saved to: {args.output}")
537
+
538
+ return 0
539
+
540
+
541
+ if __name__ == "__main__":
542
+ exit(main())
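The `--output` branch in `main()` copies only JSON-serializable fields out of each plate result before dumping, leaving numpy images behind. A minimal standalone sketch of that whitelist-style filtering on plain dicts (the helper name `to_json_safe` and the sample values are hypothetical; the key names mirror the code above):

```python
import json

# Whitelist of keys that are safe to serialize (no numpy arrays).
SAFE_KEYS = ('plate_index', 'lines', 'singleline_text', 'num_lines', 'total_chars')

def to_json_safe(plate: dict) -> dict:
    """Return a copy of `plate` restricted to JSON-serializable fields."""
    return {k: plate[k] for k in SAFE_KEYS if k in plate}

plate = {
    'plate_index': 0,
    'lines': ['BA 2 PA 1234'],
    'singleline_text': 'BA 2 PA 1234',
    'num_lines': 1,
    'total_chars': 10,
    'plate_image': object(),  # stands in for a numpy image; not JSON-serializable
}

safe = to_json_safe(plate)
print(json.dumps(safe, ensure_ascii=False))
```

`ensure_ascii=False` matters here for the same reason as in `main()`: Devanagari characters survive in the output file instead of being escaped.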
model/__init__.py ADDED
@@ -0,0 +1,5 @@
+ """Model modules for OCR pipeline."""
+ from .ocr import OCRModel, CharacterRecognizer
+ from .plate_detector import PlateDetector, PlateDetectorLite, get_detector
+
+ __all__ = ['OCRModel', 'CharacterRecognizer', 'PlateDetector', 'PlateDetectorLite', 'get_detector']
model/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (496 Bytes). View file
 
model/__pycache__/ocr.cpython-311.pyc ADDED
Binary file (11.9 kB). View file
 
model/__pycache__/plate_detector.cpython-311.pyc ADDED
Binary file (13.9 kB). View file
 
model/ocr.py ADDED
@@ -0,0 +1,202 @@
+ """
+ OCR Model Definition and Inference for Number Plate Character Recognition
+ """
+ import torch
+ import torch.nn as nn
+ import numpy as np
+ from torchvision import models
+ import json
+ from sklearn.preprocessing import LabelEncoder
+ from pathlib import Path
+ import cv2
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import OCR_CONFIG, PREPROCESS_CONFIG, get_device
+
+
+ class OCRModel(nn.Module):
+     """
+     ResNet18-based OCR model for character recognition.
+     Supports grayscale input images.
+     """
+     def __init__(self, num_classes: int):
+         super(OCRModel, self).__init__()
+
+         # Use ResNet18 as backbone
+         self.features = models.resnet18(pretrained=OCR_CONFIG.get("pretrained", False))
+
+         # Modify first conv layer to accept single channel (grayscale)
+         self.features.conv1 = nn.Conv2d(
+             1, 64, kernel_size=7, stride=2, padding=3, bias=False
+         )
+
+         # Remove the original FC layer
+         self.features.fc = nn.Identity()
+
+         # Custom classifier head
+         self.classifier = nn.Sequential(
+             nn.Linear(512, 256),
+             nn.ReLU(inplace=True),
+             nn.Dropout(0.5),
+             nn.Linear(256, num_classes)
+         )
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         features = self.features(x)
+         return self.classifier(features)
+
+
+ class CharacterRecognizer:
+     """
+     High-level wrapper for character recognition.
+     Handles model loading, preprocessing, and inference.
+     """
+     def __init__(self, model_path: str, label_map_path: str, device: torch.device = None):
+         self.device = device or get_device()
+         self.model_path = Path(model_path)
+         self.label_map_path = Path(label_map_path)
+
+         # Load label map
+         self._load_label_map()
+
+         # Initialize and load model
+         self._load_model()
+
+         # Setup CLAHE
+         self.clahe = cv2.createCLAHE(
+             clipLimit=PREPROCESS_CONFIG["clahe_clip_limit"],
+             tileGridSize=PREPROCESS_CONFIG["clahe_grid_size"]
+         )
+
+     def _load_label_map(self):
+         """Load label map from JSON file."""
+         with open(self.label_map_path, 'r', encoding='utf-8') as f:
+             self.label_map = json.load(f)
+
+         self.num_classes = len(self.label_map)
+
+         # Setup label encoder
+         self.label_encoder = LabelEncoder()
+         self.label_encoder.classes_ = np.array([
+             self.label_map[str(i)] for i in range(self.num_classes)
+         ])
+
+     def _load_model(self):
+         """Load trained model weights."""
+         self.model = OCRModel(self.num_classes).to(self.device)
+         self.model.load_state_dict(
+             torch.load(self.model_path, map_location=self.device)
+         )
+         self.model.eval()
+         print(f"✓ OCR Model loaded on: {self.device}")
+
+     def preprocess(self, img_region: np.ndarray) -> tuple:
+         """
+         Preprocess image region for OCR.
+
+         Args:
+             img_region: Grayscale image region (numpy array)
+
+         Returns:
+             Tuple of (tensor, preprocessed_image)
+         """
+         input_size = OCR_CONFIG["input_size"]
+
+         # Resize to model input size
+         img_resized = cv2.resize(img_region, input_size)
+
+         # Apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
+         img_eq = self.clahe.apply(img_resized)
+
+         # Apply Gaussian blur to reduce noise
+         img_blur = cv2.GaussianBlur(
+             img_eq, PREPROCESS_CONFIG["gaussian_blur_kernel"], 0
+         )
+
+         # Convert to tensor and normalize
+         img_tensor = torch.from_numpy(img_blur).unsqueeze(0).unsqueeze(0).float() / 255.0
+         img_tensor = img_tensor.to(self.device)
+
+         return img_tensor, img_blur
+
+     def predict(self, img_region: np.ndarray) -> tuple:
+         """
+         Perform OCR on a single image region.
+
+         Args:
+             img_region: Grayscale image region
+
+         Returns:
+             Tuple of (predicted_char, confidence, preprocessed_image)
+         """
+         img_tensor, preprocessed_img = self.preprocess(img_region)
+
+         with torch.no_grad():
+             output = self.model(img_tensor)
+             predicted_index = output.argmax(dim=1).item()
+             confidence = torch.softmax(output, dim=1).max().item()
+
+         predicted_char = self.label_encoder.inverse_transform([predicted_index])[0]
+
+         return predicted_char, confidence, preprocessed_img
+
+     def predict_batch(self, img_regions: list) -> list:
+         """
+         Perform OCR on multiple image regions.
+
+         Args:
+             img_regions: List of grayscale image regions
+
+         Returns:
+             List of (predicted_char, confidence, preprocessed_image) tuples
+         """
+         if not img_regions:
+             return []
+
+         # Preprocess all images
+         tensors = []
+         preprocessed_imgs = []
+         for img in img_regions:
+             tensor, preprocessed = self.preprocess(img)
+             tensors.append(tensor)
+             preprocessed_imgs.append(preprocessed)
+
+         # Stack tensors for batch inference
+         batch_tensor = torch.cat(tensors, dim=0)
+
+         with torch.no_grad():
+             outputs = self.model(batch_tensor)
+             predicted_indices = outputs.argmax(dim=1).cpu().numpy()
+             confidences = torch.softmax(outputs, dim=1).max(dim=1).values.cpu().numpy()
+
+         # Decode predictions
+         predicted_chars = self.label_encoder.inverse_transform(predicted_indices)
+
+         return list(zip(predicted_chars, confidences, preprocessed_imgs))
+
+     def get_top_k_predictions(self, img_region: np.ndarray, k: int = 5) -> list:
+         """
+         Get top-k predictions with confidence scores.
+
+         Args:
+             img_region: Grayscale image region
+             k: Number of top predictions to return
+
+         Returns:
+             List of (char, confidence) tuples
+         """
+         img_tensor, _ = self.preprocess(img_region)
+
+         with torch.no_grad():
+             output = self.model(img_tensor)
+             probs = torch.softmax(output, dim=1)[0]
+             top_k = torch.topk(probs, k)
+
+         results = []
+         for idx, conf in zip(top_k.indices.cpu().numpy(), top_k.values.cpu().numpy()):
+             char = self.label_encoder.inverse_transform([idx])[0]
+             results.append((char, float(conf)))
+
+         return results
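`predict` turns the model's raw logits into a confidence by taking the maximum of a softmax over the class scores. The same arithmetic in plain Python (no torch), just to make the confidence definition concrete; the sample logits are made up:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.5]  # e.g. per-class scores for one character crop
probs = softmax(logits)
pred_index = max(range(len(probs)), key=probs.__getitem__)  # argmax
confidence = probs[pred_index]
```

A large gap between the top logit and the rest yields a confidence near 1.0; near-equal logits yield a confidence near 1/num_classes, which is why the pipeline can meaningfully threshold on `min_confidence`.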
model/plate_detector.py ADDED
@@ -0,0 +1,321 @@
+ """
+ YOLO-based Number Plate Detection Module
+ """
+ import cv2
+ import numpy as np
+ from pathlib import Path
+ from typing import List, Dict, Optional, Union, Tuple
+ import torch
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import YOLO_CONFIG, YOLO_MODEL_PATH, get_device
+
+
+ class PlateDetector:
+     """
+     YOLO-based number plate detector.
+     Detects number plates in images and returns bounding boxes.
+     """
+
+     def __init__(self, model_path: str = None, device: torch.device = None):
+         """
+         Initialize the plate detector.
+
+         Args:
+             model_path: Path to YOLO model weights (default from config)
+             device: Torch device for inference
+         """
+         self.model_path = Path(model_path) if model_path else YOLO_MODEL_PATH
+         self.device = device or get_device()
+         self.model = None
+
+         self._load_model()
+
+     def _load_model(self):
+         """Load YOLO model."""
+         try:
+             from ultralytics import YOLO
+
+             if not self.model_path.exists():
+                 raise FileNotFoundError(f"YOLO model not found at: {self.model_path}")
+
+             self.model = YOLO(str(self.model_path))
+
+             # Set device
+             device_str = "cuda" if self.device.type == "cuda" else "cpu"
+             self.model.to(device_str)
+
+             print(f"✓ YOLO Plate Detector loaded on: {self.device}")
+
+         except ImportError:
+             raise ImportError("ultralytics package is required. Install with: pip install ultralytics")
+
+     def detect(self, image: Union[str, np.ndarray],
+                conf_threshold: float = None,
+                iou_threshold: float = None) -> List[Dict]:
+         """
+         Detect number plates in an image.
+
+         Args:
+             image: Image path or numpy array (BGR format)
+             conf_threshold: Confidence threshold (default from config)
+             iou_threshold: IOU threshold for NMS (default from config)
+
+         Returns:
+             List of detection dicts with keys: 'bbox', 'confidence', 'class', 'plate_image'
+         """
+         if conf_threshold is None:
+             conf_threshold = YOLO_CONFIG["confidence_threshold"]
+         if iou_threshold is None:
+             iou_threshold = YOLO_CONFIG["iou_threshold"]
+
+         # Load image if path provided
+         if isinstance(image, str):
+             img = cv2.imread(image)
+             if img is None:
+                 raise ValueError(f"Could not load image: {image}")
+         else:
+             img = image.copy()
+
+         # Run inference
+         results = self.model(
+             img,
+             conf=conf_threshold,
+             iou=iou_threshold,
+             verbose=False
+         )
+
+         # Parse results
+         detections = []
+         for result in results:
+             boxes = result.boxes
+
+             if boxes is None or len(boxes) == 0:
+                 continue
+
+             for i in range(len(boxes)):
+                 # Get bounding box coordinates (x1, y1, x2, y2)
+                 bbox = boxes.xyxy[i].cpu().numpy().astype(int)
+                 conf = float(boxes.conf[i].cpu().numpy())
+                 cls = int(boxes.cls[i].cpu().numpy()) if boxes.cls is not None else 0
+
+                 x1, y1, x2, y2 = bbox
+
+                 # Extract plate region
+                 plate_img = img[y1:y2, x1:x2].copy()
+
+                 detections.append({
+                     'bbox': {
+                         'x1': int(x1),
+                         'y1': int(y1),
+                         'x2': int(x2),
+                         'y2': int(y2),
+                         'width': int(x2 - x1),
+                         'height': int(y2 - y1)
+                     },
+                     'confidence': conf,
+                     'class': cls,
+                     'plate_image': plate_img
+                 })
+
+         return detections
+
+     def detect_and_crop(self, image: Union[str, np.ndarray],
+                         expand_ratio: float = 0.1) -> List[np.ndarray]:
+         """
+         Detect plates and return cropped plate images.
+
+         Args:
+             image: Image path or numpy array
+             expand_ratio: Ratio to expand bounding box (default 10%)
+
+         Returns:
+             List of cropped plate images
+         """
+         detections = self.detect(image)
+
+         plates = []
+         for det in detections:
+             bbox = det['bbox']
+
+             if expand_ratio > 0:
+                 # Calculate expansion
+                 w_expand = int(bbox['width'] * expand_ratio)
+                 h_expand = int(bbox['height'] * expand_ratio)
+
+                 # Load original image
+                 if isinstance(image, str):
+                     img = cv2.imread(image)
+                 else:
+                     img = image
+
+                 h, w = img.shape[:2]
+
+                 # Expanded coordinates
+                 x1 = max(0, bbox['x1'] - w_expand)
+                 y1 = max(0, bbox['y1'] - h_expand)
+                 x2 = min(w, bbox['x2'] + w_expand)
+                 y2 = min(h, bbox['y2'] + h_expand)
+
+                 plates.append(img[y1:y2, x1:x2].copy())
+             else:
+                 plates.append(det['plate_image'])
+
+         return plates
+
+     def draw_detections(self, image: Union[str, np.ndarray],
+                         detections: List[Dict] = None,
+                         color: Tuple[int, int, int] = (0, 255, 0),
+                         thickness: int = 2) -> np.ndarray:
+         """
+         Draw detection boxes on image.
+
+         Args:
+             image: Image path or numpy array
+             detections: List of detections (if None, will detect)
+             color: Box color in BGR
+             thickness: Line thickness
+
+         Returns:
+             Annotated image
+         """
+         # Load image
+         if isinstance(image, str):
+             img = cv2.imread(image)
+         else:
+             img = image.copy()
+
+         # Detect if not provided
+         if detections is None:
+             detections = self.detect(img)
+
+         for det in detections:
+             bbox = det['bbox']
+             conf = det['confidence']
+
+             # Draw rectangle
+             cv2.rectangle(
+                 img,
+                 (bbox['x1'], bbox['y1']),
+                 (bbox['x2'], bbox['y2']),
+                 color,
+                 thickness
+             )
+
+             # Draw label
+             label = f"Plate: {conf:.2%}"
+             cv2.putText(
+                 img, label,
+                 (bbox['x1'], bbox['y1'] - 10),
+                 cv2.FONT_HERSHEY_SIMPLEX,
+                 0.6, color, 2
+             )
+
+         return img
+
+
+ class PlateDetectorLite:
+     """
+     Lightweight plate detector using OpenCV (no YOLO).
+     Uses image processing techniques to find plate regions.
+     Useful when YOLO is not available.
+     """
+
+     def __init__(self):
+         """Initialize the lite detector."""
+         print("✓ PlateDetectorLite initialized (OpenCV-based)")
+
+     def detect(self, image: Union[str, np.ndarray],
+                min_area: int = 5000,
+                aspect_ratio_range: Tuple[float, float] = (1.5, 5.0)) -> List[Dict]:
+         """
+         Detect potential plate regions using edge detection.
+
+         Args:
+             image: Image path or numpy array
+             min_area: Minimum contour area
+             aspect_ratio_range: (min, max) aspect ratio for plates
+
+         Returns:
+             List of detection dicts
+         """
+         # Load image
+         if isinstance(image, str):
+             img = cv2.imread(image)
+         else:
+             img = image.copy()
+
+         # Convert to grayscale
+         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
+
+         # Apply bilateral filter to reduce noise while keeping edges sharp
+         blur = cv2.bilateralFilter(gray, 11, 17, 17)
+
+         # Edge detection
+         edges = cv2.Canny(blur, 30, 200)
+
+         # Find contours
+         contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
+
+         # Sort by area (largest first)
+         contours = sorted(contours, key=cv2.contourArea, reverse=True)[:20]
+
+         detections = []
+         for cnt in contours:
+             area = cv2.contourArea(cnt)
+             if area < min_area:
+                 continue
+
+             # Approximate the contour
+             peri = cv2.arcLength(cnt, True)
+             approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
+
+             # Looking for rectangles (4 corners)
+             if len(approx) >= 4:
+                 x, y, w, h = cv2.boundingRect(cnt)
+                 aspect_ratio = w / h if h > 0 else 0
+
+                 # Check if aspect ratio matches plate dimensions
+                 if aspect_ratio_range[0] <= aspect_ratio <= aspect_ratio_range[1]:
+                     plate_img = img[y:y+h, x:x+w].copy()
+
+                     detections.append({
+                         'bbox': {
+                             'x1': x, 'y1': y,
+                             'x2': x + w, 'y2': y + h,
+                             'width': w, 'height': h
+                         },
+                         'confidence': 0.5,  # Estimated confidence
+                         'class': 0,
+                         'plate_image': plate_img
+                     })
+
+         return detections
+
+     def detect_and_crop(self, image: Union[str, np.ndarray]) -> List[np.ndarray]:
+         """Get cropped plate images."""
+         detections = self.detect(image)
+         return [det['plate_image'] for det in detections]
+
+
+ def get_detector(use_yolo: bool = True, model_path: str = None) -> Union[PlateDetector, PlateDetectorLite]:
+     """
+     Factory function to get appropriate detector.
+
+     Args:
+         use_yolo: Whether to use YOLO detector
+         model_path: Path to YOLO model
+
+     Returns:
+         Detector instance
+     """
+     if use_yolo:
+         try:
+             return PlateDetector(model_path)
+         except (ImportError, FileNotFoundError) as e:
+             print(f"⚠ YOLO not available: {e}")
+             print(" Falling back to PlateDetectorLite")
+             return PlateDetectorLite()
+     else:
+         return PlateDetectorLite()
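`get_detector` follows a common try-import / graceful-degradation pattern: attempt the heavyweight detector, and fall back to the dependency-free one when the optional package or model file is missing. A stripped-down, self-contained sketch of just the pattern (the class and module names here are stand-ins, not this repo's):

```python
class HeavyDetector:
    """Stands in for a detector that needs an optional dependency."""
    def __init__(self):
        # Fails with ImportError when the optional package is absent,
        # which is exactly what the factory catches below.
        import some_missing_optional_package  # hypothetical module name

class LiteDetector:
    """Dependency-free fallback detector."""

def make_detector(prefer_heavy: bool = True):
    """Return the heavy detector when possible, else the lite fallback."""
    if prefer_heavy:
        try:
            return HeavyDetector()
        except ImportError as e:
            print(f"heavy detector unavailable ({e}); falling back")
    return LiteDetector()

detector = make_detector()
```

Catching `ImportError` (and, in the real module, `FileNotFoundError` for missing weights) at the factory keeps the failure local: callers always get a working detector object with the same interface.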
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ fastapi
+ uvicorn
+ opencv-python-headless
+ numpy
+ torchvision
+ torch
+ scikit-learn  # model/ocr.py imports sklearn.preprocessing.LabelEncoder
+ matplotlib  # pipeline visualization (plt)
+ ultralytics  # optional: YOLO detector; falls back to PlateDetectorLite without it
utils/__init__.py ADDED
@@ -0,0 +1,32 @@
+ """Helper utilities for OCR pipeline."""
+ from .helper import (
+     detect_contours,
+     filter_contours_by_size,
+     extract_roi,
+     convert_to_binary,
+     remove_overlapping_centers,
+     group_contours_by_line,
+     format_plate_number,
+     draw_detections,
+     calculate_confidence_stats,
+     save_contour_images,
+     preprocess_plate_image,
+     resize_with_aspect_ratio,
+     validate_plate_format
+ )
+
+ __all__ = [
+     'detect_contours',
+     'filter_contours_by_size',
+     'extract_roi',
+     'convert_to_binary',
+     'remove_overlapping_centers',
+     'group_contours_by_line',
+     'format_plate_number',
+     'draw_detections',
+     'calculate_confidence_stats',
+     'save_contour_images',
+     'preprocess_plate_image',
+     'resize_with_aspect_ratio',
+     'validate_plate_format'
+ ]
utils/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (830 Bytes). View file
 
utils/__pycache__/helper.cpython-311.pyc ADDED
Binary file (20.9 kB). View file
 
utils/helper.py ADDED
@@ -0,0 +1,505 @@
+ """
+ Helper utilities for the OCR Number Plate Pipeline
+ """
+ import cv2
+ import numpy as np
+ from typing import List, Dict, Tuple, Optional
+ import os
+ from pathlib import Path
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import CONTOUR_CONFIG, LINE_CONFIG, PREPROCESS_CONFIG
+
+
+ # ============== CONTOUR PROCESSING ==============
+
+ def detect_contours(image: np.ndarray, threshold: int = None) -> tuple:
+     """
+     Detect contours in a grayscale image.
+
+     Args:
+         image: Grayscale input image
+         threshold: Binary threshold value (default from config)
+
+     Returns:
+         Tuple of (contours, hierarchy, thresholded_image)
+     """
+     if threshold is None:
+         threshold = PREPROCESS_CONFIG["binary_threshold"]
+
+     _, thresh = cv2.threshold(image, threshold, 255, cv2.THRESH_BINARY)
+     contours, hierarchy = cv2.findContours(
+         thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
+     )
+
+     return contours, hierarchy, thresh
+
+
+ def filter_contours_by_size(contours: list, image_shape: tuple) -> List[Dict]:
+     """
+     Filter contours by minimum size requirements.
+
+     Args:
+         contours: List of OpenCV contours
+         image_shape: Shape of the original image (height, width)
+
+     Returns:
+         List of dicts with contour information
+     """
+     min_area = CONTOUR_CONFIG["min_area"]
+     min_width = CONTOUR_CONFIG["min_width"]
+     min_height = CONTOUR_CONFIG["min_height"]
+
+     image_h, image_w = image_shape[:2]
+     min_aspect = CONTOUR_CONFIG.get("min_aspect_ratio", 0.12)
+     max_aspect = CONTOUR_CONFIG.get("max_aspect_ratio", 2.8)
+     max_width_ratio = CONTOUR_CONFIG.get("max_width_ratio", 0.45)
+     max_height_ratio = CONTOUR_CONFIG.get("max_height_ratio", 0.95)
+
+     size_filtered = []
+     prefiltered = []
+     for idx, cnt in enumerate(contours):
+         x, y, w, h = cv2.boundingRect(cnt)
+
+         # Skip if too small
+         if w < min_width or h < min_height or w * h < min_area:
+             continue
+
+         base_item = {
+             'idx': idx,
+             'x': x,
+             'y': y,
+             'w': w,
+             'h': h,
+             'area': w * h,
+             'contour': cnt
+         }
+         size_filtered.append(base_item)
+
+         # Skip obvious non-character blobs
+         if image_w > 0 and (w / image_w) > max_width_ratio:
+             continue
+         if image_h > 0 and (h / image_h) > max_height_ratio:
+             continue
+
+         aspect_ratio = w / max(h, 1)
+         if aspect_ratio < min_aspect or aspect_ratio > max_aspect:
+             continue
+
+         candidate = dict(base_item)
+         candidate['aspect_ratio'] = aspect_ratio
+         prefiltered.append(candidate)
+
+     if len(prefiltered) <= 2:
+         return prefiltered if prefiltered else size_filtered
+
+     # If shape rules are too strict for this plate, fall back to the basic size filter
+     if len(size_filtered) > 0 and len(prefiltered) < max(3, int(0.50 * len(size_filtered))):
+         prefiltered = size_filtered
+
+     # Adaptive pass: keep contours with character-like size relative to median stats
+     heights = np.array([c['h'] for c in prefiltered], dtype=np.float32)
+     widths = np.array([c['w'] for c in prefiltered], dtype=np.float32)
+     areas = np.array([c['area'] for c in prefiltered], dtype=np.float32)
+
+     median_h = float(np.median(heights))
+     median_w = float(np.median(widths))
+     median_area = float(np.median(areas))
+
+     min_h_rel = CONTOUR_CONFIG.get("min_height_rel_median", 0.45)
+     max_h_rel = CONTOUR_CONFIG.get("max_height_rel_median", 2.2)
+     min_w_rel = CONTOUR_CONFIG.get("min_width_rel_median", 0.35)
+     max_w_rel = CONTOUR_CONFIG.get("max_width_rel_median", 2.4)
+     min_area_rel = CONTOUR_CONFIG.get("min_area_rel_median", 0.20)
+     max_area_rel = CONTOUR_CONFIG.get("max_area_rel_median", 3.8)
+
+     filtered = []
+     for c in prefiltered:
+         h_ok = (median_h * min_h_rel) <= c['h'] <= (median_h * max_h_rel)
+         w_ok = (median_w * min_w_rel) <= c['w'] <= (median_w * max_w_rel)
+         area_ok = (median_area * min_area_rel) <= c['area'] <= (median_area * max_area_rel)
+
+         # Keep contour if it satisfies at least two of three adaptive constraints
+         if (h_ok + w_ok + area_ok) >= 2:
+             filtered.append(c)
+
+     # Fall back to prefiltered if the adaptive stage is too aggressive
+     if len(filtered) < max(2, int(0.35 * len(prefiltered))):
+         return prefiltered
+
+     return filtered
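The adaptive pass above relies on Python booleans behaving as integers, so `h_ok + w_ok + area_ok >= 2` is a two-of-three majority vote: a contour slightly outside one median band can still survive if the other two checks hold. A tiny standalone illustration of the idiom:

```python
def passes_majority(h_ok: bool, w_ok: bool, area_ok: bool) -> bool:
    """True when at least two of the three checks hold (bools sum as 0/1)."""
    return (h_ok + w_ok + area_ok) >= 2

# A contour slightly too wide can still survive if height and area look right.
print(passes_majority(True, False, True))  # prints True
```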
132
+
+
+ def extract_roi(image: np.ndarray, contour_data: Dict, padding: int = None) -> np.ndarray:
+     """
+     Extract ROI (Region of Interest) from image with padding.
+
+     Args:
+         image: Original image (color or grayscale)
+         contour_data: Dict with 'x', 'y', 'w', 'h' keys
+         padding: Padding around the ROI
+
+     Returns:
+         Cropped ROI image
+     """
+     if padding is None:
+         padding = CONTOUR_CONFIG["padding"]
+
+     x, y, w, h = contour_data['x'], contour_data['y'], contour_data['w'], contour_data['h']
+
+     # Calculate padded boundaries
+     x1 = max(x - padding, 0)
+     y1 = max(y - padding, 0)
+     x2 = min(x + w + padding, image.shape[1])
+     y2 = min(y + h + padding, image.shape[0])
+
+     return image[y1:y2, x1:x2]
+
+
+ def convert_to_binary(image: np.ndarray) -> np.ndarray:
+     """
+     Convert image to binary (black and white).
+
+     Args:
+         image: Input image (color or grayscale)
+
+     Returns:
+         Binary image
+     """
+     if len(image.shape) == 3:
+         gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     else:
+         gray = image
+
+     if PREPROCESS_CONFIG["otsu_threshold"]:
+         _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
+     else:
+         _, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)
+
+     return binary
+
+
+ def remove_overlapping_centers(contours: List[Dict], center_threshold: int = None, verbose: bool = False) -> List[Dict]:
+     """
+     Remove contours that have nearly the same center (like inner/outer of '0').
+     Keeps the LARGER contour when centers overlap.
+
+     Args:
+         contours: List of contour dicts with 'x', 'y', 'w', 'h' keys
+         center_threshold: Max distance between centers to consider as duplicate
+         verbose: Print debug information
+
+     Returns:
+         Filtered list of contours
+     """
+     if not contours:
+         return []
+
+     if center_threshold is None:
+         center_threshold = CONTOUR_CONFIG["center_threshold"]
+
+     # Calculate center for each contour
+     for c in contours:
+         c['cx'] = c['x'] + c['w'] // 2
+         c['cy'] = c['y'] + c['h'] // 2
+         if 'area' not in c:
+             c['area'] = c['w'] * c['h']
+
+     # Sort by area (largest first)
+     sorted_contours = sorted(contours, key=lambda c: c['area'], reverse=True)
+
+     filtered = []
+     for curr in sorted_contours:
+         is_duplicate = False
+
+         for existing in filtered:
+             dx = abs(curr['cx'] - existing['cx'])
+             dy = abs(curr['cy'] - existing['cy'])
+             center_distance = (dx**2 + dy**2) ** 0.5
+
+             if center_distance < center_threshold:
+                 is_duplicate = True
+                 if verbose:
+                     print(f" → Removing duplicate: center ({curr['cx']},{curr['cy']}) "
+                           f"too close to ({existing['cx']},{existing['cy']}) dist={center_distance:.1f}")
+                 break
+
+         if not is_duplicate:
+             filtered.append(curr)
+
+     return filtered
232
+
233
+
234
+ # ============== LINE GROUPING ==============
235
+
236
+ def group_contours_by_line(contours: List[Dict], y_threshold: float = None) -> List[List[Dict]]:
237
+ """
238
+ Groups contours into lines based on their vertical center position.
239
+ Contours with similar y-center (within y_threshold) are on the same line.
240
+
241
+ Args:
242
+ contours: List of contour dicts
243
+ y_threshold: Maximum vertical distance to consider same line
244
+
245
+ Returns:
246
+ List of lines, where each line is a list of contours sorted left-to-right
247
+ """
248
+ if not contours:
249
+ return []
250
+
251
+ # Calculate y-center and average height
252
+ avg_height = 0
253
+ for c in contours:
254
+ c['y_center'] = c['y'] + c['h'] // 2
255
+ avg_height += c['h']
256
+ avg_height /= len(contours)
257
+
258
+ # Determine y_threshold if not provided
259
+ if y_threshold is None:
260
+ y_threshold = avg_height * LINE_CONFIG["y_threshold_ratio"]
261
+ if y_threshold < LINE_CONFIG["default_y_threshold"]:
262
+ y_threshold = LINE_CONFIG["default_y_threshold"]
263
+
264
+ # Sort by y-center first
265
+ sorted_by_y = sorted(contours, key=lambda c: c['y_center'])
266
+
267
+ # Group into lines
268
+ lines = []
269
+ current_line = [sorted_by_y[0]]
270
+
271
+ for i in range(1, len(sorted_by_y)):
272
+ curr = sorted_by_y[i]
273
+ prev = current_line[-1]
274
+
275
+ # If y-center difference is small, same line
276
+ if abs(curr['y_center'] - prev['y_center']) <= y_threshold:
277
+ current_line.append(curr)
278
+ else:
279
+ lines.append(current_line)
280
+ current_line = [curr]
281
+
282
+ # Add the last line
283
+ lines.append(current_line)
284
+
285
+ # Sort each line by x position (left to right)
286
+ for line in lines:
287
+ line.sort(key=lambda c: c['x'])
288
+
289
+ return lines
290
+
291
+
292
+ # ============== IMAGE PREPROCESSING ==============
293
+
294
+ def preprocess_plate_image(image: np.ndarray) -> np.ndarray:
295
+ """
296
+ Preprocess plate image for better contour detection.
297
+
298
+ Args:
299
+ image: Input plate image (color)
300
+
301
+ Returns:
302
+ Preprocessed grayscale image
303
+ """
304
+ # Convert to grayscale if needed
305
+ if len(image.shape) == 3:
306
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
307
+ else:
308
+ gray = image
309
+
310
+ # Apply CLAHE
311
+ clahe = cv2.createCLAHE(
312
+ clipLimit=PREPROCESS_CONFIG["clahe_clip_limit"],
313
+ tileGridSize=PREPROCESS_CONFIG["clahe_grid_size"]
314
+ )
315
+ enhanced = clahe.apply(gray)
316
+
317
+ # Denoise
318
+ denoised = cv2.GaussianBlur(enhanced, PREPROCESS_CONFIG["gaussian_blur_kernel"], 0)
319
+
320
+ return denoised
321
+
322
+
323
+ def resize_with_aspect_ratio(image: np.ndarray, width: int = None, height: int = None) -> np.ndarray:
324
+ """
325
+ Resize image while maintaining aspect ratio.
326
+
327
+ Args:
328
+ image: Input image
329
+ width: Target width (optional)
330
+ height: Target height (optional)
331
+
332
+ Returns:
333
+ Resized image
334
+ """
335
+ h, w = image.shape[:2]
336
+
337
+ if width is None and height is None:
338
+ return image
339
+
340
+ if width is None:
341
+ ratio = height / h
342
+ new_size = (int(w * ratio), height)
343
+ else:
344
+ ratio = width / w
345
+ new_size = (width, int(h * ratio))
346
+
347
+ return cv2.resize(image, new_size, interpolation=cv2.INTER_AREA)
348
+
349
+
350
+ # ============== FORMATTING & OUTPUT ==============
351
+
352
+ def format_plate_number(lines: List[List[Dict]], results: List[List[Dict]]) -> Dict:
353
+ """
354
+ Format recognized characters into plate number.
355
+
356
+ Args:
357
+ lines: Lines of contours
358
+ results: OCR results for each line
359
+
360
+ Returns:
361
+ Dict with formatted plate information
362
+ """
363
+ plate_lines = []
364
+ all_results = []
365
+
366
+ for line_idx, line_results in enumerate(results):
367
+ line_text = "".join([r['char'] for r in line_results])
368
+ plate_lines.append(line_text)
369
+
370
+ line_detail = {
371
+ 'line_num': line_idx + 1,
372
+ 'text': line_text,
373
+ 'characters': line_results
374
+ }
375
+ all_results.append(line_detail)
376
+
377
+ return {
378
+ 'lines': plate_lines,
379
+ 'multiline': "\n".join(plate_lines),
380
+ 'singleline': " ".join(plate_lines),
381
+ 'details': all_results,
382
+ 'num_lines': len(plate_lines),
383
+ 'total_chars': sum(len(line) for line in results)
384
+ }
385
+
386
+
387
+ def save_contour_images(contours: List[Dict], image: np.ndarray, output_dir: str) -> List[str]:
388
+ """
389
+ Save extracted contour images to disk.
390
+
391
+ Args:
392
+ contours: List of contour dicts
393
+ image: Original image
394
+ output_dir: Output directory path
395
+
396
+ Returns:
397
+ List of saved file paths
398
+ """
399
+ os.makedirs(output_dir, exist_ok=True)
400
+ saved_paths = []
401
+
402
+ for i, c in enumerate(contours):
403
+ roi = extract_roi(image, c)
404
+ roi_bw = convert_to_binary(roi)
405
+
406
+ filepath = os.path.join(output_dir, f"char_{i:03d}.jpg")
407
+ cv2.imwrite(filepath, roi_bw)
408
+ saved_paths.append(filepath)
409
+
410
+ return saved_paths
411
+
412
+
413
+ def draw_detections(image: np.ndarray, contours: List[Dict],
414
+ results: List[Dict] = None, line_colors: bool = True) -> np.ndarray:
415
+ """
416
+ Draw bounding boxes and labels on image.
417
+
418
+ Args:
419
+ image: Input image
420
+ contours: List of contour dicts
421
+ results: OCR results (optional)
422
+ line_colors: Use different colors for different lines
423
+
424
+ Returns:
425
+ Annotated image
426
+ """
427
+ output = image.copy()
428
+
429
+ # Color palette for different lines
430
+ colors = [
431
+ (0, 255, 0), # Green
432
+ (255, 0, 0), # Blue
433
+ (0, 0, 255), # Red
434
+ (255, 255, 0), # Cyan
435
+ (255, 0, 255), # Magenta
436
+ ]
437
+
438
+ for i, c in enumerate(contours):
439
+ x, y, w, h = c['x'], c['y'], c['w'], c['h']
440
+
441
+ # Determine color
442
+ line_idx = c.get('line_idx', 0)
443
+ color = colors[line_idx % len(colors)] if line_colors else (0, 255, 0)
444
+
445
+ # Draw rectangle
446
+ cv2.rectangle(output, (x, y), (x + w, y + h), color, 2)
447
+
448
+ # Draw label if results provided
449
+ if results and i < len(results):
450
+ label = f"{results[i]['char']} ({results[i]['conf']:.0%})"
451
+ cv2.putText(output, label, (x, y - 5),
452
+ cv2.FONT_HERSHEY_SIMPLEX, 0.4, color, 1)
453
+
454
+ return output
455
+
456
+
457
+ # ============== VALIDATION ==============
458
+
459
+ def validate_plate_format(plate_text: str, format_type: str = "nepali") -> bool:
460
+ """
461
+ Validate if plate number matches expected format.
462
+
463
+ Args:
464
+ plate_text: Recognized plate text
465
+ format_type: Type of format to validate ("nepali", "embossed")
466
+
467
+ Returns:
468
+ True if valid, False otherwise
469
+ """
470
+ # Basic validation - can be extended based on actual formats
471
+ if not plate_text or len(plate_text) < 4:
472
+ return False
473
+
474
+ # Nepali plates typically have:
475
+ # - Province identifier (2 characters)
476
+ # - Class identifier (1-2 Nepali characters)
477
+ # - Numbers (4 digits)
478
+
479
+ return True
480
+
481
+
482
+ def calculate_confidence_stats(results: List[List[Dict]]) -> Dict:
483
+ """
484
+ Calculate confidence statistics for OCR results.
485
+
486
+ Args:
487
+ results: OCR results by line
488
+
489
+ Returns:
490
+ Dict with confidence statistics
491
+ """
492
+ all_confidences = []
493
+ for line in results:
494
+ for r in line:
495
+ all_confidences.append(r['conf'])
496
+
497
+ if not all_confidences:
498
+ return {'mean': 0, 'min': 0, 'max': 0, 'std': 0}
499
+
500
+ return {
501
+ 'mean': np.mean(all_confidences),
502
+ 'min': np.min(all_confidences),
503
+ 'max': np.max(all_confidences),
504
+ 'std': np.std(all_confidences)
505
+ }