Pujan-Dev committed
Commit f9003ec · 1 Parent(s): 658d299
Dockerfile ADDED
@@ -0,0 +1,20 @@
+ # Use the official Python image as a base image
+ FROM python:3.11-slim
+
+ # Set the working directory in the container
+ WORKDIR /app
+
+ # Copy the requirements file into the container
+ COPY requirements.txt ./
+
+ # Install the dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the application code into the container
+ COPY . .
+
+ # Expose the port the app runs on
+ EXPOSE 7860
+
+ # Command to run the application
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
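
Assuming this Dockerfile sits at the repo root next to `requirements.txt` and `app.py`, a typical build-and-run sequence would be (the `plate-ocr` image tag is illustrative):

```shell
# Build the image from the repo root
docker build -t plate-ocr .

# Run it, mapping the exposed port to the host
docker run --rm -p 7860:7860 plate-ocr
```

The FastAPI app is then reachable at `http://localhost:7860`.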
README.md ADDED
@@ -0,0 +1,174 @@
+ # Number Plate OCR Pipeline
+
+ A complete pipeline for detecting and recognizing vehicle number plates, supporting both **Nepali** and **English** characters with multi-line plate support.
+
+ ## Features
+
+ - 🎯 **YOLO-based plate detection** - Automatically locates number plates in images
+ - 📝 **Multi-line plate support** - Handles single and multi-line plate formats
+ - 🇳🇵 **Nepali character recognition** - Supports Devanagari script (क, ख, ग, etc.)
+ - 🔤 **English character recognition** - A-Z, 0-9
+ - 🔢 **Embossed plate support** - Works with both painted and embossed plates
+ - 📊 **Confidence scoring** - Each character comes with a confidence score
+
+ ## Project Structure
+
+ ```
+ pipeline/
+ ├── __init__.py                 # Package initialization
+ ├── main.py                     # Main pipeline entry point
+ ├── config/
+ │   ├── __init__.py
+ │   └── config.py               # Configuration settings
+ ├── model/
+ │   ├── __init__.py
+ │   ├── ocr.py                  # OCR model definition
+ │   └── plate_detector.py       # YOLO plate detector
+ ├── utils/
+ │   ├── __init__.py
+ │   └── helper.py               # Helper functions
+ └── final_models/
+     └── ocr_model_em_np_eng.pth # Pre-trained OCR model
+ ```
+
+ ## Requirements
+
+ ```bash
+ pip install torch torchvision opencv-python numpy scikit-learn matplotlib ultralytics
+ ```
+
+ ## Usage
+
+ ### Command Line
+
+ ```bash
+ # Basic usage
+ python pipeline/main.py image.jpg
+
+ # Skip YOLO detection (process entire image)
+ python pipeline/main.py plate_image.jpg --no-yolo
+
+ # Save extracted characters and results
+ python pipeline/main.py image.jpg --save --output results.json
+
+ # Quiet mode (no progress messages)
+ python pipeline/main.py image.jpg -q
+ ```
+
+ ### Python API
+
+ ```python
+ from pipeline import NumberPlateOCR
+
+ # Initialize with YOLO detection
+ ocr = NumberPlateOCR(use_yolo=True)
+
+ # Process an image
+ result = ocr.process_image('car_image.jpg', show_visualization=True)
+
+ # Get the plate number
+ for plate in result['plates']:
+     print(f"Plate: {plate['singleline_text']}")
+     print(f"Confidence: {plate['confidence_stats']['mean']:.1%}")
+ ```
+
+ ### Process a pre-cropped plate image
+
+ ```python
+ import cv2
+ from pipeline import NumberPlateOCR
+
+ ocr = NumberPlateOCR(use_yolo=False)  # No need for YOLO
+
+ plate_img = cv2.imread('cropped_plate.jpg')
+ result = ocr.process_from_plate_image(plate_img)
+
+ print(result['singleline_text'])
+ ```
+
+ ## Configuration
+
+ Edit `config/config.py` to customize:
+
+ ```python
+ # OCR settings
+ OCR_CONFIG = {
+     "input_size": (128, 128),
+     "num_classes": 71,
+ }
+
+ # YOLO settings
+ YOLO_CONFIG = {
+     "confidence_threshold": 0.5,
+     "iou_threshold": 0.45,
+ }
+
+ # Character detection
+ CONTOUR_CONFIG = {
+     "min_area": 100,
+     "center_threshold": 15,  # For duplicate removal
+ }
+
+ # Inference
+ INFERENCE_CONFIG = {
+     "min_confidence": 0.10,  # Minimum OCR confidence
+ }
+ ```
+
+ ## Supported Characters
+
+ | Type | Characters |
+ |------|------------|
+ | Digits (English) | 0-9 |
+ | Digits (Nepali) | ०-९ |
+ | Letters (English) | A-Z |
+ | Nepali Text | क, को, ख, ग, च, ज, झ, ञ, डि, त, ना, प, प्र, ब, बा, भे, म, मे, य, लु, सी, सु, से, ह |
+ | Special | Nepali Flag |
+
+ ## Output Format
+
+ ```json
+ {
+   "image_path": "car.jpg",
+   "num_plates": 1,
+   "plates": [
+     {
+       "plate_index": 0,
+       "lines": ["बा २", "प ४५६७"],
+       "singleline_text": "बा २ प ४५६७",
+       "multiline_text": "बा २\nप ४५६७",
+       "num_lines": 2,
+       "total_chars": 8,
+       "confidence_stats": {
+         "mean": 0.85,
+         "min": 0.65,
+         "max": 0.98
+       }
+     }
+   ]
+ }
+ ```
+
+ ## Model Files
+
+ - `final_models/ocr_model_em_np_eng.pth` - OCR model (ResNet18-based)
+ - `../best/best.pt` - YOLO model for plate detection
+
+ ## Tips
+
+ 1. **Better accuracy**: Ensure good lighting and clear plate images
+ 2. **Multi-line plates**: The pipeline automatically detects line breaks
+ 3. **Low confidence**: Characters below 10% confidence are filtered out
+ 4. **Border removal**: Contours touching the plate edges (usually borders) are filtered out
+
+ ## Troubleshooting
+
+ **"YOLO model not found"**
+ - Ensure `best/best.pt` exists in the project root
+
+ **"OCR model not found"**
+ - Run: `cp best/ocr_model_em_np_eng.pth pipeline/final_models/`
+
+ **Low accuracy**
+ - Adjust `PREPROCESS_CONFIG["binary_threshold"]` in config
+ - Try different `CONTOUR_CONFIG["min_area"]` values
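
For scripting against a result saved with `--output results.json`, the Output Format documented above can be consumed with the standard `json` module. A minimal sketch, using the README's sample values (field names as documented, values illustrative):

```python
import json

# Sample result following the documented schema (values are illustrative)
raw = json.dumps({
    "image_path": "car.jpg",
    "num_plates": 1,
    "plates": [{
        "plate_index": 0,
        "lines": ["बा २", "प ४५६७"],
        "singleline_text": "बा २ प ४५६७",
        "num_lines": 2,
        "total_chars": 8,
        "confidence_stats": {"mean": 0.85, "min": 0.65, "max": 0.98},
    }],
})

result = json.loads(raw)
for plate in result["plates"]:
    # One summary row per detected plate
    print(f'{plate["singleline_text"]}  (mean conf {plate["confidence_stats"]["mean"]:.0%})')
```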
__init__.py ADDED
@@ -0,0 +1,47 @@
+ """
+ OCR Number Plate Pipeline
+ =========================
+
+ A complete pipeline for detecting and recognizing number plates,
+ supporting both Nepali and English characters.
+
+ Components:
+     - PlateDetector: YOLO-based number plate detection
+     - CharacterRecognizer: ResNet18-based OCR model
+     - NumberPlateOCR: Complete pipeline combining detection and recognition
+
+ Usage:
+     from pipeline import NumberPlateOCR
+
+     # Initialize pipeline
+     ocr = NumberPlateOCR(use_yolo=True)
+
+     # Process image
+     result = ocr.process_image('plate.jpg')
+     print(result['plates'][0]['singleline_text'])
+ """
+
+ from .main import NumberPlateOCR
+ from .model.ocr import CharacterRecognizer, OCRModel
+ from .model.plate_detector import PlateDetector, PlateDetectorLite, get_detector
+ from .config.config import (
+     OCR_CONFIG, YOLO_CONFIG, CONTOUR_CONFIG,
+     get_device, setup_directories
+ )
+
+ __version__ = "1.0.0"
+ __author__ = "OCR Team"
+
+ __all__ = [
+     'NumberPlateOCR',
+     'CharacterRecognizer',
+     'OCRModel',
+     'PlateDetector',
+     'PlateDetectorLite',
+     'get_detector',
+     'OCR_CONFIG',
+     'YOLO_CONFIG',
+     'CONTOUR_CONFIG',
+     'get_device',
+     'setup_directories'
+ ]
__pycache__/app.cpython-311.pyc ADDED
Binary file (4.19 kB).
 
__pycache__/main.cpython-311.pyc ADDED
Binary file (28 kB).
 
app.py ADDED
@@ -0,0 +1,99 @@
+ import fastapi
+ import uvicorn
+ from fastapi import FastAPI, UploadFile, File, HTTPException
+ from typing import Dict
+ import cv2
+ import numpy as np
+ from main import NumberPlateOCR
+
+ app = FastAPI()
+
+ # Initialize the OCR pipeline
+ pipeline = NumberPlateOCR(use_yolo=True, verbose=False)
+
+ @app.get("/")
+ def read_root():
+     return {"Hello": "World"}
+
+ @app.post("/ocr-only/")
+ async def ocr_only(file: UploadFile = File(...)) -> Dict:
+     """
+     Perform OCR on an uploaded image without YOLO detection.
+
+     Args:
+         file: Uploaded image file.
+
+     Returns:
+         JSON response with OCR results.
+     """
+     try:
+         # Read image from upload
+         contents = await file.read()
+         np_img = np.frombuffer(contents, np.uint8)
+         image = cv2.imdecode(np_img, cv2.IMREAD_COLOR)
+
+         if image is None:
+             raise HTTPException(status_code=400, detail="Invalid image file")
+
+         # Process the image directly (skip YOLO)
+         results = pipeline.process_from_plate_image(image, show_visualization=False)
+
+         return {
+             "lines": results["lines"],
+             "multiline_text": results["multiline_text"],
+             "singleline_text": results["singleline_text"],
+             "num_lines": results["num_lines"],
+             "total_chars": results["total_chars"],
+             "confidence_stats": results["confidence_stats"]
+         }
+
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+
+
+ @app.post("/yolo-ocr/")
+ async def yolo_ocr(file: UploadFile = File(...)) -> Dict:
+     """
+     Perform YOLO-based plate detection and OCR on an uploaded image.
+
+     Args:
+         file: Uploaded image file.
+
+     Returns:
+         JSON response with OCR results.
+     """
+     try:
+         # Read image from upload
+         contents = await file.read()
+         np_img = np.frombuffer(contents, np.uint8)
+         image_path = "uploaded_image.jpg"
+
+         # Save the uploaded image temporarily
+         with open(image_path, "wb") as f:
+             f.write(contents)
+
+         # Process the image using the pipeline
+         results = pipeline.process_image(image_path, save_contours=False, show_visualization=False)
+
+         # Format the response
+         response = {
+             "image_path": results["image_path"],
+             "num_plates": results["num_plates"],
+             "plates": []
+         }
+
+         for plate in results["plates"]:
+             response["plates"].append({
+                 "lines": plate["lines"],
+                 "multiline_text": plate["multiline_text"],
+                 "singleline_text": plate["singleline_text"],
+                 "num_lines": plate["num_lines"],
+                 "total_chars": plate["total_chars"],
+                 "confidence_stats": plate["confidence_stats"]
+             })
+
+         return response
+
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
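
With the server running (e.g. via the Dockerfile's `uvicorn` command on port 7860), both endpoints above accept a multipart file upload named `file`; the image file names below are illustrative:

```shell
# OCR on a pre-cropped plate image (no YOLO detection)
curl -F "file=@cropped_plate.jpg" http://localhost:7860/ocr-only/

# Full YOLO detection + OCR on a vehicle photo
curl -F "file=@car.jpg" http://localhost:7860/yolo-ocr/
```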
config/__init__.py ADDED
@@ -0,0 +1,18 @@
+ """Configuration module for OCR pipeline."""
+ from .config import (
+     BASE_DIR, PROJECT_ROOT,
+     YOLO_MODEL_PATH, OCR_MODEL_PATH, LABEL_MAP_PATH,
+     OUTPUT_DIR, CONTOURS_DIR, CONTOURS_BW_DIR, RESULTS_DIR,
+     OCR_CONFIG, YOLO_CONFIG, PREPROCESS_CONFIG,
+     CONTOUR_CONFIG, LINE_CONFIG, INFERENCE_CONFIG, VIZ_CONFIG,
+     get_device, setup_directories
+ )
+
+ __all__ = [
+     'BASE_DIR', 'PROJECT_ROOT',
+     'YOLO_MODEL_PATH', 'OCR_MODEL_PATH', 'LABEL_MAP_PATH',
+     'OUTPUT_DIR', 'CONTOURS_DIR', 'CONTOURS_BW_DIR', 'RESULTS_DIR',
+     'OCR_CONFIG', 'YOLO_CONFIG', 'PREPROCESS_CONFIG',
+     'CONTOUR_CONFIG', 'LINE_CONFIG', 'INFERENCE_CONFIG', 'VIZ_CONFIG',
+     'get_device', 'setup_directories'
+ ]
config/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (907 Bytes).
 
config/__pycache__/config.cpython-311.pyc ADDED
Binary file (3.3 kB).
 
config/config.py ADDED
@@ -0,0 +1,102 @@
+ """
+ Configuration file for the OCR Number Plate Pipeline
+ """
+ import os
+ from pathlib import Path
+
+ # ============== PATHS ==============
+ BASE_DIR = Path(__file__).parent.parent
+ PROJECT_ROOT = BASE_DIR.parent
+
+ # Model paths
+ YOLO_MODEL_PATH = PROJECT_ROOT / "final" / "final_models" / "best12345.pt"
+ OCR_MODEL_PATH = PROJECT_ROOT / "final" / "final_models" / "ocr_model.pth"
+ LABEL_MAP_PATH = PROJECT_ROOT / "label_map.json"
+
+ # Output paths
+ OUTPUT_DIR = PROJECT_ROOT / "output"
+ CONTOURS_DIR = OUTPUT_DIR / "contours"
+ CONTOURS_BW_DIR = OUTPUT_DIR / "contours_bw"
+ RESULTS_DIR = OUTPUT_DIR / "results"
+
+ # ============== OCR MODEL CONFIG ==============
+ OCR_CONFIG = {
+     "input_size": (128, 128),
+     "num_classes": 71,  # 0-9, A-Z, Nepali chars, Nepali Flag
+     "backbone": "resnet18",
+     "pretrained": False,
+ }
+
+ # ============== YOLO CONFIG ==============
+ YOLO_CONFIG = {
+     "confidence_threshold": 0.5,
+     "iou_threshold": 0.45,
+     "img_size": 640,
+     "device": "auto",  # "auto", "cuda", "cpu"
+ }
+
+ # ============== PREPROCESSING CONFIG ==============
+ PREPROCESS_CONFIG = {
+     "clahe_clip_limit": 2.0,
+     "clahe_grid_size": (8, 8),
+     "gaussian_blur_kernel": (3, 3),
+     "binary_threshold": 180,
+     "otsu_threshold": True,
+ }
+
+ # ============== CONTOUR DETECTION CONFIG ==============
+ CONTOUR_CONFIG = {
+     "min_area": 60,
+     "min_width": 3,
+     "min_height": 2,
+     "min_aspect_ratio": 0.08,
+     "max_aspect_ratio": 4.0,
+     "max_width_ratio": 0.55,
+     "max_height_ratio": 0.95,
+     "min_height_rel_median": 0.45,
+     "max_height_rel_median": 2.2,
+     "min_width_rel_median": 0.35,
+     "max_width_rel_median": 2.4,
+     "min_area_rel_median": 0.20,
+     "max_area_rel_median": 3.8,
+     "padding": 1,
+     "center_threshold": 15,  # For removing overlapping contours
+     "skip_borders": 0,  # Legacy key (no blind skipping)
+     "remove_edge_artifacts": True,
+     "edge_margin": 2,
+ }
+
+ # ============== LINE GROUPING CONFIG ==============
+ LINE_CONFIG = {
+     "y_threshold_ratio": 0.5,  # Ratio of average height for line grouping
+     "default_y_threshold": 20,
+ }
+
+ # ============== OCR INFERENCE CONFIG ==============
+ INFERENCE_CONFIG = {
+     "min_confidence": 0.10,  # 10% minimum confidence
+     "batch_size": 16,
+ }
+
+ # ============== VISUALIZATION CONFIG ==============
+ VIZ_CONFIG = {
+     "show_plots": True,
+     "save_results": True,
+     "figure_size": (12, 6),
+     "font_size": 9,
+     "max_cols": 10,
+ }
+
+ # ============== DEVICE CONFIGURATION ==============
+ def get_device():
+     """Auto-detect the best available device."""
+     import torch
+     if YOLO_CONFIG["device"] == "auto":
+         return torch.device("cuda" if torch.cuda.is_available() else "cpu")
+     return torch.device(YOLO_CONFIG["device"])
+
+ # ============== CREATE DIRECTORIES ==============
+ def setup_directories():
+     """Create necessary output directories."""
+     for dir_path in [OUTPUT_DIR, CONTOURS_DIR, CONTOURS_BW_DIR, RESULTS_DIR]:
+         dir_path.mkdir(parents=True, exist_ok=True)
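
`LINE_CONFIG["y_threshold_ratio"]` above controls how character boxes are split into lines: boxes whose vertical centers fall within that fraction of the average character height are treated as one line. The project's actual helper lives in `utils/helper.py`; the idea can be sketched as follows (function name and box format here are illustrative, not the real API):

```python
def group_by_line(boxes, y_threshold_ratio=0.5):
    """Group character boxes (dicts with x, y, h) into text lines.

    A box joins an existing line when its vertical center is within
    y_threshold_ratio * average box height of that line's first box.
    """
    if not boxes:
        return []
    avg_h = sum(b["h"] for b in boxes) / len(boxes)
    threshold = y_threshold_ratio * avg_h
    lines = []
    for b in sorted(boxes, key=lambda b: b["y"]):
        cy = b["y"] + b["h"] / 2
        for line in lines:
            ref = line[0]
            if abs(cy - (ref["y"] + ref["h"] / 2)) <= threshold:
                line.append(b)
                break
        else:
            lines.append([b])
    # Reading order: lines top-to-bottom, characters left-to-right
    return [sorted(line, key=lambda b: b["x"]) for line in lines]

# Two boxes near the top row plus one lower box -> two lines
demo = group_by_line([
    {"x": 0, "y": 0, "h": 10},
    {"x": 20, "y": 2, "h": 10},
    {"x": 5, "y": 30, "h": 10},
])
print([[b["x"] for b in line] for line in demo])  # → [[0, 20], [5]]
```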
demo.py ADDED
@@ -0,0 +1,164 @@
+ #!/usr/bin/env python3
+ """
+ Demo script for the Number Plate OCR Pipeline.
+
+ This script demonstrates:
+ 1. Full pipeline with YOLO detection
+ 2. Processing pre-cropped plate images
+ 3. Batch processing multiple images
+ """
+
+ import cv2
+ import os
+ import sys
+ from pathlib import Path
+
+ # Add pipeline to path
+ sys.path.insert(0, str(Path(__file__).parent))
+
+ from main import NumberPlateOCR
+
+
+ def demo_full_pipeline(image_path: str):
+     """Demo: Full pipeline with YOLO plate detection."""
+     print("\n" + "="*60)
+     print("DEMO 1: Full Pipeline with YOLO Detection")
+     print("="*60)
+
+     # Initialize with YOLO
+     pipeline = NumberPlateOCR(use_yolo=True, verbose=True)
+
+     # Process image
+     result = pipeline.process_image(
+         image_path,
+         save_contours=True,
+         show_visualization=True
+     )
+
+     return result
+
+
+ def demo_cropped_plate(plate_image_path: str):
+     """Demo: Process a pre-cropped plate image."""
+     print("\n" + "="*60)
+     print("DEMO 2: Pre-cropped Plate Image (No YOLO)")
+     print("="*60)
+
+     # Initialize without YOLO
+     pipeline = NumberPlateOCR(use_yolo=False, verbose=True)
+
+     # Load cropped plate
+     plate_img = cv2.imread(plate_image_path)
+
+     if plate_img is None:
+         print(f"Error: Could not load image: {plate_image_path}")
+         return None
+
+     # Process plate directly
+     result = pipeline.process_from_plate_image(
+         plate_img,
+         show_visualization=True
+     )
+
+     print(f"\nRecognized text: {result['singleline_text']}")
+     return result
+
+
+ def demo_batch_processing(image_dir: str):
+     """Demo: Batch process multiple images."""
+     print("\n" + "="*60)
+     print("DEMO 3: Batch Processing")
+     print("="*60)
+
+     # Get all images
+     image_extensions = {'.jpg', '.jpeg', '.png', '.bmp'}
+     images = [f for f in Path(image_dir).iterdir()
+               if f.suffix.lower() in image_extensions]
+
+     if not images:
+         print(f"No images found in: {image_dir}")
+         return []
+
+     print(f"Found {len(images)} images")
+
+     # Initialize pipeline once
+     pipeline = NumberPlateOCR(use_yolo=True, verbose=False)
+
+     results = []
+     for img_path in images:
+         print(f"\nProcessing: {img_path.name}")
+         try:
+             result = pipeline.process_image(
+                 str(img_path),
+                 show_visualization=False
+             )
+
+             for plate in result['plates']:
+                 print(f"  → {plate['singleline_text']}")
+
+             results.append(result)
+         except Exception as e:
+             print(f"  Error: {e}")
+
+     return results
+
+
+ def demo_api_usage():
+     """Demo: Show various API usage patterns."""
+     print("\n" + "="*60)
+     print("DEMO 4: API Usage Examples")
+     print("="*60)
+
+     print("""
+     # Example 1: Basic usage
+     from pipeline import NumberPlateOCR
+     ocr = NumberPlateOCR()
+     result = ocr.process_image('car.jpg')
+     print(result['plates'][0]['singleline_text'])
+
+     # Example 2: Just the OCR model
+     from pipeline import CharacterRecognizer
+     rec = CharacterRecognizer(
+         model_path='pipeline/final_models/ocr_model_em_np_eng.pth',
+         label_map_path='label_map.json'
+     )
+     char, conf, img = rec.predict(char_image)
+     print(f"{char}: {conf:.1%}")
+
+     # Example 3: Just the plate detector
+     from pipeline import PlateDetector
+     detector = PlateDetector()
+     detections = detector.detect('car.jpg')
+     for det in detections:
+         print(f"Plate at {det['bbox']} with {det['confidence']:.1%} confidence")
+
+     # Example 4: Top-k predictions
+     predictions = rec.get_top_k_predictions(char_image, k=5)
+     for char, conf in predictions:
+         print(f"  {char}: {conf:.1%}")
+     """)
+
+
+ if __name__ == "__main__":
+     print("="*60)
+     print("Number Plate OCR Pipeline - Demo")
+     print("="*60)
+
+     # Check for test images
+     project_root = Path(__file__).parent.parent
+
+     # Look for test images
+     test_images = list(project_root.glob("*.png")) + list(project_root.glob("*.jpg"))
+
+     if test_images:
+         test_image = str(test_images[0])
+         print(f"\nUsing test image: {test_image}")
+
+         # Run demo 1
+         demo_full_pipeline(test_image)
+     else:
+         print("\nNo test images found. Please provide an image path.")
+         print("Usage: python demo.py <image_path>")
+
+     # Show API examples
+     demo_api_usage()
final_models/best12345.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:99e5a27144d4768b13e63b553253dab37db83cf5b8e08576d3a97bcb40047601
+ size 5362053
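
The `.pt` file above is committed as a Git LFS pointer, not the weights themselves. After cloning, fetching the actual ~5 MB model requires LFS:

```shell
# One-time LFS setup, then fetch the real model weights
git lfs install
git lfs pull
```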
main.py ADDED
@@ -0,0 +1,542 @@
+ #!/usr/bin/env python3
+ """
+ ============== COMPLETE OCR PIPELINE (Multi-Line Support) ==============
+
+ This pipeline combines:
+ 1. YOLO-based number plate detection
+ 2. Character segmentation using contour detection
+ 3. OCR using a ResNet18-based model
+ 4. Multi-line plate support (for Nepali plates)
+
+ Usage:
+     python main.py <image_path>
+     python main.py <image_path> --no-yolo   # Skip YOLO detection
+     python main.py <image_path> --save      # Save results
+ """
+
+ import cv2
+ import numpy as np
+ import matplotlib.pyplot as plt
+ import argparse
+ import os
+ from pathlib import Path
+ from typing import List, Dict, Optional, Tuple
+ import json
+
+ # Local imports
+ from config.config import (
+     CONTOUR_CONFIG, INFERENCE_CONFIG, VIZ_CONFIG,
+     OCR_MODEL_PATH, LABEL_MAP_PATH, YOLO_MODEL_PATH,
+     setup_directories, get_device, RESULTS_DIR, CONTOURS_BW_DIR
+ )
+ from model.ocr import CharacterRecognizer
+ from model.plate_detector import get_detector
+ from utils.helper import (
+     detect_contours, filter_contours_by_size, extract_roi,
+     convert_to_binary, remove_overlapping_centers,
+     group_contours_by_line, format_plate_number,
+     draw_detections, calculate_confidence_stats, save_contour_images
+ )
+
+
+ class NumberPlateOCR:
+     """
+     Complete Number Plate OCR Pipeline.
+
+     Supports:
+     - YOLO-based plate detection (optional)
+     - Multi-line plate recognition
+     - Nepali and English characters
+     - Embossed number plates
+     """
+
+     def __init__(self, use_yolo: bool = True, verbose: bool = True):
+         """
+         Initialize the OCR pipeline.
+
+         Args:
+             use_yolo: Whether to use YOLO for plate detection
+             verbose: Print progress messages
+         """
+         self.verbose = verbose
+         self.device = get_device()
+
+         # Setup directories
+         setup_directories()
+
+         # Initialize OCR model
+         self._log("Loading OCR model...")
+         self.ocr = CharacterRecognizer(
+             model_path=str(OCR_MODEL_PATH),
+             label_map_path=str(LABEL_MAP_PATH),
+             device=self.device
+         )
+
+         # Initialize plate detector
+         self.use_yolo = use_yolo
+         if use_yolo:
+             self._log("Loading YOLO plate detector...")
+             self.detector = get_detector(use_yolo=True, model_path=str(YOLO_MODEL_PATH))
+         else:
+             self.detector = None
+
+         self._log("✓ Pipeline initialized successfully!")
+
+     @staticmethod
+     def _is_nepali_token(token: str) -> bool:
+         """Check if token is Nepali (Devanagari) or Nepali-specific label."""
+         if not token:
+             return False
+         if token == "Nepali Flag":
+             return True
+         return any('\u0900' <= ch <= '\u097F' for ch in token)
+
+     @staticmethod
+     def _is_english_token(token: str) -> bool:
+         """Check if token is plain English alphanumeric."""
+         if not token:
+             return False
+         return all(('0' <= ch <= '9') or ('A' <= ch <= 'Z') or ('a' <= ch <= 'z') for ch in token)
+
+     @staticmethod
+     def _english_digit_to_nepali(token: str) -> str:
+         """Convert English digits to Nepali digits (keeps non-digits unchanged)."""
+         digit_map = str.maketrans("0123456789", "०१२३४५६७८९")
+         return token.translate(digit_map)
+
+     def _apply_nepali_dominant_correction(self, line_results: List[Dict]):
+         """
+         If a line is predominantly Nepali, replace English predictions using
+         next Nepali top-k prediction from OCR model.
+         """
+         if not line_results:
+             return
+
+         nepali_count = sum(1 for r in line_results if self._is_nepali_token(r['char']))
+         english_count = sum(1 for r in line_results if self._is_english_token(r['char']))
+
+         if nepali_count <= english_count:
+             return
+
+         for r in line_results:
+             curr_char = r['char']
+             if not self._is_english_token(curr_char):
+                 continue
+
+             replacement_char = None
+             replacement_conf = None
+
+             top_k = self.ocr.get_top_k_predictions(r['_roi_bw'], k=5)
+             for candidate_char, candidate_conf in top_k[1:]:
+                 if self._is_nepali_token(candidate_char):
+                     replacement_char = candidate_char
+                     replacement_conf = candidate_conf
+                     break
+
+             if replacement_char is None and any(ch.isdigit() for ch in curr_char):
+                 replacement_char = self._english_digit_to_nepali(curr_char)
+                 replacement_conf = r['conf']
+
+             if replacement_char is not None:
+                 r['char'] = replacement_char
+                 r['conf'] = float(replacement_conf)
+
+     def _log(self, message: str):
+         """Print log message if verbose."""
+         if self.verbose:
+             print(message)
+
+     def process_image(self, image_path: str,
+                       save_contours: bool = False,
+                       show_visualization: bool = True) -> Dict:
+         """
+         Process an image and extract plate number.
+
+         Args:
+             image_path: Path to input image
+             save_contours: Whether to save extracted character images
+             show_visualization: Whether to display matplotlib visualizations
+
+         Returns:
+             Dict with recognition results
+         """
+         # Load image
+         self._log(f"\n{'='*60}")
+         self._log(f"Processing: {image_path}")
+         self._log(f"{'='*60}")
+
+         orig_image = cv2.imread(image_path)
+         gray_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
+
+         if orig_image is None:
+             raise ValueError(f"Could not load image: {image_path}")
+
+         # Step 1: Detect plates (optional YOLO step)
+         if self.use_yolo and self.detector:
+             self._log("\n📍 Step 1: Detecting number plates with YOLO...")
+             plates = self._detect_plates(orig_image)
+
+             if not plates:
+                 self._log("⚠ No plates detected by YOLO, processing full image...")
+                 plates = [{'plate_image': orig_image, 'bbox': None, 'confidence': 1.0}]
+         else:
+             self._log("\n📍 Step 1: Using full image (YOLO disabled)...")
+             plates = [{'plate_image': orig_image, 'bbox': None, 'confidence': 1.0}]
+
+         # Process each detected plate
+         all_results = []
+         for plate_idx, plate_data in enumerate(plates):
+             self._log(f"\n📋 Processing Plate {plate_idx + 1}/{len(plates)}")
+
+             plate_img = plate_data['plate_image']
+             plate_gray = cv2.cvtColor(plate_img, cv2.COLOR_BGR2GRAY) if len(plate_img.shape) == 3 else plate_img
+
+             # Step 2: Extract character contours
+             self._log("📍 Step 2: Detecting character contours...")
+             contours = self._extract_contours(plate_gray, plate_img)
+
+             if not contours:
+                 self._log("⚠ No characters detected in plate")
+                 continue
+
+             # Save contours if requested
+             if save_contours:
+                 self._log(f"  Saving contour images to {CONTOURS_BW_DIR}")
+                 save_contour_images(contours, plate_img, str(CONTOURS_BW_DIR))
+
+             # Step 3: Group by lines
+             self._log("📍 Step 3: Grouping characters by lines...")
+             lines = group_contours_by_line(contours)
+             self._log(f"  Detected {len(lines)} line(s)")
+             for i, line in enumerate(lines):
+                 self._log(f"    Line {i+1}: {len(line)} characters")
+
+             # Step 4: Run OCR
+             self._log("📍 Step 4: Running OCR on characters...")
+             ocr_results = self._run_ocr(lines, plate_img)
+
+             # Step 5: Format results
+             formatted = format_plate_number(lines, ocr_results)
+             confidence_stats = calculate_confidence_stats(ocr_results)
+
+             result = {
+                 'plate_index': plate_idx,
+                 'plate_bbox': plate_data['bbox'],
+                 'plate_confidence': plate_data.get('confidence', 1.0),
+                 'plate_image': plate_img,
+                 'lines': formatted['lines'],
+                 'multiline_text': formatted['multiline'],
+                 'singleline_text': formatted['singleline'],
+                 'num_lines': formatted['num_lines'],
+                 'total_chars': formatted['total_chars'],
+                 'details': formatted['details'],
+                 'confidence_stats': confidence_stats,
+                 'raw_ocr_results': ocr_results
+             }
+             all_results.append(result)
+
+             # Visualize
+             if show_visualization:
+                 self._visualize_plate(plate_img, lines, ocr_results, plate_idx)
+
+         # Print final summary
+         self._print_results(all_results)
+
+         return {
+             'image_path': image_path,
+             'num_plates': len(all_results),
+             'plates': all_results
+         }
+
+     def _detect_plates(self, image: np.ndarray) -> List[Dict]:
+         """Detect plates using YOLO."""
+         detections = self.detector.detect(image)
+
+         self._log(f"  Found {len(detections)} plate(s)")
+         for i, det in enumerate(detections):
+             self._log(f"    Plate {i+1}: confidence={det['confidence']:.2%}")
+
+         return detections
+
+     def _extract_contours(self, gray_image: np.ndarray,
+                           color_image: np.ndarray) -> List[Dict]:
+         """Extract and filter character contours."""
+
+         # Detect contours
+         contours, hierarchy, thresh = detect_contours(gray_image)
+         self._log(f"  Total contours found: {len(contours)}")
+
+         # Filter by size
+         filtered = filter_contours_by_size(contours, gray_image.shape)
+         self._log(f"  After size filter: {len(filtered)}")
+
+         # Sort by x position
+         sorted_contours = sorted(filtered, key=lambda c: (c['x'], c['y']))
+
+         # Remove only true edge artifacts (do not blindly drop first contours)
+         remove_edge_artifacts = CONTOUR_CONFIG.get("remove_edge_artifacts", True)
+         edge_margin = CONTOUR_CONFIG.get("edge_margin", 2)
+         if remove_edge_artifacts and len(sorted_contours) > 4:
+             image_h, image_w = gray_image.shape[:2]
+             non_edge_contours = [
+                 c for c in sorted_contours
+                 if (
+                     c['x'] > edge_margin and
+                     c['y'] > edge_margin and
+                     (c['x'] + c['w']) < (image_w - edge_margin) and
+                     (c['y'] + c['h']) < (image_h - edge_margin)
+                 )
+             ]
+
+             # Keep edge filtering only if it does not remove too many candidates
+             if len(non_edge_contours) >= max(3, int(0.6 * len(sorted_contours))):
+                 sorted_contours = non_edge_contours
+                 self._log(f"  After edge-artifact filter: {len(sorted_contours)}")
+
+         # Extract ROI for each contour
+         for c in sorted_contours:
+             roi = extract_roi(color_image, c)
+             c['roi_bw'] = convert_to_binary(roi)
+
+         # Remove overlapping centers (like inner hole of '0')
+         final_contours = remove_overlapping_centers(sorted_contours, verbose=self.verbose)
+         removed = len(sorted_contours) - len(final_contours)
+         if removed > 0:
+             self._log(f"  Removed {removed} overlapping contours")
+
+         return final_contours
+
+     def _run_ocr(self, lines: List[List[Dict]],
+                  plate_image: np.ndarray) -> List[List[Dict]]:
+         """Run OCR on grouped character lines."""
+
+         min_confidence = INFERENCE_CONFIG["min_confidence"]
+         results_by_line = []
+
+         for line_idx, line in enumerate(lines):
+             line_results = []
+
+             for c in line:
+                 char, conf, processed_img = self.ocr.predict(c['roi_bw'])
+
+                 if conf > min_confidence:
+                     line_results.append({
+                         'char': char,
+                         'conf': conf,
+                         'x': c['x'],
+                         'y': c['y'],
+                         'w': c['w'],
+                         'h': c['h'],
+                         'processed_img': processed_img,
+                         '_roi_bw': c['roi_bw']
+                     })
+
+             self._apply_nepali_dominant_correction(line_results)
+
+             for r in line_results:
+                 r.pop('_roi_bw', None)
+
+             results_by_line.append(line_results)
+
+         total_chars = sum(len(line) for line in results_by_line)
+         self._log(f"  Characters with confidence > {min_confidence*100:.0f}%: {total_chars}")
+
+         return results_by_line
+
+     def _visualize_plate(self, plate_image: np.ndarray,
+                          lines: List[List[Dict]],
+                          ocr_results: List[List[Dict]],
+                          plate_idx: int):
+         """Visualize OCR results."""
+
+         if not VIZ_CONFIG["show_plots"]:
+             return
+
+         # Show original plate
+         plt.figure(figsize=VIZ_CONFIG["figure_size"])
+         plt.imshow(cv2.cvtColor(plate_image, cv2.COLOR_BGR2RGB))
+         plt.title(f'Plate {plate_idx + 1} - {len(lines)} Line(s) Detected')
+         plt.axis('off')
+         plt.show()
+
+         # Show OCR results for each line
+         for line_idx, line_results in enumerate(ocr_results):
+             n = len(line_results)
+             if n > 0:
+                 cols = min(VIZ_CONFIG["max_cols"], n)
+                 rows = (n + cols - 1) // cols
+
+                 fig, axes = plt.subplots(rows, cols, figsize=(cols*1.5, rows*2))
+                 axes = np.array(axes).reshape(-1) if n > 1 else [axes]
+
+                 for i, r in enumerate(line_results):
+                     axes[i].imshow(r['processed_img'], cmap='gray')
+                     axes[i].set_title(f'"{r["char"]}" ({r["conf"]:.0%})',
+                                       fontsize=VIZ_CONFIG["font_size"])
376
+ axes[i].axis('off')
377
+
378
+ # Hide empty subplots
379
+ for i in range(n, len(axes)):
380
+ axes[i].axis('off')
381
+
382
+ line_text = "".join([r['char'] for r in line_results])
383
+ plt.suptitle(f'Line {line_idx+1}: "{line_text}"', fontsize=12)
384
+ plt.tight_layout()
385
+ plt.show()
386
+
387
+ def _print_results(self, results: List[Dict]):
388
+ """Print formatted results."""
389
+
390
+ print("\n" + "="*60)
391
+ print("📋 PLATE NUMBER RECOGNITION RESULTS")
392
+ print("="*60)
393
+
394
+ for result in results:
395
+ plate_idx = result['plate_index'] + 1
396
+
397
+ print(f"\n🏷️ PLATE {plate_idx}:")
398
+ print("-"*40)
399
+
400
+ for line_detail in result['details']:
401
+ print(f"\n 📌 Line {line_detail['line_num']}:")
402
+ for i, char_info in enumerate(line_detail['characters']):
403
+ print(f" {i+1}. '{char_info['char']}' ({char_info['conf']:.1%})")
404
+ print(f" → Result: {line_detail['text']}")
405
+
406
+ # Final result
407
+ print("\n" + "-"*40)
408
+ if result['num_lines'] > 1:
409
+ print(" Multi-line format:")
410
+ for i, line in enumerate(result['lines']):
411
+ print(f" Line {i+1}: {line}")
412
+ print(f"\n Single-line: {result['singleline_text']}")
413
+ else:
414
+ text = result['lines'][0] if result['lines'] else 'No characters detected'
415
+ print(f" Result: {text}")
416
+
417
+ # Confidence stats
418
+ stats = result['confidence_stats']
419
+ print(f"\n Confidence: avg={stats['mean']:.1%}, min={stats['min']:.1%}, max={stats['max']:.1%}")
420
+
421
+ print("\n" + "="*60)
422
+
423
+ def process_from_plate_image(self, plate_image: np.ndarray,
424
+ show_visualization: bool = True) -> Dict:
425
+ """
426
+ Process a pre-cropped plate image (skip YOLO detection).
427
+
428
+ Args:
429
+ plate_image: Cropped plate image (BGR)
430
+ show_visualization: Whether to show plots
431
+
432
+ Returns:
433
+ Recognition result dict
434
+ """
435
+ plate_gray = cv2.cvtColor(plate_image, cv2.COLOR_BGR2GRAY) if len(plate_image.shape) == 3 else plate_image
436
+
437
+ # Extract contours
438
+ contours = self._extract_contours(plate_gray, plate_image)
439
+
440
+ if not contours:
441
+ return {'lines': [], 'singleline_text': '', 'total_chars': 0}
442
+
443
+ # Group by lines
444
+ lines = group_contours_by_line(contours)
445
+
446
+ # Run OCR
447
+ ocr_results = self._run_ocr(lines, plate_image)
448
+
449
+ # Format results
450
+ formatted = format_plate_number(lines, ocr_results)
451
+
452
+ if show_visualization:
453
+ self._visualize_plate(plate_image, lines, ocr_results, 0)
454
+
455
+ return {
456
+ 'lines': formatted['lines'],
457
+ 'multiline_text': formatted['multiline'],
458
+ 'singleline_text': formatted['singleline'],
459
+ 'num_lines': formatted['num_lines'],
460
+ 'total_chars': formatted['total_chars'],
461
+ 'details': formatted['details'],
462
+ 'confidence_stats': calculate_confidence_stats(ocr_results)
463
+ }
464
+
465
+
466
+ def main():
467
+ """Main entry point."""
468
+ parser = argparse.ArgumentParser(
469
+ description="Number Plate OCR Pipeline",
470
+ formatter_class=argparse.RawDescriptionHelpFormatter,
471
+ epilog="""
472
+ Examples:
473
+ python main.py image.jpg
474
+ python main.py image.jpg --no-yolo
475
+ python main.py image.jpg --save --no-viz
476
+ python main.py image.jpg --output results.json
477
+ """
478
+ )
479
+
480
+ parser.add_argument('image', type=str, help='Path to input image')
481
+ parser.add_argument('--no-yolo', action='store_true',
482
+ help='Skip YOLO plate detection')
483
+ parser.add_argument('--save', action='store_true',
484
+ help='Save extracted character images')
485
+ parser.add_argument('--no-viz', action='store_true',
486
+ help='Disable visualization')
487
+ parser.add_argument('--output', '-o', type=str,
488
+ help='Save results to JSON file')
489
+ parser.add_argument('--quiet', '-q', action='store_true',
490
+ help='Suppress progress messages')
491
+
492
+ args = parser.parse_args()
493
+
494
+ # Validate input
495
+ if not os.path.exists(args.image):
496
+ print(f"Error: Image not found: {args.image}")
497
+ return 1
498
+
499
+ # Initialize pipeline
500
+ pipeline = NumberPlateOCR(
501
+ use_yolo=not args.no_yolo,
502
+ verbose=not args.quiet
503
+ )
504
+
505
+ # Process image
506
+ results = pipeline.process_image(
507
+ args.image,
508
+ save_contours=args.save,
509
+ show_visualization=not args.no_viz
510
+ )
511
+
512
+ # Save results if requested
513
+ if args.output:
514
+ # Remove non-serializable items
515
+ save_results = {
516
+ 'image_path': results['image_path'],
517
+ 'num_plates': results['num_plates'],
518
+ 'plates': []
519
+ }
520
+
521
+ for plate in results['plates']:
522
+ save_plate = {
523
+ 'plate_index': plate['plate_index'],
524
+ 'plate_bbox': plate['plate_bbox'],
525
+ 'lines': plate['lines'],
526
+ 'multiline_text': plate['multiline_text'],
527
+ 'singleline_text': plate['singleline_text'],
528
+ 'num_lines': plate['num_lines'],
529
+ 'total_chars': plate['total_chars'],
530
+ 'confidence_stats': plate['confidence_stats']
531
+ }
532
+ save_results['plates'].append(save_plate)
533
+
534
+ with open(args.output, 'w', encoding='utf-8') as f:
535
+ json.dump(save_results, f, indent=2, ensure_ascii=False)
536
+ print(f"\n✓ Results saved to: {args.output}")
537
+
538
+ return 0
539
+
540
+
541
+ if __name__ == "__main__":
542
+ exit(main())
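The `--output` branch in `main()` copies only JSON-serializable fields out of each plate result before dumping, leaving numpy images behind. A minimal standalone sketch of that whitelist-style filtering on plain dicts (the helper name `to_json_safe` and the sample values are hypothetical; the key names mirror the code above):

```python
import json

# Whitelist of keys that are safe to serialize (no numpy arrays).
SAFE_KEYS = ('plate_index', 'lines', 'singleline_text', 'num_lines', 'total_chars')

def to_json_safe(plate: dict) -> dict:
    """Return a copy of `plate` restricted to JSON-serializable fields."""
    return {k: plate[k] for k in SAFE_KEYS if k in plate}

plate = {
    'plate_index': 0,
    'lines': ['BA 2 PA 1234'],
    'singleline_text': 'BA 2 PA 1234',
    'num_lines': 1,
    'total_chars': 10,
    'plate_image': object(),  # stands in for a numpy image; not JSON-serializable
}

safe = to_json_safe(plate)
print(json.dumps(safe, ensure_ascii=False))
```

`ensure_ascii=False` matters here for the same reason as in `main()`: Devanagari characters survive in the output file instead of being escaped.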
model/__init__.py ADDED
@@ -0,0 +1,5 @@
+ """Model modules for OCR pipeline."""
+ from .ocr import OCRModel, CharacterRecognizer
+ from .plate_detector import PlateDetector, PlateDetectorLite, get_detector
+
+ __all__ = ['OCRModel', 'CharacterRecognizer', 'PlateDetector', 'PlateDetectorLite', 'get_detector']
model/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (496 Bytes). View file
 
model/__pycache__/ocr.cpython-311.pyc ADDED
Binary file (11.9 kB). View file
 
model/__pycache__/plate_detector.cpython-311.pyc ADDED
Binary file (13.9 kB). View file
 
model/ocr.py ADDED
@@ -0,0 +1,202 @@
+ """
+ OCR Model Definition and Inference for Number Plate Character Recognition
+ """
+ import torch
+ import torch.nn as nn
+ import numpy as np
+ from torchvision import models
+ import json
+ from sklearn.preprocessing import LabelEncoder
+ from pathlib import Path
+ import cv2
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import OCR_CONFIG, PREPROCESS_CONFIG, get_device
+
+
+ class OCRModel(nn.Module):
+     """
+     ResNet18-based OCR model for character recognition.
+     Supports grayscale input images.
+     """
+     def __init__(self, num_classes: int):
+         super(OCRModel, self).__init__()
+
+         # Use ResNet18 as backbone
+         self.features = models.resnet18(pretrained=OCR_CONFIG.get("pretrained", False))
+
+         # Modify first conv layer to accept single channel (grayscale)
+         self.features.conv1 = nn.Conv2d(
+             1, 64, kernel_size=7, stride=2, padding=3, bias=False
+         )
+
+         # Remove the original FC layer
+         self.features.fc = nn.Identity()
+
+         # Custom classifier head
+         self.classifier = nn.Sequential(
+             nn.Linear(512, 256),
+             nn.ReLU(inplace=True),
+             nn.Dropout(0.5),
+             nn.Linear(256, num_classes)
+         )
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         features = self.features(x)
+         return self.classifier(features)
+
+
+ class CharacterRecognizer:
+     """
+     High-level wrapper for character recognition.
+     Handles model loading, preprocessing, and inference.
+     """
+     def __init__(self, model_path: str, label_map_path: str, device: torch.device = None):
+         self.device = device or get_device()
+         self.model_path = Path(model_path)
+         self.label_map_path = Path(label_map_path)
+
+         # Load label map
+         self._load_label_map()
+
+         # Initialize and load model
+         self._load_model()
+
+         # Setup CLAHE
+         self.clahe = cv2.createCLAHE(
+             clipLimit=PREPROCESS_CONFIG["clahe_clip_limit"],
+             tileGridSize=PREPROCESS_CONFIG["clahe_grid_size"]
+         )
+
+     def _load_label_map(self):
+         """Load label map from JSON file."""
+         with open(self.label_map_path, 'r', encoding='utf-8') as f:
+             self.label_map = json.load(f)
+
+         self.num_classes = len(self.label_map)
+
+         # Setup label encoder
+         self.label_encoder = LabelEncoder()
+         self.label_encoder.classes_ = np.array([
+             self.label_map[str(i)] for i in range(self.num_classes)
+         ])
+
+     def _load_model(self):
+         """Load trained model weights."""
+         self.model = OCRModel(self.num_classes).to(self.device)
+         self.model.load_state_dict(
+             torch.load(self.model_path, map_location=self.device)
+         )
+         self.model.eval()
+         print(f"✓ OCR Model loaded on: {self.device}")
+
+     def preprocess(self, img_region: np.ndarray) -> tuple:
+         """
+         Preprocess image region for OCR.
+
+         Args:
+             img_region: Grayscale image region (numpy array)
+
+         Returns:
+             Tuple of (tensor, preprocessed_image)
+         """
+         input_size = OCR_CONFIG["input_size"]
+
+         # Resize to model input size
+         img_resized = cv2.resize(img_region, input_size)
+
+         # Apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
+         img_eq = self.clahe.apply(img_resized)
+
+         # Apply Gaussian blur to reduce noise
+         img_blur = cv2.GaussianBlur(
+             img_eq, PREPROCESS_CONFIG["gaussian_blur_kernel"], 0
+         )
+
+         # Convert to tensor and normalize
+         img_tensor = torch.from_numpy(img_blur).unsqueeze(0).unsqueeze(0).float() / 255.0
+         img_tensor = img_tensor.to(self.device)
+
+         return img_tensor, img_blur
+
+     def predict(self, img_region: np.ndarray) -> tuple:
+         """
+         Perform OCR on a single image region.
+
+         Args:
+             img_region: Grayscale image region
+
+         Returns:
+             Tuple of (predicted_char, confidence, preprocessed_image)
+         """
+         img_tensor, preprocessed_img = self.preprocess(img_region)
+
+         with torch.no_grad():
+             output = self.model(img_tensor)
+             predicted_index = output.argmax(dim=1).item()
+             confidence = torch.softmax(output, dim=1).max().item()
+
+         predicted_char = self.label_encoder.inverse_transform([predicted_index])[0]
+
+         return predicted_char, confidence, preprocessed_img
+
+     def predict_batch(self, img_regions: list) -> list:
+         """
+         Perform OCR on multiple image regions.
+
+         Args:
+             img_regions: List of grayscale image regions
+
+         Returns:
+             List of (predicted_char, confidence, preprocessed_image) tuples
+         """
+         if not img_regions:
+             return []
+
+         # Preprocess all images
+         tensors = []
+         preprocessed_imgs = []
+         for img in img_regions:
+             tensor, preprocessed = self.preprocess(img)
+             tensors.append(tensor)
+             preprocessed_imgs.append(preprocessed)
+
+         # Stack tensors for batch inference
+         batch_tensor = torch.cat(tensors, dim=0)
+
+         with torch.no_grad():
+             outputs = self.model(batch_tensor)
+             predicted_indices = outputs.argmax(dim=1).cpu().numpy()
+             confidences = torch.softmax(outputs, dim=1).max(dim=1).values.cpu().numpy()
+
+         # Decode predictions
+         predicted_chars = self.label_encoder.inverse_transform(predicted_indices)
+
+         return list(zip(predicted_chars, confidences, preprocessed_imgs))
+
+     def get_top_k_predictions(self, img_region: np.ndarray, k: int = 5) -> list:
+         """
+         Get top-k predictions with confidence scores.
+
+         Args:
+             img_region: Grayscale image region
+             k: Number of top predictions to return
+
+         Returns:
+             List of (char, confidence) tuples
+         """
+         img_tensor, _ = self.preprocess(img_region)
+
+         with torch.no_grad():
+             output = self.model(img_tensor)
+             probs = torch.softmax(output, dim=1)[0]
+             top_k = torch.topk(probs, k)
+
+         results = []
+         for idx, conf in zip(top_k.indices.cpu().numpy(), top_k.values.cpu().numpy()):
+             char = self.label_encoder.inverse_transform([idx])[0]
+             results.append((char, float(conf)))
+
+         return results
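`predict` turns the model's raw logits into a confidence by taking the maximum of a softmax over the class scores. The same arithmetic in plain Python (no torch), just to make the confidence definition concrete; the sample logits are made up:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.5]  # e.g. per-class scores for one character crop
probs = softmax(logits)
pred_index = max(range(len(probs)), key=probs.__getitem__)  # argmax
confidence = probs[pred_index]
```

A large gap between the top logit and the rest yields a confidence near 1.0; near-equal logits yield a confidence near 1/num_classes, which is why the pipeline can meaningfully threshold on `min_confidence`.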
model/plate_detector.py ADDED
@@ -0,0 +1,321 @@
+ """
+ YOLO-based Number Plate Detection Module
+ """
+ import cv2
+ import numpy as np
+ from pathlib import Path
+ from typing import List, Dict, Optional, Union, Tuple
+ import torch
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import YOLO_CONFIG, YOLO_MODEL_PATH, get_device
+
+
+ class PlateDetector:
+     """
+     YOLO-based number plate detector.
+     Detects number plates in images and returns bounding boxes.
+     """
+
+     def __init__(self, model_path: str = None, device: torch.device = None):
+         """
+         Initialize the plate detector.
+
+         Args:
+             model_path: Path to YOLO model weights (default from config)
+             device: Torch device for inference
+         """
+         self.model_path = Path(model_path) if model_path else YOLO_MODEL_PATH
+         self.device = device or get_device()
+         self.model = None
+
+         self._load_model()
+
+     def _load_model(self):
+         """Load YOLO model."""
+         try:
+             from ultralytics import YOLO
+
+             if not self.model_path.exists():
+                 raise FileNotFoundError(f"YOLO model not found at: {self.model_path}")
+
+             self.model = YOLO(str(self.model_path))
+
+             # Set device
+             device_str = "cuda" if self.device.type == "cuda" else "cpu"
+             self.model.to(device_str)
+
+             print(f"✓ YOLO Plate Detector loaded on: {self.device}")
+
+         except ImportError:
+             raise ImportError("ultralytics package is required. Install with: pip install ultralytics")
+
+     def detect(self, image: Union[str, np.ndarray],
+                conf_threshold: float = None,
+                iou_threshold: float = None) -> List[Dict]:
+         """
+         Detect number plates in an image.
+
+         Args:
+             image: Image path or numpy array (BGR format)
+             conf_threshold: Confidence threshold (default from config)
+             iou_threshold: IOU threshold for NMS (default from config)
+
+         Returns:
+             List of detection dicts with keys: 'bbox', 'confidence', 'class', 'plate_image'
+         """
+         if conf_threshold is None:
+             conf_threshold = YOLO_CONFIG["confidence_threshold"]
+         if iou_threshold is None:
+             iou_threshold = YOLO_CONFIG["iou_threshold"]
+
+         # Load image if path provided
+         if isinstance(image, str):
+             img = cv2.imread(image)
+             if img is None:
+                 raise ValueError(f"Could not load image: {image}")
+         else:
+             img = image.copy()
+
+         # Run inference
+         results = self.model(
+             img,
+             conf=conf_threshold,
+             iou=iou_threshold,
+             verbose=False
+         )
+
+         # Parse results
+         detections = []
+         for result in results:
+             boxes = result.boxes
+
+             if boxes is None or len(boxes) == 0:
+                 continue
+
+             for i in range(len(boxes)):
+                 # Get bounding box coordinates (x1, y1, x2, y2)
+                 bbox = boxes.xyxy[i].cpu().numpy().astype(int)
+                 conf = float(boxes.conf[i].cpu().numpy())
+                 cls = int(boxes.cls[i].cpu().numpy()) if boxes.cls is not None else 0
+
+                 x1, y1, x2, y2 = bbox
+
+                 # Extract plate region
+                 plate_img = img[y1:y2, x1:x2].copy()
+
+                 detections.append({
+                     'bbox': {
+                         'x1': int(x1),
+                         'y1': int(y1),
+                         'x2': int(x2),
+                         'y2': int(y2),
+                         'width': int(x2 - x1),
+                         'height': int(y2 - y1)
+                     },
+                     'confidence': conf,
+                     'class': cls,
+                     'plate_image': plate_img
+                 })
+
+         return detections
+
+     def detect_and_crop(self, image: Union[str, np.ndarray],
+                         expand_ratio: float = 0.1) -> List[np.ndarray]:
+         """
+         Detect plates and return cropped plate images.
+
+         Args:
+             image: Image path or numpy array
+             expand_ratio: Ratio to expand bounding box (default 10%)
+
+         Returns:
+             List of cropped plate images
+         """
+         detections = self.detect(image)
+
+         plates = []
+         for det in detections:
+             bbox = det['bbox']
+
+             if expand_ratio > 0:
+                 # Calculate expansion
+                 w_expand = int(bbox['width'] * expand_ratio)
+                 h_expand = int(bbox['height'] * expand_ratio)
+
+                 # Load original image
+                 if isinstance(image, str):
+                     img = cv2.imread(image)
+                 else:
+                     img = image
+
+                 h, w = img.shape[:2]
+
+                 # Expanded coordinates
+                 x1 = max(0, bbox['x1'] - w_expand)
+                 y1 = max(0, bbox['y1'] - h_expand)
+                 x2 = min(w, bbox['x2'] + w_expand)
+                 y2 = min(h, bbox['y2'] + h_expand)
+
+                 plates.append(img[y1:y2, x1:x2].copy())
+             else:
+                 plates.append(det['plate_image'])
+
+         return plates
+
+     def draw_detections(self, image: Union[str, np.ndarray],
+                         detections: List[Dict] = None,
+                         color: Tuple[int, int, int] = (0, 255, 0),
+                         thickness: int = 2) -> np.ndarray:
+         """
+         Draw detection boxes on image.
+
+         Args:
+             image: Image path or numpy array
+             detections: List of detections (if None, will detect)
+             color: Box color in BGR
+             thickness: Line thickness
+
+         Returns:
+             Annotated image
+         """
+         # Load image
+         if isinstance(image, str):
+             img = cv2.imread(image)
+         else:
+             img = image.copy()
+
+         # Detect if not provided
+         if detections is None:
+             detections = self.detect(img)
+
+         for det in detections:
+             bbox = det['bbox']
+             conf = det['confidence']
+
+             # Draw rectangle
+             cv2.rectangle(
+                 img,
+                 (bbox['x1'], bbox['y1']),
+                 (bbox['x2'], bbox['y2']),
+                 color,
+                 thickness
+             )
+
+             # Draw label
+             label = f"Plate: {conf:.2%}"
+             cv2.putText(
+                 img, label,
+                 (bbox['x1'], bbox['y1'] - 10),
+                 cv2.FONT_HERSHEY_SIMPLEX,
+                 0.6, color, 2
+             )
+
+         return img
+
+
+ class PlateDetectorLite:
+     """
+     Lightweight plate detector using OpenCV (no YOLO).
+     Uses image processing techniques to find plate regions.
+     Useful when YOLO is not available.
+     """
+
+     def __init__(self):
+         """Initialize the lite detector."""
+         print("✓ PlateDetectorLite initialized (OpenCV-based)")
+
+     def detect(self, image: Union[str, np.ndarray],
+                min_area: int = 5000,
+                aspect_ratio_range: Tuple[float, float] = (1.5, 5.0)) -> List[Dict]:
+         """
+         Detect potential plate regions using edge detection.
+
+         Args:
+             image: Image path or numpy array
+             min_area: Minimum contour area
+             aspect_ratio_range: (min, max) aspect ratio for plates
+
+         Returns:
+             List of detection dicts
+         """
+         # Load image
+         if isinstance(image, str):
+             img = cv2.imread(image)
+         else:
+             img = image.copy()
+
+         # Convert to grayscale
+         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
+
+         # Apply bilateral filter to reduce noise while keeping edges sharp
+         blur = cv2.bilateralFilter(gray, 11, 17, 17)
+
+         # Edge detection
+         edges = cv2.Canny(blur, 30, 200)
+
+         # Find contours
+         contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
+
+         # Sort by area (largest first)
+         contours = sorted(contours, key=cv2.contourArea, reverse=True)[:20]
+
+         detections = []
+         for cnt in contours:
+             area = cv2.contourArea(cnt)
+             if area < min_area:
+                 continue
+
+             # Approximate the contour
+             peri = cv2.arcLength(cnt, True)
+             approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
+
+             # Looking for rectangles (4 corners)
+             if len(approx) >= 4:
+                 x, y, w, h = cv2.boundingRect(cnt)
+                 aspect_ratio = w / h if h > 0 else 0
+
+                 # Check if aspect ratio matches plate dimensions
+                 if aspect_ratio_range[0] <= aspect_ratio <= aspect_ratio_range[1]:
+                     plate_img = img[y:y+h, x:x+w].copy()
+
+                     detections.append({
+                         'bbox': {
+                             'x1': x, 'y1': y,
+                             'x2': x + w, 'y2': y + h,
+                             'width': w, 'height': h
+                         },
+                         'confidence': 0.5,  # Estimated confidence
+                         'class': 0,
+                         'plate_image': plate_img
+                     })
+
+         return detections
+
+     def detect_and_crop(self, image: Union[str, np.ndarray]) -> List[np.ndarray]:
+         """Get cropped plate images."""
+         detections = self.detect(image)
+         return [det['plate_image'] for det in detections]
+
+
+ def get_detector(use_yolo: bool = True, model_path: str = None) -> Union[PlateDetector, PlateDetectorLite]:
+     """
+     Factory function to get appropriate detector.
+
+     Args:
+         use_yolo: Whether to use YOLO detector
+         model_path: Path to YOLO model
+
+     Returns:
+         Detector instance
+     """
+     if use_yolo:
+         try:
+             return PlateDetector(model_path)
+         except (ImportError, FileNotFoundError) as e:
+             print(f"⚠ YOLO not available: {e}")
+             print(" Falling back to PlateDetectorLite")
+             return PlateDetectorLite()
+     else:
+         return PlateDetectorLite()
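`get_detector` follows a common try-import / graceful-degradation pattern: attempt the heavyweight detector, and fall back to the dependency-free one when the optional package or model file is missing. A stripped-down, self-contained sketch of just the pattern (the class and module names here are stand-ins, not this repo's):

```python
class HeavyDetector:
    """Stands in for a detector that needs an optional dependency."""
    def __init__(self):
        # Fails with ImportError when the optional package is absent,
        # which is exactly what the factory catches below.
        import some_missing_optional_package  # hypothetical module name

class LiteDetector:
    """Dependency-free fallback detector."""

def make_detector(prefer_heavy: bool = True):
    """Return the heavy detector when possible, else the lite fallback."""
    if prefer_heavy:
        try:
            return HeavyDetector()
        except ImportError as e:
            print(f"heavy detector unavailable ({e}); falling back")
    return LiteDetector()

detector = make_detector()
```

Catching `ImportError` (and, in the real module, `FileNotFoundError` for missing weights) at the factory keeps the failure local: callers always get a working detector object with the same interface.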
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ fastapi
+ uvicorn
+ opencv-python-headless
+ numpy
+ torchvision
+ torch
+ scikit-learn  # model/ocr.py imports sklearn.preprocessing.LabelEncoder
+ matplotlib  # pipeline visualization (plt)
+ ultralytics  # optional: YOLO detector; falls back to PlateDetectorLite without it
utils/__init__.py ADDED
@@ -0,0 +1,32 @@
+ """Helper utilities for OCR pipeline."""
+ from .helper import (
+     detect_contours,
+     filter_contours_by_size,
+     extract_roi,
+     convert_to_binary,
+     remove_overlapping_centers,
+     group_contours_by_line,
+     format_plate_number,
+     draw_detections,
+     calculate_confidence_stats,
+     save_contour_images,
+     preprocess_plate_image,
+     resize_with_aspect_ratio,
+     validate_plate_format
+ )
+
+ __all__ = [
+     'detect_contours',
+     'filter_contours_by_size',
+     'extract_roi',
+     'convert_to_binary',
+     'remove_overlapping_centers',
+     'group_contours_by_line',
+     'format_plate_number',
+     'draw_detections',
+     'calculate_confidence_stats',
+     'save_contour_images',
+     'preprocess_plate_image',
+     'resize_with_aspect_ratio',
+     'validate_plate_format'
+ ]
utils/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (830 Bytes). View file
 
utils/__pycache__/helper.cpython-311.pyc ADDED
Binary file (20.9 kB). View file
 
utils/helper.py ADDED
@@ -0,0 +1,505 @@
+ """
+ Helper utilities for the OCR Number Plate Pipeline
+ """
+ import cv2
+ import numpy as np
+ from typing import List, Dict, Tuple, Optional
+ import os
+ from pathlib import Path
+
+ import sys
+ sys.path.append(str(Path(__file__).parent.parent))
+ from config.config import CONTOUR_CONFIG, LINE_CONFIG, PREPROCESS_CONFIG
+
+
+ # ============== CONTOUR PROCESSING ==============
+
+ def detect_contours(image: np.ndarray, threshold: int = None) -> tuple:
+     """
+     Detect contours in a grayscale image.
+
+     Args:
+         image: Grayscale input image
+         threshold: Binary threshold value (default from config)
+
+     Returns:
+         Tuple of (contours, hierarchy, thresholded_image)
+     """
+     if threshold is None:
+         threshold = PREPROCESS_CONFIG["binary_threshold"]
+
+     _, thresh = cv2.threshold(image, threshold, 255, cv2.THRESH_BINARY)
+     contours, hierarchy = cv2.findContours(
+         thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
+     )
+
+     return contours, hierarchy, thresh
+
+
+ def filter_contours_by_size(contours: list, image_shape: tuple) -> List[Dict]:
+     """
+     Filter contours by minimum size requirements.
+
+     Args:
+         contours: List of OpenCV contours
+         image_shape: Shape of the original image (height, width)
+
+     Returns:
+         List of dicts with contour information
+     """
+     min_area = CONTOUR_CONFIG["min_area"]
+     min_width = CONTOUR_CONFIG["min_width"]
+     min_height = CONTOUR_CONFIG["min_height"]
+
+     image_h, image_w = image_shape[:2]
+     min_aspect = CONTOUR_CONFIG.get("min_aspect_ratio", 0.12)
+     max_aspect = CONTOUR_CONFIG.get("max_aspect_ratio", 2.8)
+     max_width_ratio = CONTOUR_CONFIG.get("max_width_ratio", 0.45)
+     max_height_ratio = CONTOUR_CONFIG.get("max_height_ratio", 0.95)
+
+     size_filtered = []
+     prefiltered = []
+     for idx, cnt in enumerate(contours):
+         x, y, w, h = cv2.boundingRect(cnt)
+
+         # Skip if too small
+         if w < min_width or h < min_height or w * h < min_area:
+             continue
+
+         base_item = {
+             'idx': idx,
+             'x': x,
+             'y': y,
+             'w': w,
+             'h': h,
+             'area': w * h,
+             'contour': cnt
+         }
+         size_filtered.append(base_item)
+
+         # Skip obvious non-character blobs
+         if image_w > 0 and (w / image_w) > max_width_ratio:
+             continue
+         if image_h > 0 and (h / image_h) > max_height_ratio:
+             continue
+
+         aspect_ratio = w / max(h, 1)
+         if aspect_ratio < min_aspect or aspect_ratio > max_aspect:
+             continue
+
+         candidate = dict(base_item)
+         candidate['aspect_ratio'] = aspect_ratio
+         prefiltered.append(candidate)
+
+     if len(prefiltered) <= 2:
+         return prefiltered if prefiltered else size_filtered
+
+     # If shape rules are too strict for this plate, fall back to the basic size filter
+     if len(size_filtered) > 0 and len(prefiltered) < max(3, int(0.50 * len(size_filtered))):
+         prefiltered = size_filtered
+
+     # Adaptive pass: keep contours with character-like size relative to median stats
+     heights = np.array([c['h'] for c in prefiltered], dtype=np.float32)
+     widths = np.array([c['w'] for c in prefiltered], dtype=np.float32)
+     areas = np.array([c['area'] for c in prefiltered], dtype=np.float32)
+
+     median_h = float(np.median(heights))
+     median_w = float(np.median(widths))
+     median_area = float(np.median(areas))
+
+     min_h_rel = CONTOUR_CONFIG.get("min_height_rel_median", 0.45)
+     max_h_rel = CONTOUR_CONFIG.get("max_height_rel_median", 2.2)
+     min_w_rel = CONTOUR_CONFIG.get("min_width_rel_median", 0.35)
+     max_w_rel = CONTOUR_CONFIG.get("max_width_rel_median", 2.4)
+     min_area_rel = CONTOUR_CONFIG.get("min_area_rel_median", 0.20)
+     max_area_rel = CONTOUR_CONFIG.get("max_area_rel_median", 3.8)
+
+     filtered = []
+     for c in prefiltered:
+         h_ok = (median_h * min_h_rel) <= c['h'] <= (median_h * max_h_rel)
+         w_ok = (median_w * min_w_rel) <= c['w'] <= (median_w * max_w_rel)
+         area_ok = (median_area * min_area_rel) <= c['area'] <= (median_area * max_area_rel)
+
+         # Keep contour if it satisfies at least two of three adaptive constraints
+         if (h_ok + w_ok + area_ok) >= 2:
+             filtered.append(c)
+
+     # Fall back to prefiltered if the adaptive stage is too aggressive
+     if len(filtered) < max(2, int(0.35 * len(prefiltered))):
+         return prefiltered
+
+     return filtered
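The adaptive pass above relies on Python booleans behaving as integers, so `h_ok + w_ok + area_ok >= 2` is a two-of-three majority vote: a contour slightly outside one median band can still survive if the other two checks hold. A tiny standalone illustration of the idiom:

```python
def passes_majority(h_ok: bool, w_ok: bool, area_ok: bool) -> bool:
    """True when at least two of the three checks hold (bools sum as 0/1)."""
    return (h_ok + w_ok + area_ok) >= 2

# A contour slightly too wide can still survive if height and area look right.
print(passes_majority(True, False, True))  # prints True
```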
132
+
+
+ def extract_roi(image: np.ndarray, contour_data: Dict, padding: int = None) -> np.ndarray:
+     """
+     Extract ROI (Region of Interest) from image with padding.
+
+     Args:
+         image: Original image (color or grayscale)
+         contour_data: Dict with 'x', 'y', 'w', 'h' keys
+         padding: Padding around the ROI
+
+     Returns:
+         Cropped ROI image
+     """
+     if padding is None:
+         padding = CONTOUR_CONFIG["padding"]
+
+     x, y, w, h = contour_data['x'], contour_data['y'], contour_data['w'], contour_data['h']
+
+     # Calculate padded boundaries
+     x1 = max(x - padding, 0)
+     y1 = max(y - padding, 0)
+     x2 = min(x + w + padding, image.shape[1])
+     y2 = min(y + h + padding, image.shape[0])
+
+     return image[y1:y2, x1:x2]
+
+
+ def convert_to_binary(image: np.ndarray) -> np.ndarray:
+     """
+     Convert image to binary (black and white).
+
+     Args:
+         image: Input image (color or grayscale)
+
+     Returns:
+         Binary image
+     """
+     if len(image.shape) == 3:
+         gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     else:
+         gray = image
+
+     if PREPROCESS_CONFIG["otsu_threshold"]:
+         _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
+     else:
+         _, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)
+
+     return binary
+
+
+ def remove_overlapping_centers(contours: List[Dict], center_threshold: int = None, verbose: bool = False) -> List[Dict]:
+     """
+     Remove contours that have nearly the same center (like inner/outer of '0').
+     Keeps the LARGER contour when centers overlap.
+
+     Args:
+         contours: List of contour dicts with 'x', 'y', 'w', 'h' keys
+         center_threshold: Max distance between centers to consider as duplicate
+         verbose: Print debug information
+
+     Returns:
+         Filtered list of contours
+     """
+     if not contours:
+         return []
+
+     if center_threshold is None:
+         center_threshold = CONTOUR_CONFIG["center_threshold"]
+
+     # Calculate center for each contour
+     for c in contours:
+         c['cx'] = c['x'] + c['w'] // 2
+         c['cy'] = c['y'] + c['h'] // 2
+         if 'area' not in c:
+             c['area'] = c['w'] * c['h']
+
+     # Sort by area (largest first)
+     sorted_contours = sorted(contours, key=lambda c: c['area'], reverse=True)
+
+     filtered = []
+     for curr in sorted_contours:
+         is_duplicate = False
+
+         for existing in filtered:
+             dx = abs(curr['cx'] - existing['cx'])
+             dy = abs(curr['cy'] - existing['cy'])
+             center_distance = (dx**2 + dy**2) ** 0.5
+
+             if center_distance < center_threshold:
+                 is_duplicate = True
+                 if verbose:
+                     print(f" → Removing duplicate: center ({curr['cx']},{curr['cy']}) "
+                           f"too close to ({existing['cx']},{existing['cy']}) dist={center_distance:.1f}")
+                 break
+
+         if not is_duplicate:
+             filtered.append(curr)
+
+     return filtered
232
+
233
+
234
+ # ============== LINE GROUPING ==============
235
+
236
+ def group_contours_by_line(contours: List[Dict], y_threshold: float = None) -> List[List[Dict]]:
237
+ """
238
+ Groups contours into lines based on their vertical center position.
239
+ Contours with similar y-center (within y_threshold) are on the same line.
240
+
241
+ Args:
242
+ contours: List of contour dicts
243
+ y_threshold: Maximum vertical distance to consider same line
244
+
245
+ Returns:
246
+ List of lines, where each line is a list of contours sorted left-to-right
247
+ """
248
+ if not contours:
249
+ return []
250
+
251
+ # Calculate y-center and average height
252
+ avg_height = 0
253
+ for c in contours:
254
+ c['y_center'] = c['y'] + c['h'] // 2
255
+ avg_height += c['h']
256
+ avg_height /= len(contours)
257
+
258
+ # Determine y_threshold if not provided
259
+ if y_threshold is None:
260
+ y_threshold = avg_height * LINE_CONFIG["y_threshold_ratio"]
261
+ if y_threshold < LINE_CONFIG["default_y_threshold"]:
262
+ y_threshold = LINE_CONFIG["default_y_threshold"]
263
+
264
+ # Sort by y-center first
265
+ sorted_by_y = sorted(contours, key=lambda c: c['y_center'])
266
+
267
+ # Group into lines
268
+ lines = []
269
+ current_line = [sorted_by_y[0]]
270
+
271
+ for i in range(1, len(sorted_by_y)):
272
+ curr = sorted_by_y[i]
273
+ prev = current_line[-1]
274
+
275
+ # If y-center difference is small, same line
276
+ if abs(curr['y_center'] - prev['y_center']) <= y_threshold:
277
+ current_line.append(curr)
278
+ else:
279
+ lines.append(current_line)
280
+ current_line = [curr]
281
+
282
+ # Add the last line
283
+ lines.append(current_line)
284
+
285
+ # Sort each line by x position (left to right)
286
+ for line in lines:
287
+ line.sort(key=lambda c: c['x'])
288
+
289
+ return lines
290
+
291
+
292
+ # ============== IMAGE PREPROCESSING ==============
293
+
294
+ def preprocess_plate_image(image: np.ndarray) -> np.ndarray:
295
+ """
296
+ Preprocess plate image for better contour detection.
297
+
298
+ Args:
299
+ image: Input plate image (color)
300
+
301
+ Returns:
302
+ Preprocessed grayscale image
303
+ """
304
+ # Convert to grayscale if needed
305
+ if len(image.shape) == 3:
306
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
307
+ else:
308
+ gray = image
309
+
310
+ # Apply CLAHE
311
+ clahe = cv2.createCLAHE(
312
+ clipLimit=PREPROCESS_CONFIG["clahe_clip_limit"],
313
+ tileGridSize=PREPROCESS_CONFIG["clahe_grid_size"]
314
+ )
315
+ enhanced = clahe.apply(gray)
316
+
317
+ # Denoise
318
+ denoised = cv2.GaussianBlur(enhanced, PREPROCESS_CONFIG["gaussian_blur_kernel"], 0)
319
+
320
+ return denoised
321
+
322
+
323
+ def resize_with_aspect_ratio(image: np.ndarray, width: int = None, height: int = None) -> np.ndarray:
324
+ """
325
+ Resize image while maintaining aspect ratio.
326
+
327
+ Args:
328
+ image: Input image
329
+ width: Target width (optional)
330
+ height: Target height (optional)
331
+
332
+ Returns:
333
+ Resized image
334
+ """
335
+ h, w = image.shape[:2]
336
+
337
+ if width is None and height is None:
338
+ return image
339
+
340
+ if width is None:
341
+ ratio = height / h
342
+ new_size = (int(w * ratio), height)
343
+ else:
344
+ ratio = width / w
345
+ new_size = (width, int(h * ratio))
346
+
347
+ return cv2.resize(image, new_size, interpolation=cv2.INTER_AREA)
348
+
349
+
350
+ # ============== FORMATTING & OUTPUT ==============
351
+
352
+ def format_plate_number(lines: List[List[Dict]], results: List[List[Dict]]) -> Dict:
353
+ """
354
+ Format recognized characters into plate number.
355
+
356
+ Args:
357
+ lines: Lines of contours
358
+ results: OCR results for each line
359
+
360
+ Returns:
361
+ Dict with formatted plate information
362
+ """
363
+ plate_lines = []
364
+ all_results = []
365
+
366
+ for line_idx, line_results in enumerate(results):
367
+ line_text = "".join([r['char'] for r in line_results])
368
+ plate_lines.append(line_text)
369
+
370
+ line_detail = {
371
+ 'line_num': line_idx + 1,
372
+ 'text': line_text,
373
+ 'characters': line_results
374
+ }
375
+ all_results.append(line_detail)
376
+
377
+ return {
378
+ 'lines': plate_lines,
379
+ 'multiline': "\n".join(plate_lines),
380
+ 'singleline': " ".join(plate_lines),
381
+ 'details': all_results,
382
+ 'num_lines': len(plate_lines),
383
+ 'total_chars': sum(len(line) for line in results)
384
+ }
385
+
386
+
387
+ def save_contour_images(contours: List[Dict], image: np.ndarray, output_dir: str) -> List[str]:
388
+ """
389
+ Save extracted contour images to disk.
390
+
391
+ Args:
392
+ contours: List of contour dicts
393
+ image: Original image
394
+ output_dir: Output directory path
395
+
396
+ Returns:
397
+ List of saved file paths
398
+ """
399
+ os.makedirs(output_dir, exist_ok=True)
400
+ saved_paths = []
401
+
402
+ for i, c in enumerate(contours):
403
+ roi = extract_roi(image, c)
404
+ roi_bw = convert_to_binary(roi)
405
+
406
+ filepath = os.path.join(output_dir, f"char_{i:03d}.jpg")
407
+ cv2.imwrite(filepath, roi_bw)
408
+ saved_paths.append(filepath)
409
+
410
+ return saved_paths
411
+
412
+
413
+ def draw_detections(image: np.ndarray, contours: List[Dict],
414
+ results: List[Dict] = None, line_colors: bool = True) -> np.ndarray:
415
+ """
416
+ Draw bounding boxes and labels on image.
417
+
418
+ Args:
419
+ image: Input image
420
+ contours: List of contour dicts
421
+ results: OCR results (optional)
422
+ line_colors: Use different colors for different lines
423
+
424
+ Returns:
425
+ Annotated image
426
+ """
427
+ output = image.copy()
428
+
429
+ # Color palette for different lines
430
+ colors = [
431
+ (0, 255, 0), # Green
432
+ (255, 0, 0), # Blue
433
+ (0, 0, 255), # Red
434
+ (255, 255, 0), # Cyan
435
+ (255, 0, 255), # Magenta
436
+ ]
437
+
438
+ for i, c in enumerate(contours):
439
+ x, y, w, h = c['x'], c['y'], c['w'], c['h']
440
+
441
+ # Determine color
442
+ line_idx = c.get('line_idx', 0)
443
+ color = colors[line_idx % len(colors)] if line_colors else (0, 255, 0)
444
+
445
+ # Draw rectangle
446
+ cv2.rectangle(output, (x, y), (x + w, y + h), color, 2)
447
+
448
+ # Draw label if results provided
449
+ if results and i < len(results):
450
+ label = f"{results[i]['char']} ({results[i]['conf']:.0%})"
451
+ cv2.putText(output, label, (x, y - 5),
452
+ cv2.FONT_HERSHEY_SIMPLEX, 0.4, color, 1)
453
+
454
+ return output
455
+
456
+
457
+ # ============== VALIDATION ==============
458
+
459
+ def validate_plate_format(plate_text: str, format_type: str = "nepali") -> bool:
460
+ """
461
+ Validate if plate number matches expected format.
462
+
463
+ Args:
464
+ plate_text: Recognized plate text
465
+ format_type: Type of format to validate ("nepali", "embossed")
466
+
467
+ Returns:
468
+ True if valid, False otherwise
469
+ """
470
+ # Basic validation - can be extended based on actual formats
471
+ if not plate_text or len(plate_text) < 4:
472
+ return False
473
+
474
+ # Nepali plates typically have:
475
+ # - Province identifier (2 characters)
476
+ # - Class identifier (1-2 Nepali characters)
477
+ # - Numbers (4 digits)
478
+
479
+ return True
480
+
481
+
482
+ def calculate_confidence_stats(results: List[List[Dict]]) -> Dict:
483
+ """
484
+ Calculate confidence statistics for OCR results.
485
+
486
+ Args:
487
+ results: OCR results by line
488
+
489
+ Returns:
490
+ Dict with confidence statistics
491
+ """
492
+ all_confidences = []
493
+ for line in results:
494
+ for r in line:
495
+ all_confidences.append(r['conf'])
496
+
497
+ if not all_confidences:
498
+ return {'mean': 0, 'min': 0, 'max': 0, 'std': 0}
499
+
500
+ return {
501
+ 'mean': np.mean(all_confidences),
502
+ 'min': np.min(all_confidences),
503
+ 'max': np.max(all_confidences),
504
+ 'std': np.std(all_confidences)
505
+ }