{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "header"
            },
            "source": [
                "# 🔧 eDOCr2 - Engineering Drawing OCR Testing Notebook\n",
                "\n",
                "This notebook allows you to test the **eDOCr2** tool on your own engineering drawings.\n",
                "\n",
                "**What it does:**\n",
                "- Segments engineering drawings into layers (tables, dimensions, GD&T)\n",
                "- Performs OCR on dimensions, tolerances, and symbols\n",
                "- Extracts structured data (JSON/CSV)\n",
                "- Generates visual masks showing detected elements\n",
                "\n",
                "---"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "setup_header"
            },
            "source": [
                "## 📦 Step 1: Set Up the Environment\n",
                "\n",
                "**IMPORTANT**: After running the cell below, you MUST restart the runtime:\n",
                "- Go to `Runtime` → `Restart runtime`\n",
                "- Then continue with Step 2"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "install_system_deps"
            },
            "outputs": [],
            "source": [
                "# Install system dependencies\n",
                "!apt-get update -qq\n",
                "!apt-get install -y -qq tesseract-ocr poppler-utils\n",
                "\n",
                "# Clone the eDOCr2 repository\n",
                "!git clone https://github.com/javvi51/edocr2.git\n",
                "%cd edocr2\n",
                "\n",
                "# Install all dependencies EXCEPT NumPy-dependent packages\n",
                "print(\"📦 Installing base dependencies...\")\n",
                "!pip install -q pdf2image pandas validators imgaug scikit-learn tqdm\n",
                "!pip install -q essential_generators editdistance pyclipper python-dotenv\n",
                "!pip install -q accelerate sentence-transformers shapely pytesseract\n",
                "\n",
                "# Install specific versions that work with NumPy 1.x\n",
                "print(\"📦 Installing NumPy 1.26.4...\")\n",
                "!pip uninstall -y numpy -q\n",
                "!pip install numpy==1.26.4 -q\n",
                "\n",
                "print(\"📦 Installing compatible packages...\")\n",
                "!pip install -q scikit-image==0.21.0  # Compatible with NumPy 1.x\n",
                "!pip install -q opencv-python==4.8.1.78 opencv-contrib-python==4.8.1.78\n",
                "!pip install -q efficientnet==1.0.0\n",
                "!pip install -q tf-keras\n",
                "\n",
                "print(\"\\n\" + \"=\"*60)\n",
                "print(\"⚠️  IMPORTANT: RESTART RUNTIME NOW!\")\n",
                "print(\"=\"*60)\n",
                "print(\"1. Go to: Runtime → Restart runtime\")\n",
                "print(\"2. Then run Step 2 to verify installation\")\n",
                "print(\"=\"*60)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "verify_header"
            },
            "source": [
                "## ✅ Step 2: Verify Installation\n",
                "\n",
                "Run the cell below AFTER restarting the runtime to verify that NumPy 1.x is loaded."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "verify_install"
            },
            "outputs": [],
            "source": [
                "import numpy as np\n",
                "import cv2\n",
                "import sys\n",
                "\n",
                "print(\"🔍 Checking installation...\\n\")\n",
                "print(f\"Python version: {sys.version.split()[0]}\")\n",
                "print(f\"NumPy version: {np.__version__}\")\n",
                "print(f\"OpenCV version: {cv2.__version__}\")\n",
                "\n",
                "if np.__version__.startswith('1.'):\n",
                "    print(\"\\n✅ SUCCESS! NumPy 1.x is installed.\")\n",
                "    print(\"   You can proceed to Step 3.\")\n",
                "else:\n",
                "    print(\"\\n⚠️ WARNING: NumPy 2.x detected!\")\n",
                "    print(\"   Please go back to Step 1 and restart runtime.\")\n",
                "\n",
                "# Change to edocr2 directory\n",
                "%cd /content/edocr2"
            ]
        },
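        {
            "cell_type": "markdown",
            "metadata": {
                "id": "numpy_versions_note"
            },
            "source": [
                "The check above reports the NumPy version loaded in the running kernel. As a minimal sketch, the next cell also reads the version installed on disk through the standard-library `importlib.metadata`; if the two differ, the runtime has most likely not been restarted since the install. The helper name `numpy_versions` is hypothetical and not part of eDOCr2."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "numpy_versions_check"
            },
            "outputs": [],
            "source": [
                "from importlib.metadata import version\n",
                "import numpy as np\n",
                "\n",
                "def numpy_versions():\n",
                "    # Hypothetical helper: (version installed on disk, version loaded in this kernel)\n",
                "    return version('numpy'), np.__version__\n",
                "\n",
                "installed, loaded = numpy_versions()\n",
                "print(f'Installed on disk: {installed}')\n",
                "print(f'Loaded in kernel:  {loaded}')\n",
                "if installed != loaded:\n",
                "    print('⚠️ Versions differ: restart the runtime so the installed version is loaded.')\n",
                "else:\n",
                "    print('✅ Installed and loaded versions match.')"
            ]
        },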
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "download_models_header"
            },
            "source": [
                "## 🧠 Step 3: Download Pre-trained Models\n",
                "\n",
                "The models are hosted in the GitHub releases (v1.0.0)."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "download_models"
            },
            "outputs": [],
            "source": [
                "import os\n",
                "import urllib.request\n",
                "\n",
                "# Create models directory\n",
                "os.makedirs('edocr2/models', exist_ok=True)\n",
                "\n",
                "# Model URLs from GitHub releases (v1.0.0)\n",
                "base_url = \"https://github.com/javvi51/edocr2/releases/download/v1.0.0/\"\n",
                "\n",
                "models = {\n",
                "    'recognizer_gdts.keras': base_url + 'recognizer_gdts.keras',\n",
                "    'recognizer_gdts.txt': base_url + 'recognizer_gdts.txt',\n",
                "    'recognizer_dimensions_2.keras': base_url + 'recognizer_dimensions_2.keras',\n",
                "    'recognizer_dimensions_2.txt': base_url + 'recognizer_dimensions_2.txt'\n",
                "}\n",
                "\n",
                "print(\"📥 Downloading models (this may take a few minutes)...\\n\")\n",
                "for filename, url in models.items():\n",
                "    filepath = f'edocr2/models/{filename}'\n",
                "    if not os.path.exists(filepath):\n",
                "        try:\n",
                "            print(f\"  ⏳ Downloading {filename}...\", end=' ')\n",
                "            urllib.request.urlretrieve(url, filepath)\n",
                "            file_size = os.path.getsize(filepath) / (1024 * 1024)  # Convert to MB\n",
                "            print(f\"✅ ({file_size:.1f} MB)\")\n",
                "        except Exception as e:\n",
                "            print(f\"❌\")\n",
                "            print(f\"     ⚠️ Failed: {e}\")\n",
                "            print(f\"     Please download manually from: {url}\")\n",
                "    else:\n",
                "        file_size = os.path.getsize(filepath) / (1024 * 1024)\n",
                "        print(f\"  ✅ {filename} already exists ({file_size:.1f} MB)\")\n",
                "\n",
                "print(\"\\n✅ Model download complete!\")"
            ]
        },
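        {
            "cell_type": "markdown",
            "metadata": {
                "id": "check_models_note"
            },
            "source": [
                "A failed or interrupted download can leave a model file missing or empty. As a sketch, the next cell lists any expected file under `edocr2/models` that is absent or zero bytes. The file names are copied from the download cell above; the helper name `missing_models` is hypothetical."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "check_models"
            },
            "outputs": [],
            "source": [
                "import os\n",
                "\n",
                "MODEL_DIR = 'edocr2/models'\n",
                "EXPECTED = [\n",
                "    'recognizer_gdts.keras',\n",
                "    'recognizer_gdts.txt',\n",
                "    'recognizer_dimensions_2.keras',\n",
                "    'recognizer_dimensions_2.txt',\n",
                "]\n",
                "\n",
                "def missing_models(model_dir=MODEL_DIR, expected=EXPECTED):\n",
                "    # Hypothetical helper: a file counts as missing if it does not exist or is empty\n",
                "    missing = []\n",
                "    for name in expected:\n",
                "        path = os.path.join(model_dir, name)\n",
                "        if not os.path.isfile(path) or os.path.getsize(path) == 0:\n",
                "            missing.append(name)\n",
                "    return missing\n",
                "\n",
                "missing = missing_models()\n",
                "if missing:\n",
                "    print(f'⚠️ Missing or empty: {missing}')\n",
                "else:\n",
                "    print('✅ All model files are present.')"
            ]
        },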
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "upload_header"
            },
            "source": [
                "## 📤 Step 4: Upload Your Engineering Drawings\n",
                "\n",
                "Upload your `.jpg`, `.png`, or `.pdf` engineering drawing files."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "upload_files"
            },
            "outputs": [],
            "source": [
                "from google.colab import files\n",
                "import os\n",
                "\n",
                "# Create upload directory\n",
                "os.makedirs('my_drawings', exist_ok=True)\n",
                "\n",
                "print(\"📤 Please upload your engineering drawing files...\")\n",
                "print(\"   Supported formats: .jpg, .png, .pdf\\n\")\n",
                "uploaded = files.upload()\n",
                "\n",
                "# Move uploaded files to my_drawings folder\n",
                "for filename in uploaded.keys():\n",
                "    os.rename(filename, f'my_drawings/{filename}')\n",
                "    file_size = len(uploaded[filename]) / (1024 * 1024)\n",
                "    print(f\"✅ Uploaded: {filename} ({file_size:.2f} MB)\")\n",
                "\n",
                "print(f\"\\n✅ Total files uploaded: {len(uploaded)}\")"
            ]
        },
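        {
            "cell_type": "markdown",
            "metadata": {
                "id": "split_pdfs_note"
            },
            "source": [
                "Step 5 reads only the first page of a PDF (`img[0]`). As an optional sketch, the next cell saves every page of each uploaded PDF as a PNG in `my_drawings/` and moves the original PDF aside so that it is not processed twice. The helper `split_pdfs` and the folder name `my_drawings_pdf` are hypothetical. Note that pages saved as PNG bypass the thresholding that Step 5 applies to PDFs."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "split_pdfs"
            },
            "outputs": [],
            "source": [
                "import glob\n",
                "import os\n",
                "import shutil\n",
                "from pdf2image import convert_from_path\n",
                "\n",
                "def split_pdfs(folder='my_drawings', archive='my_drawings_pdf'):\n",
                "    # Hypothetical helper: write one PNG per PDF page, then move the PDF out of `folder`\n",
                "    os.makedirs(archive, exist_ok=True)\n",
                "    written = []\n",
                "    for pdf_path in glob.glob(os.path.join(folder, '*')):\n",
                "        if not pdf_path.lower().endswith('.pdf'):\n",
                "            continue\n",
                "        stem = os.path.splitext(os.path.basename(pdf_path))[0]\n",
                "        for i, page in enumerate(convert_from_path(pdf_path), start=1):\n",
                "            out_path = os.path.join(folder, f'{stem}_page{i}.png')\n",
                "            page.save(out_path, 'PNG')\n",
                "            written.append(out_path)\n",
                "        shutil.move(pdf_path, os.path.join(archive, os.path.basename(pdf_path)))\n",
                "    return written\n",
                "\n",
                "for path in split_pdfs():\n",
                "    print(f'✅ Wrote {path}')"
            ]
        },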
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "process_header"
            },
            "source": [
                "## 🔍 Step 5: Process Your Drawings\n",
                "\n",
                "This runs the OCR pipeline on every file in `my_drawings/`. For PDF files, only the first page is processed."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "process_drawings"
            },
            "outputs": [],
            "source": [
                "import cv2\n",
                "import numpy as np\n",
                "from edocr2 import tools\n",
                "from pdf2image import convert_from_path\n",
                "from edocr2.keras_ocr.recognition import Recognizer\n",
                "from edocr2.keras_ocr.detection import Detector\n",
                "import tensorflow as tf\n",
                "import time\n",
                "import warnings\n",
                "import os\n",
                "\n",
                "# Suppress warnings\n",
                "warnings.filterwarnings('ignore')\n",
                "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'\n",
                "\n",
                "# Verify NumPy version before proceeding\n",
                "print(f\"🔍 NumPy version: {np.__version__}\")\n",
                "if not np.__version__.startswith('1.'):\n",
                "    print(\"\\n⚠️ ERROR: NumPy 2.x detected!\")\n",
                "    print(\"   This will cause errors. Please:\")\n",
                "    print(\"   1. Go back to Step 1\")\n",
                "    print(\"   2. Restart runtime after Step 1\")\n",
                "    print(\"   3. Run Step 2 to verify\")\n",
                "    raise RuntimeError(\"NumPy 1.x is required!\")\n",
                "\n",
                "# Configure TensorFlow\n",
                "gpus = tf.config.list_physical_devices('GPU')\n",
                "for gpu in gpus:\n",
                "    tf.config.experimental.set_memory_growth(gpu, True)\n",
                "\n",
                "print(\"🔧 Loading models...\")\n",
                "start_time = time.time()\n",
                "\n",
                "# Load GD&T recognizer\n",
                "gdt_model = 'edocr2/models/recognizer_gdts.keras'\n",
                "recognizer_gdt = Recognizer(alphabet=tools.ocr_pipelines.read_alphabet(gdt_model))\n",
                "recognizer_gdt.model.load_weights(gdt_model)\n",
                "\n",
                "# Load dimension recognizer\n",
                "dim_model = 'edocr2/models/recognizer_dimensions_2.keras'\n",
                "alphabet_dim = tools.ocr_pipelines.read_alphabet(dim_model)\n",
                "recognizer_dim = Recognizer(alphabet=alphabet_dim)\n",
                "recognizer_dim.model.load_weights(dim_model)\n",
                "\n",
                "# Load detector\n",
                "detector = Detector()\n",
                "\n",
                "# Warm up models\n",
                "dummy_image = np.zeros((1, 1, 3), dtype=np.float32)\n",
                "_ = recognizer_gdt.recognize(dummy_image)\n",
                "_ = recognizer_dim.recognize(dummy_image)\n",
                "dummy_image = np.zeros((32, 32, 3), dtype=np.float32)\n",
                "_ = detector.detect([dummy_image])\n",
                "\n",
                "end_time = time.time()\n",
                "print(f\"✅ Models loaded in {end_time - start_time:.2f} seconds\\n\")\n",
                "\n",
                "# Process each uploaded file\n",
                "import glob\n",
                "import json\n",
                "\n",
                "drawing_files = glob.glob('my_drawings/*')\n",
                "results_all = {}\n",
                "\n",
                "if not drawing_files:\n",
                "    print(\"⚠️ No files found in my_drawings/\")\n",
                "    print(\"   Please upload files in Step 4 first!\")\n",
                "else:\n",
                "    for file_path in drawing_files:\n",
                "        print(f\"\\n{'='*60}\")\n",
                "        print(f\"📄 Processing: {os.path.basename(file_path)}\")\n",
                "        print(f\"{'='*60}\")\n",
                "        \n",
                "        try:\n",
                "            # Read file\n",
                "            if file_path.lower().endswith('.pdf'):\n",
                "                img = convert_from_path(file_path)\n",
                "                img = np.array(img[0])\n",
                "                gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)\n",
                "                _, img = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)\n",
                "                img = cv2.merge([img, img, img])\n",
                "            else:\n",
                "                img = cv2.imread(file_path)\n",
                "            \n",
                "            if img is None:\n",
                "                print(f\"⚠️ Could not read file: {file_path}\")\n",
                "                continue\n",
                "            \n",
                "            filename = os.path.splitext(os.path.basename(file_path))[0]\n",
                "            output_path = f'results/{filename}'\n",
                "            os.makedirs(output_path, exist_ok=True)\n",
                "            \n",
                "            # Segmentation\n",
                "            print(\"  🔍 Segmenting layers...\")\n",
                "            img_boxes, frame, gdt_boxes, tables, dim_boxes = tools.layer_segm.segment_img(\n",
                "                img, autoframe=True, frame_thres=0.7, GDT_thres=0.02, binary_thres=127\n",
                "            )\n",
                "            \n",
                "            # OCR Tables\n",
                "            print(\"  📋 Processing tables...\")\n",
                "            process_img = img.copy()\n",
                "            table_results, updated_tables, process_img = tools.ocr_pipelines.ocr_tables(\n",
                "                tables, process_img, language='eng'\n",
                "            )\n",
                "            \n",
                "            # OCR GD&T\n",
                "            print(\"  🎯 Processing GD&T symbols...\")\n",
                "            gdt_results, updated_gdt_boxes, process_img = tools.ocr_pipelines.ocr_gdt(\n",
                "                process_img, gdt_boxes, recognizer_gdt\n",
                "            )\n",
                "            \n",
                "            # OCR Dimensions\n",
                "            print(\"  📏 Processing dimensions...\")\n",
                "            if frame:\n",
                "                process_img = process_img[frame.y : frame.y + frame.h, frame.x : frame.x + frame.w]\n",
                "            \n",
                "            dimensions, other_info, process_img, dim_tess = tools.ocr_pipelines.ocr_dimensions(\n",
                "                process_img, detector, recognizer_dim, alphabet_dim, frame, dim_boxes,\n",
                "                cluster_thres=20, max_img_size=1048, language='eng', backg_save=False\n",
                "            )\n",
                "            \n",
                "            # Generate mask image\n",
                "            print(\"  🎨 Generating visualization...\")\n",
                "            mask_img = tools.output_tools.mask_img(\n",
                "                img, updated_gdt_boxes, updated_tables, dimensions, frame, other_info\n",
                "            )\n",
                "            cv2.imwrite(f'{output_path}/{filename}_mask.png', mask_img)\n",
                "            \n",
                "            # Process and save results\n",
                "            print(\"  💾 Saving results...\")\n",
                "            table_results, gdt_results, dimensions, other_info = tools.output_tools.process_raw_output(\n",
                "                output_path, table_results, gdt_results, dimensions, other_info, save=True\n",
                "            )\n",
                "            \n",
                "            results_all[filename] = {\n",
                "                'tables': table_results,\n",
                "                'gdts': gdt_results,\n",
                "                'dimensions': dimensions,\n",
                "                'other_info': other_info\n",
                "            }\n",
                "            \n",
                "            print(f\"  ✅ Processing complete!\")\n",
                "            print(f\"     - Tables found: {len(table_results)}\")\n",
                "            print(f\"     - GD&T symbols: {len(gdt_results)}\")\n",
                "            print(f\"     - Dimensions: {len(dimensions)}\")\n",
                "            print(f\"     - Other info: {len(other_info)}\")\n",
                "            \n",
                "        except Exception as e:\n",
                "            print(f\"  ❌ Error processing {file_path}: {str(e)}\")\n",
                "            import traceback\n",
                "            traceback.print_exc()\n",
                "\n",
                "    print(f\"\\n\\n{'='*60}\")\n",
                "    print(f\"✅ All drawings processed!\")\n",
                "    print(f\"{'='*60}\")"
            ]
        },
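        {
            "cell_type": "markdown",
            "metadata": {
                "id": "summary_note"
            },
            "source": [
                "The per-file counts printed during processing can also be collected into one table. As a sketch, the next cell builds a pandas DataFrame from `results_all`, assuming each stored result supports `len()` as in the prints above. The helper name `summarize_results` is hypothetical; run this only after Step 5."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "summary_table"
            },
            "outputs": [],
            "source": [
                "import pandas as pd\n",
                "\n",
                "def summarize_results(results):\n",
                "    # Hypothetical helper: one row per drawing, one count column per result category\n",
                "    rows = []\n",
                "    for name, res in results.items():\n",
                "        row = {'drawing': name}\n",
                "        for key, value in res.items():\n",
                "            row[key] = len(value)\n",
                "        rows.append(row)\n",
                "    return pd.DataFrame(rows)\n",
                "\n",
                "summary = summarize_results(results_all)\n",
                "if summary.empty:\n",
                "    print('No results yet.')\n",
                "else:\n",
                "    print(summary.to_string(index=False))"
            ]
        },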
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "visualize_header"
            },
            "source": [
                "## 👁️ Step 6: Visualize Results\n",
                "\n",
                "Display the mask images showing detected elements."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "visualize_results"
            },
            "outputs": [],
            "source": [
                "from IPython.display import Image, display\n",
                "import glob\n",
                "import os\n",
                "\n",
                "mask_images = glob.glob('results/**/*_mask.png', recursive=True)\n",
                "\n",
                "if mask_images:\n",
                "    print(f\"📊 Displaying {len(mask_images)} result(s):\\n\")\n",
                "    for mask_path in mask_images:\n",
                "        print(f\"\\n{'='*60}\")\n",
                "        print(f\"📄 {os.path.basename(mask_path)}\")\n",
                "        print(f\"{'='*60}\")\n",
                "        display(Image(filename=mask_path, width=800))\n",
                "else:\n",
                "    print(\"⚠️ No mask images found. Please run the processing step first.\")"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "view_data_header"
            },
            "source": [
                "## 📊 Step 7: View Extracted Data\n",
                "\n",
                "Display the extracted structured data (dimensions, tables, GD&T)."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "view_extracted_data"
            },
            "outputs": [],
            "source": [
                "import json\n",
                "import glob\n",
                "import os\n",
                "\n",
                "json_files = glob.glob('results/**/*.json', recursive=True)\n",
                "\n",
                "if json_files:\n",
                "    for json_path in json_files:\n",
                "        print(f\"\\n{'='*60}\")\n",
                "        print(f\"📄 {os.path.basename(json_path)}\")\n",
                "        print(f\"{'='*60}\\n\")\n",
                "        \n",
                "        with open(json_path, 'r') as f:\n",
                "            data = json.load(f)\n",
                "            print(json.dumps(data, indent=2))\n",
                "else:\n",
                "    print(\"⚠️ No JSON files found.\")\n",
                "\n",
                "# Also check for CSV files\n",
                "csv_files = glob.glob('results/**/*.csv', recursive=True)\n",
                "if csv_files:\n",
                "    import pandas as pd\n",
                "    print(\"\\n\\n📊 CSV Files:\")\n",
                "    for csv_path in csv_files:\n",
                "        print(f\"\\n{'='*60}\")\n",
                "        print(f\"📄 {os.path.basename(csv_path)}\")\n",
                "        print(f\"{'='*60}\\n\")\n",
                "        df = pd.read_csv(csv_path)\n",
                "        display(df)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "download_header"
            },
            "source": [
                "## 💾 Step 8: Download Results\n",
                "\n",
                "Download all results as a ZIP file."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "download_results"
            },
            "outputs": [],
            "source": [
                "import shutil\n",
                "from google.colab import files\n",
                "import os\n",
                "\n",
                "# Create ZIP archive\n",
                "if os.path.exists('results'):\n",
                "    print(\"📦 Creating ZIP archive...\")\n",
                "    shutil.make_archive('edocr2_results', 'zip', 'results')\n",
                "    print(\"✅ Archive created!\\n\")\n",
                "    \n",
                "    print(\"⬇️ Downloading results...\")\n",
                "    files.download('edocr2_results.zip')\n",
                "    print(\"✅ Download complete!\")\n",
                "else:\n",
                "    print(\"⚠️ No results folder found. Please run the processing step first.\")"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "test_sample_header"
            },
            "source": [
                "## 🧪 Optional: Test with Sample Images\n",
                "\n",
                "If you want to test without uploading your own images, use the sample drawings from the repository."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {
                "id": "test_samples"
            },
            "outputs": [],
            "source": [
                "import shutil\n",
                "import os\n",
                "\n",
                "# Process a sample drawing from the repository\n",
                "sample_file = 'tests/test_samples/Candle_holder.jpg'\n",
                "\n",
                "if os.path.exists(sample_file):\n",
                "    print(f\"🧪 Testing with sample: {sample_file}\\n\")\n",
                "    \n",
                "    # Copy to my_drawings folder\n",
                "    os.makedirs('my_drawings', exist_ok=True)\n",
                "    shutil.copy(sample_file, 'my_drawings/')\n",
                "    \n",
                "    print(\"✅ Sample copied to my_drawings/\")\n",
                "    print(\"   Now run Step 5 to process it!\")\n",
                "else:\n",
                "    print(\"⚠️ Sample file not found. Please run Step 1 first.\")"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {
                "id": "footer"
            },
            "source": [
                "---\n",
                "\n",
                "## 📝 Usage Instructions\n",
                "\n",
                "### **CRITICAL WORKFLOW:**\n",
                "1. **Run Step 1** (Install dependencies)\n",
                "2. **RESTART RUNTIME** (`Runtime` → `Restart runtime`)\n",
                "3. **Run Step 2** (Verify NumPy 1.x is installed)\n",
                "4. **Continue with Steps 3-8**\n",
                "\n",
                "### **Why Restart?**\n",
                "Google Colab caches NumPy in memory. Restarting ensures NumPy 1.26.4 is loaded correctly.\n",
                "\n",
                "---\n",
                "\n",
                "## 📝 Notes\n",
                "\n",
                "- **Supported formats**: JPG, PNG, PDF\n",
                "- **Best results**: High-resolution scans with clear text\n",
                "- **Processing time**: ~10-30 seconds per drawing\n",
                "- **GPU acceleration**: Automatically used if available\n",
                "- **Model size**: ~140 MB total\n",
                "- **Required**: NumPy 1.26.4, OpenCV 4.8.1\n",
                "\n",
                "## 🔗 Resources\n",
                "\n",
                "- [eDOCr2 GitHub Repository](https://github.com/javvi51/edocr2)\n",
                "- [Research Paper](http://dx.doi.org/10.2139/ssrn.5045921)\n",
                "- [Model Downloads (v1.0.0)](https://github.com/javvi51/edocr2/releases/tag/v1.0.0)\n",
                "\n",
                "## 🐛 Troubleshooting\n",
                "\n",
                "**NumPy Error (`np.sctypes` or `AttributeError`)**: \n",
                "- Did you restart runtime after Step 1?\n",
                "- Run Step 2 to verify NumPy version\n",
                "- If still NumPy 2.x, re-run Step 1 and restart again\n",
                "\n",
                "**Model Download Failed**: \n",
                "- Check internet connection\n",
                "- Download manually from releases page\n",
                "\n",
                "**Processing Errors**: \n",
                "- Ensure images are clear and high-resolution\n",
                "- Try the sample image first (Step 9)\n",
                "\n",
                "---\n",
                "\n",
                "**Created by**: Jeyanthan GJ  \n",
                "**Based on**: eDOCr2 by Javier Villena Toro\n"
            ]
        }
    ],
    "metadata": {
        "accelerator": "GPU",
        "colab": {
            "gpuType": "T4",
            "provenance": []
        },
        "kernelspec": {
            "display_name": "Python 3",
            "name": "python3"
        },
        "language_info": {
            "name": "python"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 0
}