# eDOCr2 Local Web Application

A Flask-based web interface for running eDOCr2 engineering drawing OCR locally on your machine.

## Features

- **Drag & Drop Upload** - Easy file upload interface
- **Real-time Processing** - Live feedback during OCR processing
- **Visual Results** - Annotated drawings with detected elements highlighted
- **Structured Data** - Extract tables, dimensions, and GD&T symbols
- **Download Results** - Get all results as a ZIP file
- **Responsive Design** - Works on desktop and mobile browsers
## Prerequisites

### System Requirements

1. **Python 3.9 - 3.11** - NumPy 1.26.4, which the application depends on, does not support Python 3.8
2. **Tesseract OCR** - Required for text recognition
3. **Poppler** - Required for PDF support
### Installing System Dependencies

#### Windows:

```bash
# Install Tesseract OCR
# Download from: https://github.com/UB-Mannheim/tesseract/wiki
# Add to System PATH

# Install Poppler
# Download from: https://github.com/oschwartz10612/poppler-windows/releases
# Extract and add bin/ to System PATH
```

#### Linux (Ubuntu/Debian):

```bash
sudo apt-get update
sudo apt-get install tesseract-ocr poppler-utils
```

#### macOS:

```bash
brew install tesseract poppler
```
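To confirm that the system dependencies are reachable before starting the server, a small check like the following can help. This is a minimal sketch and not part of the application: the function name is hypothetical, and it assumes the executables are called `tesseract`, `pdftoppm`, and `pdfinfo` (the latter two ship with Poppler).

```python
import shutil

# Assumed executable names: `tesseract` is the Tesseract CLI;
# `pdftoppm` and `pdfinfo` are installed as part of Poppler.
REQUIRED_TOOLS = ("tesseract", "pdftoppm", "pdfinfo")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All system dependencies found.")
```

If a tool is reported missing right after installation, open a new terminal so the updated PATH is picked up.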
## Installation

### Step 1: Clone Repository

```bash
git clone https://github.com/javvi51/edocr2.git
cd edocr2
```

### Step 2: Create Virtual Environment

```bash
# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate
```

### Step 3: Install Python Dependencies

```bash
pip install -r requirements_webapp.txt
```
### Step 4: Download Pre-trained Models

Download the model files from the [GitHub Releases](https://github.com/javvi51/edocr2/releases/tag/v1.0.0) page:

1. `recognizer_gdts.keras` (67.2 MB)
2. `recognizer_gdts.txt`
3. `recognizer_dimensions_2.keras` (67.2 MB)
4. `recognizer_dimensions_2.txt`

Place them in: `edocr2/models/`

**Quick Download (Linux/Mac):**

```bash
mkdir -p edocr2/models
cd edocr2/models
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt
cd ../..
```

**Quick Download (Windows PowerShell):**

```powershell
New-Item -ItemType Directory -Force -Path edocr2\models
cd edocr2\models
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras" -OutFile "recognizer_gdts.keras"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt" -OutFile "recognizer_gdts.txt"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras" -OutFile "recognizer_dimensions_2.keras"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt" -OutFile "recognizer_dimensions_2.txt"
cd ..\..
```
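After downloading, it can be worth checking that all four files landed in the right folder. The sketch below is illustrative only (the function name is hypothetical and the application may perform its own check); it treats a file as missing if it does not exist or is empty, which catches interrupted downloads that left a zero-byte file.

```python
from pathlib import Path

# File names as listed in the download step above.
MODEL_FILES = (
    "recognizer_gdts.keras",
    "recognizer_gdts.txt",
    "recognizer_dimensions_2.keras",
    "recognizer_dimensions_2.txt",
)


def missing_model_files(models_dir="edocr2/models"):
    """Return expected model files that are absent or empty in models_dir."""
    root = Path(models_dir)
    missing = []
    for name in MODEL_FILES:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_model_files()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it from the repository root so that the relative path `edocr2/models` resolves correctly.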
## Usage

### Start the Server

```bash
python app.py
```

You should see:

```
Loading OCR models...
Models loaded in X.XX seconds
Server ready!
Open your browser and go to: http://localhost:5000
```

### Using the Web Interface

1. **Open Browser** - Navigate to `http://localhost:5000`
2. **Upload Drawing** - Drag & drop or click to browse
3. **Wait for Processing** - Typically takes 10-30 seconds
4. **View Results** - See the annotated drawing and extracted data
5. **Download** - Get all results as a ZIP file
### Supported File Formats

- **JPG/JPEG** - Engineering drawing images
- **PNG** - Engineering drawing images
- **PDF** - Engineering drawing PDFs (first page only)

**Maximum file size:** 50 MB
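
The format and size limits above can be expressed as a simple pre-upload check. This is a sketch under stated assumptions, not the validation code in `app.py`: the function and constant names are hypothetical, and only the extension list and the 50 MB limit come from this README.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 50 MB, matching the documented limit


def is_allowed_upload(filename, size_bytes):
    """Return True if the extension and size are within the documented limits."""
    extension = Path(filename).suffix.lower()
    return extension in ALLOWED_EXTENSIONS and 0 < size_bytes <= MAX_FILE_SIZE
```

The extension comparison is case-insensitive, so `DRAWING.PDF` passes while `drawing.tiff` does not.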
## What Gets Extracted

The application extracts:

1. **Tables** - Title blocks, revision tables, bill of materials
2. **GD&T Symbols** - Geometric dimensioning and tolerancing
3. **Dimensions** - Measurements with tolerances
4. **Other Info** - Additional text and annotations
## Output Files

Results are saved in the `results/` folder:

```
results/
└── 20231218_123456_drawing_name/
    ├── drawing_name_mask.png   # Annotated visualization
    ├── drawing_name.json       # Structured data (JSON)
    └── drawing_name.csv        # Tabular data (CSV)
```
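To work with the saved results from a script, the folder layout above can be traversed directly. The following is a minimal sketch with hypothetical helper names; it assumes every result folder name starts with a `YYYYMMDD_HHMMSS` timestamp, as in the example, and makes no assumption about the internal structure of the JSON file.

```python
import json
from pathlib import Path


def latest_result_dir(results_dir="results"):
    """Return the most recent result folder, or None if there are none.

    Folder names start with a YYYYMMDD_HHMMSS timestamp, so sorting them
    alphabetically also sorts them chronologically.
    """
    folders = sorted(p for p in Path(results_dir).iterdir() if p.is_dir())
    return folders[-1] if folders else None


def load_result_json(result_dir):
    """Load the first .json file in a result folder, or None if none exists."""
    json_files = sorted(Path(result_dir).glob("*.json"))
    if not json_files:
        return None
    with json_files[0].open(encoding="utf-8") as handle:
        return json.load(handle)
```

`latest_result_dir` raises `FileNotFoundError` if the `results/` folder has not been created yet, which is the case before the first drawing is processed.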
## Troubleshooting

### Models Not Loading

**Error:** `Model files not found!`

**Solution:**

- Ensure the models are in `edocr2/models/`
- Check that the file names match exactly
- Verify the files aren't corrupted (check file sizes)

### NumPy Version Error

**Error:** `AttributeError: np.sctypes was removed`

This occurs when NumPy 2.x is installed, because `np.sctypes` was removed in NumPy 2.0.

**Solution:** reinstall the pinned 1.x version:

```bash
pip uninstall numpy
pip install numpy==1.26.4
```
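To see which NumPy version is active in the current environment without importing NumPy itself, the installed package metadata can be queried. This is a sketch with hypothetical function names; it relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def is_compatible_numpy(version_string):
    """Return True for NumPy 1.x, where np.sctypes still exists."""
    major = int(version_string.split(".")[0])
    return major < 2


def installed_numpy_version():
    """Return the installed NumPy version string, or None if not installed."""
    try:
        return version("numpy")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("NumPy is not installed in this environment.")
    elif is_compatible_numpy(found):
        print(f"NumPy {found} is a 1.x release.")
    else:
        print(f"NumPy {found} is too new; install numpy==1.26.4.")
```

Run it inside the activated virtual environment, otherwise it reports the system-wide installation instead.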
### Tesseract Not Found

**Error:** `TesseractNotFoundError`

**Solution:**

- Install Tesseract OCR
- Add it to the System PATH
- Restart the terminal/IDE

### PDF Processing Error

**Error:** `PDFInfoNotInstalledError`

**Solution:**

- Install Poppler
- Add `poppler/bin` to the System PATH
- Restart the terminal
### Port Already in Use

**Error:** `Address already in use`

**Solution:** change the port in the `app.run(...)` call on the last line of `app.py`:

```python
app.run(debug=True, host='0.0.0.0', port=5001)  # Use a different port
```
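Before editing `app.py`, it can help to find out which nearby port is actually free. The sketch below is an assumption-laden helper, not part of the application: both function names are hypothetical, and the check only tries to bind on `127.0.0.1`. It is also inherently racy, since another process could claim the port between the check and the server start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=5000, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested port:", first_free_port())
```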
## Configuration

Edit `app.py` to customize:

```python
# Maximum upload size
app.config['MAX_CONTENT_LENGTH'] = 50 * 1024 * 1024  # 50MB

# Server port
app.run(debug=True, host='0.0.0.0', port=5000)

# Processing parameters
frame_thres=0.7    # Frame detection threshold
GDT_thres=0.02     # GD&T detection threshold
cluster_thres=20   # Dimension clustering threshold
max_img_size=1048  # Maximum image size for processing
```
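## Security Notes

**Warning: this is a local development server.**

- Not suitable for production deployment
- No authentication/authorization
- No HTTPS encryption
- Only use on a trusted local network

The default `app.run(debug=True, host='0.0.0.0', ...)` call listens on all network interfaces with the Flask debugger enabled, and the debugger can execute arbitrary Python code. Set `host='127.0.0.1'` to accept connections from the local machine only.

For production deployment, use:

- Gunicorn/uWSGI
- Nginx reverse proxy
- HTTPS certificates
- Authentication middleware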
## API Endpoints

### `POST /upload`

Upload and process a drawing file.

**Request:** `multipart/form-data` with `file` field

**Response:**

```json
{
  "success": true,
  "filename": "drawing.jpg",
  "processing_time": 15.23,
  "stats": {
    "tables_found": 2,
    "gdt_symbols": 5,
    "dimensions": 23,
    "other_info": 8
  },
  "data": { ... },
  "mask_path": "drawing_mask.png",
  "output_dir": "20231218_123456_drawing"
}
```

### `GET /results/<folder>/<filename>`

Retrieve a result file.

### `GET /download/<folder>`

Download all results as ZIP.

### `GET /health`

Health check endpoint.
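
The upload endpoint can also be called from a script instead of the browser. The following is a minimal standard-library sketch, assuming the server is running at `http://localhost:5000` and accepts the `file` field described above; the helper names `encode_multipart` and `upload_drawing` are hypothetical, and error handling is left out.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def encode_multipart(field, filename, data):
    """Encode one file as a multipart/form-data body.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_drawing(path, base_url="http://localhost:5000"):
    """POST a drawing to /upload and return the decoded JSON response."""
    path = Path(path)
    body, content_type = encode_multipart("file", path.name, path.read_bytes())
    req = request.Request(
        f"{base_url}/upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with request.urlopen(req) as response:
        return json.load(response)
```

A call such as `upload_drawing("drawing.jpg")["stats"]` would then return the counts shown in the sample response, and the `output_dir` value can be passed to `GET /download/<folder>` to fetch the ZIP.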
## Resources

- [eDOCr2 GitHub](https://github.com/javvi51/edocr2)
- [Research Paper](http://dx.doi.org/10.2139/ssrn.5045921)
- [Model Downloads](https://github.com/javvi51/edocr2/releases/tag/v1.0.0)

## Credits

- **Original eDOCr2**: Javier Villena Toro
- **Web Application**: Jeyanthan GJ
- **License**: MIT

## Contributing

Issues and pull requests are welcome!

---

**Enjoy using eDOCr2!**