---
title: eDOCr2 - Engineering Drawing OCR
emoji: 🔧
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 🔧 eDOCr2 - Engineering Drawing OCR

Extract **dimensions**, **tables**, and **GD&T symbols** from engineering drawings automatically using deep learning.

## 🎯 Features

- ✅ **Table Extraction** - Title blocks, revision tables, bill of materials
- ✅ **GD&T Recognition** - Geometric dimensioning and tolerancing symbols
- ✅ **Dimension Detection** - Measurements with tolerances
- ✅ **Multi-format Support** - JPG, PNG, PDF
- ✅ **Structured Output** - JSON and CSV export
- ✅ **Visual Annotation** - Highlighted detection results

## 🚀 How to Use

1. **Upload** your engineering drawing (JPG, PNG, or PDF)
2. **Click** the "Process Drawing" button
3. **View** the annotated results and extracted data
4. **Download** the complete results as a ZIP file

## 📊 What Gets Extracted

### Tables

- Title blocks with part information
- Revision history tables
- Bill of materials (BOM)
- General notes and specifications

### GD&T Symbols

- Geometric tolerancing symbols
- Feature control frames
- Datum references

### Dimensions

- Linear dimensions
- Angular dimensions
- Tolerance values
- Diameter and radius callouts

## 🔧 Technology Stack

- **Deep Learning Models**: Custom-trained Keras OCR models
- **Text Detection**: CRAFT-based detector
- **Text Recognition**: CRNN-based recognizer
- **Symbol Matching**: Template matching algorithms
- **Framework**: Gradio for web interface

## 📚 Research

This tool is based on the research paper:

**"eDOCr2: Automated Extraction of Information from Engineering Drawings"**
[http://dx.doi.org/10.2139/ssrn.5045921](http://dx.doi.org/10.2139/ssrn.5045921)

## 💡 Tips for Best Results

- Use **high-resolution** scans (300 DPI or higher)
- Ensure **clear text** and symbols
- Avoid **skewed** or rotated images
- Use **clean** drawings without handwritten annotations

## 🛠️ Local Installation

To run this locally:

```bash
# Clone repository
git clone https://github.com/javvi51/edocr2.git
cd edocr2

# Install dependencies
pip install -r requirements.txt

# Download models (see releases)
# Place in edocr2/models/

# Run app
python app.py
```

## 📦 Model Files

The hosted app loads the pre-trained models automatically. For a local installation, download them and place them in `edocr2/models/`:

- `recognizer_gdts.keras` (67.2 MB) - GD&T symbol recognition
- `recognizer_dimensions_2.keras` (67.2 MB) - Dimension recognition

Download from: [GitHub Releases](https://github.com/javvi51/edocr2/releases/tag/v1.0.0)

## 🔗 Links

- **GitHub Repository**: [github.com/javvi51/edocr2](https://github.com/javvi51/edocr2)
- **Research Paper**: [DOI:10.2139/ssrn.5045921](http://dx.doi.org/10.2139/ssrn.5045921)
- **Original Author**: Javier Villena Toro
- **Deployed by**: Jeyanthan GJ

## 📝 License

MIT License - See LICENSE file for details

## 🤝 Citation

If you use this tool in your research, please cite:

```bibtex
@article{villena2024edocr2,
  title={eDOCr2: Automated Extraction of Information from Engineering Drawings},
  author={Villena Toro, Javier},
  year={2024},
  doi={10.2139/ssrn.5045921}
}
```

## ⚠️ Limitations

- Works best with mechanical/production drawings
- Requires clear, high-quality scans
- May struggle with handwritten annotations
- Processing time: 10-30 seconds per drawing

## 🐛 Known Issues

- PDF support is limited to the first page only
- Very large images (>10 MB) may time out
- Some custom GD&T symbols may not be recognized

## 📧 Contact

For issues and questions:

- Open an issue on [GitHub](https://github.com/javvi51/edocr2/issues)
- Check the [documentation](https://github.com/javvi51/edocr2/blob/main/docs/examples.md)

---

**Enjoy using eDOCr2! 🚀**
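
## 🧪 Optional: Pre-flight Input Check

The supported formats and known issues listed in this README (JPG, PNG, or PDF input, first-page-only PDF handling, and possible timeouts above 10 MB) can be checked before uploading a drawing. The following is a minimal sketch, not part of eDOCr2 itself: the function name `preflight_warnings`, the constants, the acceptance of `.jpeg` alongside `.jpg`, and the reading of "10 MB" as 10 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
import sys
from pathlib import Path

# Hypothetical helper, not part of the eDOCr2 package.
# Formats listed in this README; ".jpeg" is an assumed alias for ".jpg".
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}

# Assumption: the README's "10 MB" is read as 10 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 10 * 1024 * 1024


def preflight_warnings(path, size_bytes=None):
    """Return a list of warnings for a drawing file, based on the README's notes.

    If size_bytes is None, the size is read from disk with Path.stat().
    """
    path = Path(path)
    suffix = path.suffix.lower()
    warnings = []

    if suffix not in SUPPORTED_EXTENSIONS:
        warnings.append(f"unsupported format: {suffix or '(no extension)'}")
    if suffix == ".pdf":
        warnings.append("PDF input: only the first page is processed")

    if size_bytes is None:
        size_bytes = path.stat().st_size
    if size_bytes > MAX_SIZE_BYTES:
        warnings.append("file is larger than 10 MB and may time out")

    return warnings


if __name__ == "__main__":
    # Usage: python preflight.py drawing1.png drawing2.pdf
    for name in sys.argv[1:]:
        for warning in preflight_warnings(name):
            print(f"{name}: {warning}")
```

An empty list means none of the documented limits apply. The check does not assess scan quality, skew, or resolution, which still need to be judged by eye.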