---
title: eDOCr2 - Engineering Drawing OCR
emoji: 🔧
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 🔧 eDOCr2 - Engineering Drawing OCR

Extract **dimensions**, **tables**, and **GD&T symbols** from engineering drawings automatically using deep learning.

## 🎯 Features

- βœ… **Table Extraction** - Title blocks, revision tables, bill of materials
- βœ… **GD&T Recognition** - Geometric dimensioning and tolerancing symbols
- βœ… **Dimension Detection** - Measurements with tolerances
- βœ… **Multi-format Support** - JPG, PNG, PDF
- βœ… **Structured Output** - JSON and CSV export
- βœ… **Visual Annotation** - Highlighted detection results

## 🚀 How to Use

1. **Upload** your engineering drawing (JPG, PNG, or PDF)
2. **Click** the "Process Drawing" button
3. **View** the annotated results and extracted data
4. **Download** the complete results as a ZIP file

## 📊 What Gets Extracted

### Tables
- Title blocks with part information
- Revision history tables
- Bill of materials (BOM)
- General notes and specifications

### GD&T Symbols
- Geometric tolerancing symbols
- Feature control frames
- Datum references

### Dimensions
- Linear dimensions
- Angular dimensions
- Tolerance values
- Diameter and radius callouts
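
As a rough illustration of the JSON and CSV export mentioned under Features, the sketch below serializes a few extracted dimensions using only the Python standard library. The record fields (`type`, `value`, `tolerance`) and the helper names are hypothetical; they are not taken from the actual eDOCr2 output schema.

```python
import csv
import io
import json

# Hypothetical records: these field names are illustrative only and do not
# reflect the real eDOCr2 output schema.
dimensions = [
    {"type": "linear", "value": "25.4", "tolerance": "+/-0.1"},
    {"type": "diameter", "value": "12", "tolerance": "H7"},
]


def to_json(records):
    """Serialize a list of dict records to an indented JSON string."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize records to CSV text, using the first record's keys as header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `to_csv(dimensions)` yields one header row followed by one row per record, which opens directly in a spreadsheet.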

## 🔧 Technology Stack

- **Deep Learning Models**: Custom-trained Keras OCR models
- **Text Detection**: CRAFT-based detector
- **Text Recognition**: CRNN-based recognizer
- **Symbol Matching**: Template matching algorithms
- **Framework**: Gradio for web interface
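
To give a feel for what template matching means in the symbol-matching step, here is a minimal pure-Python sketch that slides a small template over an image and scores each position by the sum of squared differences. It is a generic illustration, not the eDOCr2 implementation; the function name `match_template` and the list-of-lists image format are assumptions made for this example.

```python
def match_template(image, template):
    """Find the best template position by sum of squared differences (SSD).

    Both arguments are 2D lists of numbers. Returns (row, col, score) for the
    top-left corner of the best match; a lower score means a closer match.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for row in range(img_h - tpl_h + 1):
        for col in range(img_w - tpl_w + 1):
            score = sum(
                (image[row + r][col + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            # Keep the first position with the lowest score seen so far.
            if best is None or score < best[2]:
                best = (row, col, score)
    return best
```

A score of 0 means the template matches the image patch exactly. Production code would normally rely on an optimized library routine rather than nested Python loops.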

## 📚 Research

This tool is based on the research paper:

**"eDOCr2: Automated Extraction of Information from Engineering Drawings"**  
[http://dx.doi.org/10.2139/ssrn.5045921](http://dx.doi.org/10.2139/ssrn.5045921)

## 💡 Tips for Best Results

- Use **high-resolution** scans (300 DPI or higher)
- Ensure **clear text** and symbols
- Avoid **skewed** or rotated images
- Use **clean** drawings without handwritten annotations
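
If you know the sheet size of a scanned drawing, you can estimate its resolution from the pixel dimensions before uploading. The sketch below is one way to do that arithmetic; the function names are made up for this example and are not part of eDOCr2.

```python
MM_PER_INCH = 25.4


def estimated_dpi(width_px, height_px, sheet_width_mm, sheet_height_mm):
    """Estimate scan resolution from pixel size and physical sheet size.

    Returns the lower of the horizontal and vertical estimates, rounded to
    the nearest whole number.
    """
    dpi_x = width_px / (sheet_width_mm / MM_PER_INCH)
    dpi_y = height_px / (sheet_height_mm / MM_PER_INCH)
    return round(min(dpi_x, dpi_y))


def meets_recommendation(dpi, minimum=300):
    """Check an estimate against the 300 DPI recommendation above."""
    return dpi >= minimum
```

For example, an A4 sheet (210 x 297 mm) scanned to 2480 x 3508 pixels comes out at about 300 DPI, while 1240 x 1754 pixels is only about 150 DPI.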

## 🛠️ Local Installation

To run this locally:

```bash
# Clone repository
git clone https://github.com/javvi51/edocr2.git
cd edocr2

# Install dependencies
pip install -r requirements.txt

# Download models (see releases)
# Place in edocr2/models/

# Run app
python app.py
```

## 📦 Model Files

The pre-trained models are automatically loaded from the repository:
- `recognizer_gdts.keras` (67.2 MB) - GD&T symbol recognition
- `recognizer_dimensions_2.keras` (67.2 MB) - Dimension recognition

Download from: [GitHub Releases](https://github.com/javvi51/edocr2/releases/tag/v1.0.0)

## 🔗 Links

- **GitHub Repository**: [github.com/javvi51/edocr2](https://github.com/javvi51/edocr2)
- **Research Paper**: [DOI:10.2139/ssrn.5045921](http://dx.doi.org/10.2139/ssrn.5045921)
- **Original Author**: Javier Villena Toro
- **Deployed by**: Jeyanthan GJ

## 📝 License

MIT License - See LICENSE file for details

## 🀝 Citation

If you use this tool in your research, please cite:

```bibtex
@article{villena2024edocr2,
  title={eDOCr2: Automated Extraction of Information from Engineering Drawings},
  author={Villena Toro, Javier},
  year={2024},
  doi={10.2139/ssrn.5045921}
}
```

## ⚠️ Limitations

- Works best with mechanical/production drawings
- Requires clear, high-quality scans
- May struggle with handwritten annotations
- Processing time: 10-30 seconds per drawing

## 🐛 Known Issues

- PDF support is limited to the first page
- Very large images (>10MB) may time out
- Some custom GD&T symbols may not be recognized
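
A simple pre-flight check can flag files likely to hit these issues before you upload them. The sketch below is an illustration only and not part of eDOCr2: the `preflight` helper is hypothetical, treating `.jpeg` as equivalent to JPG is an assumption, and so is reading "10MB" as 10 MiB.

```python
import os

# Formats listed in this README; accepting ".jpeg" as well is an assumption.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumes "10MB" means 10 MiB


def preflight(path):
    """Return a list of warnings for a file; an empty list means no concerns."""
    warnings = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        warnings.append(f"unsupported format: {extension or 'none'}")
    if extension == ".pdf":
        warnings.append("only the first page of a PDF is processed")
    if os.path.getsize(path) > MAX_SIZE_BYTES:
        warnings.append("file is larger than 10 MB and may time out")
    return warnings
```

Running `preflight("drawing.pdf")` on a small PDF would return only the first-page warning, which is a reminder to split multi-page documents beforehand.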

## 📧 Contact

For issues and questions:
- Open an issue on [GitHub](https://github.com/javvi51/edocr2/issues)
- Check the [documentation](https://github.com/javvi51/edocr2/blob/main/docs/examples.md)

---

**Enjoy using eDOCr2! 🚀**