---
title: Hand Written Text Recognition
emoji: 📝
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 6.12.0
app_file: app.py
pinned: false
---

# ✍️ Handwritten Paragraph to Typed Text

This project is a **Handwritten Text Recognition (HTR)** pipeline that converts full paragraphs of handwriting into editable digital text. It bridges traditional Computer Vision and modern Deep Learning by combining **OpenCV** with **Transformer-based models**.

## 🚀 Live Demo

You can try the live application on Hugging Face Spaces: <https://prabhatgupta786-hand-written-text-recognition.hf.space>

---

## 🛠️ Technical Architecture

The system follows a three-stage pipeline designed to handle multi-line text:

### 1. Image Pre-processing (OpenCV)

Line-level recognition models such as TrOCR work on single-line images, so this project uses a custom pre-processing step to split full paragraphs into lines:

* **Thresholding:** Converts images to binary (B&W) to isolate ink from paper.
* **Morphological Dilation:** Uses a horizontal kernel `(5, 100)` to "smear" letters into line-level blobs.
* **Contour Detection:** Identifies these blobs as individual lines and segments them.

### 2. Deep Learning Inference (TrOCR)

The segmented lines are processed using **TrOCR (Transformer-based Optical Character Recognition)**:

* **Encoder:** A **Vision Transformer (ViT)** that processes the image patches.
* **Decoder:** A Transformer decoder initialized from **RoBERTa** that generates text based on visual features and linguistic context.
* **Framework:** Powered by **Hugging Face Transformers** and **PyTorch**.

### 3. User Interface (Gradio)

The logic is wrapped in a **Gradio** web interface, allowing users to upload images and receive text outputs in real time.


---

## 🧰 Tech Stack

* **Language:** Python
* **Computer Vision:** OpenCV, NumPy, Pillow
* **Deep Learning:** PyTorch, Transformers (TrOCR-Large)
* **Deployment:** Gradio, Hugging Face Spaces
* **Version Control:** Git LFS (Large File Storage for model binaries)

---

## 📂 Project Structure

* `app.py`: The main entry point containing the OpenCV segmentation logic and Gradio UI.
* `requirements.txt`: List of dependencies (torch, opencv-python, transformers, etc.).
* `.gitattributes`: Configuration for Git LFS to track large model files.
* `README.md`: Documentation and project metadata.

---