Spaces:
Sleeping
Sleeping
File size: 3,254 Bytes
697659c ce2c75c 697659c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
---
title: GOT OCR Web App # Title of your app
emoji: π # You can choose any emoji that represents your app
colorFrom: blue # Start color for the gradient background
colorTo: green # End color for the gradient background
sdk: streamlit # Your app uses Streamlit
sdk_version: "1.21.0" # Version of Streamlit you are using
app_file: app.py # Entry point of your application
pinned: false # Whether this Space should be pinned on your profile
---
# OCR Web Application
## Project Overview
This is a **web-based Optical Character Recognition (OCR) application** built using Streamlit. The app supports both English and Hindi languages, allowing users to upload images and extract text using advanced OCR models.
## How the Application Works
1. Choose Language: Select either English or Hindi using the sidebar instructions.
2. Upload Image: Use the file uploader to input an image in JPG, PNG, or JPEG format.
3. Text Extraction: For English, the app uses the GOT OCR 2.0 model to extract text, while for Hindi, it leverages EasyOCR.
4. Keyword Search: After text extraction, you can search for specific keywords within the extracted text. Matching keywords will be highlighted, and any missing keywords will be displayed in a warning message.
5. Reset: If needed, reset the session and upload a new image to start over.
## Installation and Setup
### Prerequisites:
- **Python 3.8 or higher**
- Required libraries listed in `requirements.txt`
### Installation Steps:
1. **Clone the repository**:
```bash
git clone https://github.com/Trisandhyadevi/OCR.git
2. **Navigate to the project directory**
```bash
cd OCR
3. **Install the required dependencies:**
```bash
pip install -r requirements.txt
4. **Run the application:**
```bash
streamlit run app.py
# Description
This web application supports converting images to text using the GOT OCR 2.0 Model. Below are some key features of the GOT OCR 2.0 model
# GOT OCR 2.0 Model Overview
The GOT OCR 2.0 Model is a state-of-the-art OCR system designed for accurate text extraction from images. Key features include:
- **Multi-task Learning**: The model supports various tasks beyond OCR, including layout analysis and object detection, making it versatile for diverse text recognition needs.
- **End-to-End Pipeline**: It efficiently processes entire images, identifying and extracting text without the need for additional preprocessing steps.
Note: Currently, the model does not support all languages. Fine-tuning is required for languages not included in the pre-trained model. For more information on fine-tuning, visit the [GOT OCR 2.0 Fine-tuning Guide](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/?tab=readme-ov-file#fine-tune).
For more technical details about the model architecture and usage, visit the [GOT OCR 2.0 Model Documentation](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/?tab=readme-ov-file#general-ocr-theory-towards-ocr-20-via-a-unified-end-to-end-model).
## Deployment
To deploy the application to a cloud platform(Hugging Face)
## Folder Structure
```bash
.
βββ app.py # Main application file
βββ requirements.txt # Python dependencies
βββ README.md # Projectdocumentation
|