OCR Backend

Backend API for OCR on handwritten images.

Create a virtual environment:
```
python -m venv venv
```
Activate the environment:
- Windows:
```
.\venv\Scripts\activate
```
- macOS/Linux: run `source venv/bin/activate`
Install dependencies:
```
pip install -r requirements.txt
```
For Tesseract OCR: install Tesseract on your system (downloads are linked from the Tesseract GitHub repository).
- If pytesseract can't find Tesseract, you might need to set the path in app.py:
```
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
```
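Rather than hard-coding the path, one option is to look for Tesseract on PATH first and only fall back to an explicit location. The sketch below is an illustration, not the code in app.py; `resolve_tesseract_cmd` is a hypothetical helper name, and the fallback is simply the example path shown above.

```python
import os
import shutil


def resolve_tesseract_cmd(fallback=r"C:\Program Files\Tesseract-OCR\tesseract.exe"):
    """Return a usable Tesseract executable path, or None if none is found."""
    # Prefer an executable that is already discoverable on PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise use the explicit install location, but only if it exists.
    if os.path.isfile(fallback):
        return fallback
    return None


if __name__ == "__main__":
    cmd = resolve_tesseract_cmd()
    if cmd is None:
        print("Tesseract not found; install it or pass the correct fallback path.")
    else:
        print(f"Using Tesseract at: {cmd}")
        # In app.py the value would then be assigned as:
        # pytesseract.pytesseract.tesseract_cmd = cmd
```

This keeps the same setting working on machines where Tesseract is already on PATH, so the Windows-specific path is only used when needed.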
For PDF processing: install Poppler (download it from Poppler for Windows).
- Update poppler_path in app.py to point to the bin directory of your Poppler installation (e.g., r'C:\Program Files\poppler-0.68.0\bin').
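To show where a Poppler path typically ends up, here is a minimal sketch that assumes the PDF pages are rendered with the pdf2image library, whose `convert_from_path` function accepts a `poppler_path` argument. This README does not state which library app.py uses, so treat that as an assumption; `build_convert_kwargs` and `pdf_to_images` are hypothetical helper names.

```python
import os


def build_convert_kwargs(poppler_path=None, dpi=200):
    """Build keyword arguments for pdf2image.convert_from_path."""
    kwargs = {"dpi": dpi}
    if poppler_path:
        # Fail early with a clear message if the bin directory is wrong.
        if not os.path.isdir(poppler_path):
            raise FileNotFoundError(f"Poppler bin directory not found: {poppler_path}")
        kwargs["poppler_path"] = poppler_path
    # When poppler_path is omitted, pdf2image looks for Poppler on PATH.
    return kwargs


def pdf_to_images(pdf_path, poppler_path=None, dpi=200):
    """Render each page of a PDF to a PIL image (assumes pdf2image is installed)."""
    # Imported lazily so this module still loads without pdf2image.
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, **build_convert_kwargs(poppler_path, dpi))


if __name__ == "__main__":
    # Placeholder file name and the example path from above; adjust both.
    pages = pdf_to_images(
        "question_paper.pdf",
        poppler_path=r"C:\Program Files\poppler-0.68.0\bin",
    )
    print(f"Rendered {len(pages)} page(s)")
```

Checking the directory before calling the library turns a misconfigured path into an immediate, readable error.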

Run the server:

python app.py

The API will be available at http://127.0.0.1:5000.

POST /easyocr

Uses EasyOCR to extract text from images.

Request: multipart/form-data with an images field containing one or more image files.

Example (curl):

curl -X POST -F "images=@/path/to/your/image1.png" http://127.0.0.1:5000/easyocr
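Because the images field accepts more than one file, a client needs to repeat that field once per image. The following is a rough sketch of doing so from Python using only the standard library; `encode_multipart` and `post_images` are hypothetical helper names, the file names are placeholders, and the response is printed as raw text because its format is not documented here.

```python
import mimetypes
import os
import urllib.request
import uuid


def encode_multipart(field_name, files):
    """Encode (filename, bytes) pairs as multipart/form-data under one field name."""
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content in files:
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # The closing boundary marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post_images(url, paths):
    """POST image files to an OCR endpoint and return the raw response text."""
    files = []
    for path in paths:
        with open(path, "rb") as fh:
            files.append((os.path.basename(path), fh.read()))
    body, content_type = encode_multipart("images", files)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder paths; replace with real image files.
    print(post_images("http://127.0.0.1:5000/easyocr", ["image1.png", "image2.png"]))
```

The same helper should work for the Tesseract endpoint by changing the URL, since both endpoints describe the same request format.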

POST /tesseract

Uses Tesseract OCR to extract text from images.

Request: multipart/form-data with an images field containing one or more image files.

Example (curl):

curl -X POST -F "images=@/path/to/your/image1.png" http://127.0.0.1:5000/tesseract

POST /process_question_paper

Processes an image or PDF of a question paper to extract questions and answers.

Request: multipart/form-data with a file field containing a single image or PDF file.

Example (curl for image):

curl -X POST -F "file=@/path/to/your/question_paper.png" http://127.0.0.1:5000/process_question_paper

Example (curl for PDF):

curl -X POST -F "file=@/path/to/your/question_paper.pdf" http://127.0.0.1:5000/process_question_paper

GET /evaluate_answers

Compares the OCR-extracted texts with the answers from the last processed question paper.

Request: GET with no body or parameters.

Example (curl):

curl -X GET http://127.0.0.1:5000/evaluate_answers
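Since this endpoint relies on the last processed question paper, it presumably only returns something useful after /process_question_paper and one of the OCR endpoints have been called. A minimal Python sketch of the final GET step follows; `parse_body` and `evaluate_answers` are hypothetical helper names, and because the response format is not documented here, the body is parsed as JSON when possible and returned as plain text otherwise.

```python
import json
import urllib.request


def parse_body(raw):
    """Decode a response body, returning parsed JSON when possible, else text."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def evaluate_answers(base_url="http://127.0.0.1:5000"):
    """Call GET /evaluate_answers and return the decoded response."""
    with urllib.request.urlopen(f"{base_url}/evaluate_answers") as response:
        return parse_body(response.read())


if __name__ == "__main__":
    # Assumes the server is running and a question paper was already processed.
    print(evaluate_answers())
```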