OCR Backend
Backend API for OCR on handwritten images.
Setup
- Create a virtual environment:
  python -m venv venv
- Activate the environment:
  - Windows:
    .\venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- For Tesseract OCR: Install Tesseract on your system. Download from Tesseract GitHub.
  - If pytesseract can't find Tesseract, you might need to set the path in app.py:
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
- For PDF processing: Install Poppler. Download from Poppler for Windows.
  - You will need to update the poppler_path in app.py to point to the bin directory of your Poppler installation (e.g., r'C:\Program Files\poppler-0.68.0\bin').
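The two path settings above can be resolved at startup rather than hard-coded. The following is a minimal sketch, not the code in app.py: the helper names resolve_tesseract_cmd and resolve_poppler_path are hypothetical, and it assumes that Poppler's pdftoppm utility sits in the bin directory that poppler_path should point to. Each helper checks PATH first and falls back to the Windows paths quoted above.

```python
import os
import shutil

# Paths quoted in the setup notes; adjust them to your own installation.
DEFAULT_TESSERACT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
DEFAULT_POPPLER_BIN = r"C:\Program Files\poppler-0.68.0\bin"


def resolve_tesseract_cmd(fallback=DEFAULT_TESSERACT):
    """Return the Tesseract executable found on PATH, else the fallback path."""
    return shutil.which("tesseract") or fallback


def resolve_poppler_path(fallback=DEFAULT_POPPLER_BIN):
    """Return the directory holding Poppler's pdftoppm, else the fallback path."""
    found = shutil.which("pdftoppm")
    return os.path.dirname(found) if found else fallback


if __name__ == "__main__":
    print("tesseract_cmd:", resolve_tesseract_cmd())
    print("poppler_path:", resolve_poppler_path())
```

In app.py, the first result could then be assigned to pytesseract.pytesseract.tesseract_cmd and the second used as the poppler_path value.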
Run
python app.py
The API will be available at http://127.0.0.1:5000.
API Endpoints
POST /easyocr
Uses EasyOCR to extract text from images.
Request: multipart/form-data with an images field (one or more image files).
Example (curl):
curl -X POST -F "images=@/path/to/your/image1.png" http://127.0.0.1:5000/easyocr
POST /tesseract
Uses Tesseract OCR to extract text from images.
Request: multipart/form-data with an images field (one or more image files).
Example (curl):
curl -X POST -F "images=@/path/to/your/image1.png" http://127.0.0.1:5000/tesseract
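The same upload can be made from Python. The sketch below uses only the standard library and sends one or more files under the images field, which is what both /easyocr and /tesseract expect. The function names encode_multipart and post_images are hypothetical, and because the response format is not documented here, the raw response bytes are returned unparsed.

```python
import mimetypes
import os
import urllib.request
import uuid


def encode_multipart(field_name, paths, boundary=None):
    """Encode files as multipart/form-data, repeating one field name per file."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for path in paths:
        filename = os.path.basename(path)
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        with open(path, "rb") as fh:
            payload = fh.read()
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + payload + b"\r\n")
    # The closing delimiter marks the end of the form body.
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def post_images(endpoint, paths, base_url="http://127.0.0.1:5000"):
    """POST images to /easyocr or /tesseract and return the raw response bytes."""
    body, content_type = encode_multipart("images", paths)
    request = urllib.request.Request(
        base_url + endpoint,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the server running, post_images("/tesseract", ["/path/to/your/image1.png"]) mirrors the curl call above; passing several paths sends them all in one request.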
POST /process_question_paper
Processes an image or PDF of a question paper to extract questions and answers.
Request: multipart/form-data with a file field (a single image or PDF file).
Example (curl for image):
curl -X POST -F "file=@/path/to/your/question_paper.png" http://127.0.0.1:5000/process_question_paper
Example (curl for PDF):
curl -X POST -F "file=@/path/to/your/question_paper.pdf" http://127.0.0.1:5000/process_question_paper
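Since this endpoint takes either an image or a PDF, a client may want to check the file type before uploading. The following is a hypothetical client-side check, not part of the backend: classify_upload is an illustrative name, it inspects the file extension only, and which types the server actually accepts is not documented here.

```python
import mimetypes


def classify_upload(path):
    """Return 'pdf' or 'image' based on the file name, else raise ValueError."""
    content_type, _ = mimetypes.guess_type(path)
    if content_type == "application/pdf":
        return "pdf"
    if content_type and content_type.startswith("image/"):
        return "image"
    raise ValueError(f"not an image or PDF: {path!r}")
```

A caller could run this on the chosen path and skip the request when it raises, rather than sending a file the server is unlikely to process.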
GET /evaluate_answers
Compares the OCR-extracted text with the answers from the last processed question paper.
Request: None (GET request).
Example (curl):
curl -X GET http://127.0.0.1:5000/evaluate_answers
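A Python equivalent of this request is sketched below using the standard library. Because the comparison uses the last processed question paper, it presumably only makes sense to call this after a file has been uploaded to /process_question_paper. The response format is not documented here, so this sketch assumes it may be JSON and falls back to plain text otherwise; fetch_evaluation and decode_body are hypothetical names.

```python
import json
import urllib.request


def decode_body(raw):
    """Parse the body as JSON if possible, else return it as text."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def fetch_evaluation(base_url="http://127.0.0.1:5000"):
    """GET /evaluate_answers and return the decoded response body."""
    url = base_url.rstrip("/") + "/evaluate_answers"
    with urllib.request.urlopen(url) as response:
        return decode_body(response.read())
```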