---
title: Homework Validation System
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# Homework Validation System (FastAPI)

A backend API that validates student homework: it extracts text from teacher and student files, compares the answers, and generates remarks using rule-based logic and, optionally, an AI model.

---

## Features

- Upload teacher and student homework files
- OCR support for images and scanned PDFs
- Text extraction from PDF and DOCX
- Similarity matching using TF-IDF + cosine similarity
- Optional AI-generated remarks (OpenAI / Gemini)
- FastAPI Swagger documentation

---

## Tech Stack

- FastAPI
- Python
- pytesseract
- Pillow
- pypdf / pdf2image
- python-docx
- scikit-learn
- OpenAI / Gemini (optional)

---

## Project Structure

    homework_validation_system/
    │
    ├── app.py
    ├── requirements.txt
    ├── artifacts/
    ├── uploads/
    ├── src/
    │   ├── extractors.py
    │   ├── similarity.py
    │   ├── llm_client.py
    │   └── utils.py
    └── README.md

## Installation

### 1. Create Virtual Environment

    python -m venv myenv

### 2. Install Requirements

    pip install -r requirements.txt

## OCR Setup (Required)

### Install Tesseract OCR

This project uses **Tesseract OCR** to extract text from images and scanned PDFs.

#### Windows

1. Download and install Tesseract OCR.
2. Default installation path:
3. Add this path in your code:

       pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

## Run API

    uvicorn app:app --reload --host 0.0.0.0 --port 8000

Swagger UI: http://localhost:8000/docs

## Example API Response

    {
      "student_id": 1,
      "homework_id": 10,
      "status": "Needs Review",
      "match_percentage": 72,
      "teacher_extracted_text": "...",
      "student_extracted_text": "...",
      "ai_generated_remark": "Good attempt but missing key points.",
      "llm_used": true
    }
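
The TF-IDF + cosine similarity matching listed under Features can be illustrated with a short sketch. This is not the code in `src/similarity.py`: the project lists scikit-learn for this step, while the version below uses only the Python standard library so the arithmetic is visible. The function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `match_percentage`), the tokenization rule, the smoothed IDF formula, and the rounding to a whole percentage are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term count times a smoothed IDF,
    ln((1 + n_docs) / (1 + doc_freq)) + 1, so a term that appears in
    every document still carries some weight.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_percentage(teacher_text, student_text):
    """Hypothetical helper: similarity of the two texts as a whole percentage."""
    teacher_vec, student_vec = tfidf_vectors([teacher_text, student_text])
    return round(100 * cosine_similarity(teacher_vec, student_vec))


if __name__ == "__main__":
    teacher = "Photosynthesis converts light energy into chemical energy."
    student = "Photosynthesis converts light into sugar."
    print(match_percentage(teacher, student))
```

Identical texts score 100 and texts with no words in common score 0. How a score maps to a `status` such as "Needs Review" depends on thresholds this README does not state, so the sketch stops at the percentage.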