| <div style="display: flex; align-items: center; justify-content: center;"> |
| <div style="margin-right: 20px;"> |
| <img src="https://cdn-lfs-us-1.hf.co/repos/de/fb/defb007867acd8852f4a283e9b06a933778826b18ed58ade01da945f5903795d/8b7831230df7d554c74f5e249e23be57165d143fea0ea7b5dde56dde5c13c95b?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27turing-test.gif%3B+filename%3D%22turing-test.gif%22%3B&response-content-type=image%2Fgif&Expires=1730008247&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczMDAwODI0N319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zL2RlL2ZiL2RlZmIwMDc4NjdhY2Q4ODUyZjRhMjgzZTliMDZhOTMzNzc4ODI2YjE4ZWQ1OGFkZTAxZGE5NDVmNTkwMzc5NWQvOGI3ODMxMjMwZGY3ZDU1NGM3NGY1ZTI0OWUyM2JlNTcxNjVkMTQzZmVhMGVhN2I1ZGRlNTZkZGU1YzEzYzk1Yj9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=GBUn-4z3PMBTqT0NdT3H-NyZxNMGcN4zDNzK8ql%7ESwLF8pXzkH783GSCZQQYWwE-v1g90JTulsOt7z5szigK49ApFju6bkS2zwUAYxNttcl3c-VYrxGuFWYnkHpTQ73qbs3ELF2-5LzDy1ARpj3BOlSEXtH9ShwCRm-R0llQJ6EDx2eOyBIDg-Pgrx%7EKIxrdAZCNln9tJk74TrSN5survdIvcSZrSIGXc3tpFLm-BwpY6qtID3ltrPEHYWDrQ5ALV8lXqKmpVlFSq3lOEFlSa-opFJwe%7E8FIIwP5mJgtCZzlQQylRhsVLxDQ2cJYpTbZSvEVkfjyTxOP4dc%7EDz1tVQ__&Key-Pair-Id=K24J24Z295AEI9" |
| alt="AI App Icon" width="100" height="50" |
| style="border-radius: 20px; border: 2px solid #333;"> |
| </div> |
| <div> |
| <p style="font-size: 50px; font-weight: bold; text-align: center; margin: 0;"> |
| Spacy Model Creator |
| </p> |
| </div> |
| </div> |
| <hr> |
|
|
| # Overview: |
| This project is a resume-parsing tool built in Python. |
| It uses the Mistral-Nemo-Instruct-2407 model for primary parsing. |
|
|
| # Installation Guide: |
|
|
| 1. Create and Activate a Virtual Environment |
|     python -m venv venv |
|     source venv/bin/activate    # For Linux/Mac |
|     venv\Scripts\activate       # For Windows (use instead of the line above) |
| |
| NOTE: If the virtual environment (venv) already exists, skip the creation step and only activate it: |
| - For Linux/Mac: source venv/bin/activate |
| - For Windows: venv\Scripts\activate |
| |
| 2. Install Required Libraries |
| pip install -r requirements.txt |
| |
| Ensure requirements.txt includes the following dependencies: |
| - Flask |
| - spaCy |
| - huggingface_hub |
| - PyMuPDF |
| - python-docx |
| - Tesseract-OCR (for image-based parsing; the OCR engine itself is a system program installed separately, not through pip) |
| |
| NOTE: If any model or library is not installed, you can install it using: |
|     pip install <model_name> |
| Replace <model_name> with the specific model or library you need to install. |
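To see which of the listed dependencies are actually available in the active environment, a small check like the one below may help. It is only a sketch: the import names (`fitz` for PyMuPDF, `docx` for python-docx) are the names these packages commonly expose, and the `check_dependencies` helper is hypothetical, not part of this project.

```python
import importlib.util
import shutil

# Assumed import names: PyMuPDF is commonly imported as "fitz"
# and python-docx as "docx". Adjust if your versions differ.
PYTHON_MODULES = {
    "Flask": "flask",
    "spaCy": "spacy",
    "huggingface_hub": "huggingface_hub",
    "PyMuPDF": "fitz",
    "python-docx": "docx",
}


def check_dependencies(modules=PYTHON_MODULES, binaries=("tesseract",)):
    """Return {name: True/False} for each Python module and system binary."""
    status = {}
    for label, module_name in modules.items():
        # find_spec returns None when a top-level module cannot be located.
        status[label] = importlib.util.find_spec(module_name) is not None
    for binary in binaries:
        # Tesseract is an executable looked up on PATH, not a pip package.
        status[binary] = shutil.which(binary) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_dependencies().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as missing can then be installed with pip, except Tesseract, which has to be installed through the operating system.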
| |
| 3. Set up Hugging Face Token |
| - Add your Hugging Face token to the .env file as: |
| HF_TOKEN=<your_huggingface_token> |
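How the application reads the token is not shown here, so the following is only a sketch of one way to do it with the standard library. The helper names `load_env_file` and `get_hf_token` are hypothetical, and the parser handles only plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_hf_token(path=".env"):
    """Return HF_TOKEN from the environment, falling back to the .env file."""
    token = os.environ.get("HF_TOKEN") or load_env_file(path).get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to the .env file.")
    return token
```

If `utils/model.py` uses `huggingface_hub.InferenceClient` (an assumption, the source does not say), the value returned by `get_hf_token()` could be passed as its `token` argument.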
| |
|
|
| # File Structure Overview: |
| Spacy_Model_creator/ |
| │ |
| ├── Models/ |
| │   └── ner_model_05_3        # Pretrained spaCy model directory for resume parsing |
| │ |
| ├── data/ |
| │   ├── Json_data.json |
| │   ├── resume_text.txt |
| │   └── Spacy_data.spacy |
| │ |
| ├── templates/ |
| │   ├── anoter.html |
| │   ├── result.html |
| │   ├── guide.html |
| │   ├── savejson.html |
| │   ├── savespacy.html |
| │   ├── text.html |
| │   ├── upload.html |
| │   └── data_files.html |
| │ |
| ├── JSON/ |
| │   └── Json_data.json |
| │ |
| ├── utils/ |
| │   ├── model.py              # Code for calling Mistral API and handling responses |
| │   ├── json_to_spacy.py      # spaCy fallback model for parsing resumes |
| │   ├── anoter_to_json.py     # Error handling utilities |
| │   └── file_To_text.py       # Functions to extract text from different file formats (PDF, DOCX, etc.) |
| │ |
| ├── venv/                     # Virtual environment |
| │ |
| ├── .env                      # Environment variables file (contains Hugging Face token) |
| │ |
| ├── app.py                    # Flask app handling API routes for uploading and processing resumes |
| │ |
| └── requirements.txt          # Dependencies required for the project |
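The contents of `utils/file_To_text.py` are not shown, so the sketch below only illustrates how text extraction for several file formats might be dispatched by extension. The function names and the `EXTRACTORS` table are hypothetical. The PDF and DOCX branches rely on PyMuPDF (`fitz`) and python-docx (`docx`), imported lazily so that plain-text files work even when those libraries are missing.

```python
from pathlib import Path


def extract_pdf_text(path):
    """Concatenate the text of every page using PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily

    with fitz.open(str(path)) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_docx_text(path):
    """Join the paragraphs of a .docx file using python-docx."""
    import docx  # python-docx, imported lazily

    return "\n".join(p.text for p in docx.Document(str(path)).paragraphs)


def extract_plain_text(path):
    """Read a plain-text file, replacing undecodable bytes."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Hypothetical mapping from file extension to extractor function.
EXTRACTORS = {
    ".pdf": extract_pdf_text,
    ".docx": extract_docx_text,
    ".txt": extract_plain_text,
}


def file_to_text(path):
    """Pick an extractor from the file extension and return the text."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None
    return extractor(path)
```

A table of extractors keeps each format isolated, so image-based parsing with Tesseract could be added later as another entry without touching the existing branches.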
| |
| # References: |
|
|
| - [Flask Documentation](https://flask.palletsprojects.com/) |
| - [spaCy Documentation](https://spacy.io/usage) |
| - [Hugging Face Hub API](https://huggingface.co/docs/huggingface_hub/index) |
| - [PyMuPDF (MuPDF) Documentation](https://pymupdf.readthedocs.io/en/latest/) |
| - [python-docx Documentation](https://python-docx.readthedocs.io/en/latest/) |
| - [Tesseract OCR Documentation](https://github.com/UB-Mannheim/tesseract/wiki) |
| - [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html) |