# Project Definition - Quickstart

This quickstart explains how to set up the environment, reproduce the core experiments, and run inference from this repository.

1) Create a Python environment and install dependencies:

```
python -m venv .venv
source .venv/Scripts/activate   # Windows (Git Bash); on Linux/macOS use: source .venv/bin/activate
pip install -r requirements.txt
```

2) Inspect the processed data and vectorizers (already included in the repo):

- `data/processed/` contains the prepared train/valid/test CSVs and labels.
- `data/vectorizers/` contains the fitted TF-IDF vectorizer and sparse matrices.

3) Run the notebooks (recommended order):

- `notebooks/01_data_acquisition.ipynb`
- `notebooks/02_eda.ipynb`
- `notebooks/03_data_preprocessing.ipynb`
- `notebooks/04_feature_engineering.ipynb`
- modeling notebooks, from `05_*.ipynb` through `14_comparsion.ipynb`

4) Run the demo app:

```
streamlit run app.py
```

Notes:

- Preprocessed artifacts and trained model joblib files are stored under `data/processed`, `data/vectorizers`, and `data/models` to speed up reproduction.
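
Before running the notebooks or the app, it can help to confirm that the stored artifacts are actually present in your checkout. The following is a minimal sketch, not part of the repository: `list_artifacts` and `ARTIFACT_DIRS` are hypothetical names introduced here, and the script assumes it is run from the repository root. It makes no assumption about individual file names and only reports what it finds in the three directories named above.

```python
from pathlib import Path

# Directories named in the notes above, relative to the repository root.
# ARTIFACT_DIRS and list_artifacts are illustrative names, not repo code.
ARTIFACT_DIRS = ("data/processed", "data/vectorizers", "data/models")


def list_artifacts(repo_root="."):
    """Map each artifact directory to the sorted names of the files in it.

    A directory that does not exist maps to None, so a missing directory
    can be told apart from an empty one.
    """
    root = Path(repo_root)
    found = {}
    for rel in ARTIFACT_DIRS:
        folder = root / rel
        if not folder.is_dir():
            found[rel] = None
            continue
        # Only direct children that are regular files are listed.
        found[rel] = sorted(p.name for p in folder.iterdir() if p.is_file())
    return found


if __name__ == "__main__":
    for rel, names in list_artifacts().items():
        if names is None:
            print(f"{rel}: missing")
        else:
            print(f"{rel}: {len(names)} file(s)")
            for name in names:
                print(f"  {name}")
```

If a directory is reported as missing or empty, rerunning the corresponding notebooks in the recommended order should regenerate its contents.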