# Project Definition - Quickstart This quickstart explains how to prepare the environment and reproduce core experiments and inference from the repository. 1) Create a Python environment and install dependencies: ``` python -m venv .venv source .venv/Scripts/activate pip install -r requirements.txt ``` 2) Inspect processed data and vectorizers (already available in repo): - `data/processed/` contains prepared train/valid/test CSVs and labels. - `data/vectorizers/` contains the fitted TF-IDF vectorizer and sparse matrices. 3) Run notebooks (recommended order): - `notebooks/01_data_acquisition.ipynb` - `notebooks/02_eda.ipynb` - `notebooks/03_data_preprocessing.ipynb` - `notebooks/04_feature_engineering.ipynb` - modeling notebooks `05_*.ipynb` → `14_comparsion.ipynb` 4) Run the demo/app: ``` streamlit run app.py ``` Notes: - Preprocessed artifacts and trained model joblib files are stored under `data/processed`, `data/vectorizers`, and `data/models` to speed up reproduction.