Project Definition - Quickstart
This quickstart explains how to prepare the environment and reproduce core experiments and inference from the repository.
- Create a Python environment and install dependencies:
python -m venv .venv
source .venv/Scripts/activate   # Windows (Git Bash); on Linux/macOS use: source .venv/bin/activate
pip install -r requirements.txt
- Inspect processed data and vectorizers (already available in repo):
  - data/processed/ contains prepared train/valid/test CSVs and labels.
  - data/vectorizers/ contains the fitted TF-IDF vectorizer and sparse matrices.
- Run notebooks (recommended order):
  - notebooks/01_data_acquisition.ipynb
  - notebooks/02_eda.ipynb
  - notebooks/03_data_preprocessing.ipynb
  - notebooks/04_feature_engineering.ipynb
  - modeling notebooks 05_*.ipynb → 14_comparsion.ipynb
- Run the demo/app:
streamlit run app.py
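The recommended notebook order follows the numeric filename prefixes (01_, 02_, ... 14_). As a minimal sketch, assuming the notebooks live in a notebooks/ directory and keep that prefix convention, the order can be listed programmatically. The helper name ordered_notebooks is hypothetical and not part of the repository:

```python
import re
from pathlib import Path


def ordered_notebooks(notebook_dir):
    """Return .ipynb paths sorted by their leading numeric prefix (01_, 02_, ...).

    Hypothetical helper: assumes the "NN_name.ipynb" naming used in this quickstart.
    """
    numbered = []
    for path in Path(notebook_dir).glob("*.ipynb"):
        match = re.match(r"(\d+)_", path.name)
        if match:  # notebooks without a numeric prefix are skipped
            numbered.append((int(match.group(1)), path.name, path))
    # Sort by the numeric prefix first, then by file name for ties.
    numbered.sort(key=lambda item: (item[0], item[1]))
    return [path for _, _, path in numbered]


if __name__ == "__main__":
    for notebook in ordered_notebooks("notebooks"):
        print(notebook)
```

Each listed notebook can then be opened in Jupyter and run in that sequence. Sorting on the parsed integer rather than the raw string keeps the order correct even if a prefix is not zero-padded.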
Notes:
- Preprocessed artifacts and trained model joblib files are stored under data/processed, data/vectorizers, and data/models to speed up reproduction.
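Before launching the app or skipping the earlier notebooks, it may help to confirm that these artifact directories are actually present and non-empty. The following is a sketch only: the three directory names come from the note above, while the function name missing_artifacts and the "empty counts as missing" rule are assumptions:

```python
from pathlib import Path

# Directory names taken from the note above, relative to the repository root.
ARTIFACT_DIRS = ("data/processed", "data/vectorizers", "data/models")


def missing_artifacts(repo_root="."):
    """Return the artifact directories that are absent or contain no entries.

    Hypothetical helper: treats an empty directory the same as a missing one.
    """
    root = Path(repo_root)
    missing = []
    for rel in ARTIFACT_DIRS:
        folder = root / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    gaps = missing_artifacts()
    if gaps:
        print("Missing or empty:", ", ".join(gaps))
    else:
        print("All artifact directories are populated.")
```

If any directory is reported, rerunning the corresponding notebooks should regenerate its contents.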