	# Project Definition - Quickstart

	This quickstart explains how to set up the environment, reproduce the core experiments, and run inference with this repository.

	1) Create a Python environment and install dependencies:

	```
	python -m venv .venv
	# Windows (Git Bash) path; on Linux/macOS use: source .venv/bin/activate
	source .venv/Scripts/activate
	pip install -r requirements.txt
	```
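To confirm that the environment from step 1 is active, a quick check with the standard library is enough. This is a minimal sketch; the helper name `in_virtualenv` is illustrative and not part of the repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    status = "active" if in_virtualenv() else "not active"
    print(f"Virtual environment is {status} (prefix: {sys.prefix})")
```

If the check reports "not active", re-run the activation command for your platform before installing the requirements.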

	2) Inspect the processed data and vectorizers (already included in the repo):

	- `data/processed/` contains prepared train/valid/test CSVs and labels.
	- `data/vectorizers/` contains the fitted TF-IDF vectorizer and sparse matrices.
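To inspect these directories without assuming specific file names, a short script can list what each one contains. This is a sketch only: `list_artifacts` is a hypothetical helper, and the two directory paths are the only details taken from this quickstart.

```python
from pathlib import Path

# Directory names taken from this quickstart; the files inside are not assumed.
ARTIFACT_DIRS = ("data/processed", "data/vectorizers")


def list_artifacts(root=".", dirs=ARTIFACT_DIRS):
    """Map each artifact directory to sorted (file name, size in bytes) pairs."""
    listing = {}
    for rel in dirs:
        folder = Path(root) / rel
        if not folder.is_dir():
            listing[rel] = None  # directory missing, e.g. wrong working directory
            continue
        listing[rel] = sorted(
            (p.name, p.stat().st_size) for p in folder.iterdir() if p.is_file()
        )
    return listing


if __name__ == "__main__":
    for rel, files in list_artifacts().items():
        if files is None:
            print(f"{rel}: not found")
            continue
        print(f"{rel}: {len(files)} file(s)")
        for name, size in files:
            print(f"  {name} ({size} bytes)")
```

Run it from the repository root so the relative paths resolve; a "not found" line usually means the script was started from a different directory.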

	3) Run notebooks (recommended order):

	- `notebooks/01_data_acquisition.ipynb`
	- `notebooks/02_eda.ipynb`
	- `notebooks/03_data_preprocessing.ipynb`
	- `notebooks/04_feature_engineering.ipynb`
	- modeling notebooks, from `05_*.ipynb` through `14_comparsion.ipynb`
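The recommended order follows the numeric prefix of each notebook, so it can be derived from the file names. The sketch below sorts notebooks by that prefix and prints one `jupyter nbconvert` command per notebook. It assumes every notebook lives in `notebooks/` and starts with a number; `ordered_notebooks` is an illustrative helper, not something the repository provides.

```python
import re
from pathlib import Path


def ordered_notebooks(folder="notebooks"):
    """Return notebooks whose names start with digits, sorted by that number."""
    numbered = []
    for path in Path(folder).glob("*.ipynb"):
        match = re.match(r"(\d+)", path.name)
        if match:  # skip notebooks without a numeric prefix
            numbered.append((int(match.group(1)), path.name, path))
    return [path for _, _, path in sorted(numbered)]


if __name__ == "__main__":
    for nb in ordered_notebooks():
        # Print the command instead of running it, so nothing executes by surprise.
        print(f"jupyter nbconvert --to notebook --execute --inplace {nb}")
```

Printing the commands rather than executing them keeps the sketch safe to run; the printed lines can be pasted into a shell once the order looks right.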

	4) Run the demo/app:

	```
	streamlit run app.py
	```

	Notes:
	- Preprocessed artifacts and trained model joblib files are stored under `data/processed`, `data/vectorizers`, and `data/models` to speed up reproduction.