sentiment-sleuth / docs /01_project_definition /06_workflow.md

Project Definition - Operational Workflow (based on the stack and architecture files)

ML Engineering Workflow

We will follow an iterative, agile approach to this project:

Exploratory Data Analysis (EDA): Understand the class distribution (are there more positive reviews than negative ones?) and text-length statistics.
Data Split: Hold out 20% of the data strictly for final testing to prevent data leakage.
The "Dummy" Baseline: First, train a Naive Bayes model to set our minimum performance threshold. If a more complex model such as a Random Forest can't beat Naive Bayes, we don't use it.
Experimentation: Loop through our list of models (Logistic Regression, SVM, etc.). Log the F1-score and training time for each.
Hyperparameter Tuning: Take the two best-performing models and use GridSearchCV to optimize their hyperparameters.
Final Evaluation: Run the optimized models on the held-out test set and generate final visual reports.
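
The steps above could be sketched roughly as follows with scikit-learn. This is a minimal illustration under several assumptions, not the project's actual code: the toy `texts` and `labels` stand in for the real review data, TF-IDF features and macro-averaged F1 are assumed choices, the candidate list and parameter grids are hypothetical, and the visual reports are omitted. Model comparison uses cross-validation on the training portion only, so the 20% hold-out is touched once, at the very end.

```python
from collections import Counter
from statistics import mean

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, cross_validate, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus standing in for the real review dataset.
pos_words = ["great", "excellent", "love", "wonderful", "happy"]
neg_words = ["terrible", "awful", "hate", "broken", "disappointed"]
texts = [f"{w} product number {i}" for i, w in enumerate(pos_words * 6)]
texts += [f"{w} product number {i}" for i, w in enumerate(neg_words * 6)]
labels = [1] * 30 + [0] * 30

# 1. EDA: class distribution and text-length statistics.
class_counts = Counter(labels)
avg_tokens = mean(len(t.split()) for t in texts)

# 2. Data split: hold out 20% for the final test only.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# 3 + 4. Baseline and experimentation: cross-validate every candidate on the
# training data, logging mean F1 and total fit time.
candidates = {
    "naive_bayes": MultinomialNB(),  # the baseline
    "logistic_regression": LogisticRegression(max_iter=1000),
    "linear_svm": LinearSVC(),
}
results = {}
for name, clf in candidates.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)
    cv = cross_validate(pipe, X_train, y_train, cv=3, scoring="f1_macro")
    results[name] = {"f1": cv["test_score"].mean(), "fit_time": cv["fit_time"].sum()}

# Reference threshold; on real data, models that fail to beat it would be dropped.
baseline_f1 = results["naive_bayes"]["f1"]

# 5. Hyperparameter tuning: grid-search the two best candidates.
# Step names follow make_pipeline's lowercase class-name convention.
param_grids = {
    "naive_bayes": {"multinomialnb__alpha": [0.1, 1.0]},
    "logistic_regression": {"logisticregression__C": [0.1, 1.0, 10.0]},
    "linear_svm": {"linearsvc__C": [0.1, 1.0, 10.0]},
}
top_two = sorted(results, key=lambda n: results[n]["f1"], reverse=True)[:2]
tuned = {}
for name in top_two:
    pipe = make_pipeline(TfidfVectorizer(), candidates[name])
    search = GridSearchCV(pipe, param_grids[name], cv=3, scoring="f1_macro")
    search.fit(X_train, y_train)
    tuned[name] = search.best_estimator_

# 6. Final evaluation: the only place the held-out test set is used.
final_scores = {
    name: f1_score(y_test, model.predict(X_test), average="macro")
    for name, model in tuned.items()
}
print(results)
print(final_scores)
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is refit inside each cross-validation fold, which keeps validation text from leaking into the features.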