Project Definition - Project Structure (Where is the code?)

sentiment-analysis-of-amazon-reviews-using-machine-learning/
├── data/
│   ├── models/          # Saved model files (.joblib)
│   ├── processed/       # Cleaned & feature-engineered datasets
│   │   ├── balanced_sample_train.csv
│   │   ├── feat_eng_train.csv
│   │   ├── processed_test.csv
│   │   ├── processed_train.csv
│   │   ├── processed_valid.csv
│   │   ├── y_test.csv
│   │   ├── y_train.csv
│   │   └── y_valid.csv
│   ├── raw/             # Original immutable dataset
│   │   ├── readme.txt
│   │   ├── train.csv
│   │   └── test.csv
│   ├── samples/         # Small sample files for quick testing
│   │   ├── sample_test.csv
│   │   ├── sample_train.csv
│   │   └── sample_valid.csv
│   └── vectorizers/     # Saved vectorizers and sparse matrices (TF-IDF)
│       ├── tfidf_vectorizer.joblib
│       ├── X_test_tfidf.npz
│       ├── X_train_tfidf.npz
│       └── X_valid_tfidf.npz
│
├── docs/
│   ├── 00_research/
│   │   ├── datasets.md
│   │   ├── references.md
│   │   └── related_projects.md
│   ├── 01_project_definition/
│   │   ├── 00_quickstart.md
│   │   ├── 01_problem.md
│   │   ├── 02_goal.md
│   │   ├── 03_solution.md
│   │   ├── 04_stack.md
│   │   ├── 05_architecture.md
│   │   ├── 06_workflow.md
│   │   ├── 07_structure.md
│   │   └── 08_report.md
│   └── 02_results/     # Model prediction outputs
│
├── notebooks/
│   ├── 00_quickstartt.ipynb
│   ├── 01_data_acquisition.ipynb
│   ├── 02_eda.ipynb
│   ├── 03_data_preprocessing.ipynb
│   ├── 04_feature_engineering.ipynb
│   ├── 05_logistic_regression.ipynb
│   ├── 06_naive_bayes.ipynb
│   ├── 07_support_vector_machine.ipynb
│   ├── 08_k_nearest_neighbors.ipynb
│   ├── 09_decision_trees.ipynb
│   ├── 10_random_forest.ipynb
│   ├── 11_stochastic_gradient_descent.ipynb
│   ├── 12_xgboost.ipynb
│   ├── 13_lightgbm.ipynb
│   └── 14_comparsion.ipynb
│
├── src/                 # Production-style source code and helpers
│   ├── config/
│   │   ├── __init__.py
│   │   └── settings.py   # Configuration values and constants
│   └── utils/
│       ├── __init__.py
│       └── helpers.py     # Helper functions used by notebooks and app
│
├── .env                 # Environment variables
├── .env.example         # Example of environment variables
├── .gitattributes
├── .gitignore           # List of files to ignore by git
├── app.py               # App/runner for model inference or demo
├── README.md            # Project overview and instructions to run
└── requirements.txt     # List of dependencies (pandas, scikit-learn, etc.)
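
Because notebooks and `app.py` both depend on files under `data/`, it can help to check that the expected artifacts exist before running anything. The snippet below is a minimal sketch, not code from this repository: the names `ARTIFACTS`, `resolve_artifacts`, and `missing_artifacts` are hypothetical, and it assumes the script is run from the project root shown in the tree above. Only the relative paths are taken from the tree.

```python
from pathlib import Path

# Assumption: the working directory is the repository root from the tree above.
PROJECT_ROOT = Path(".")

# Hypothetical mapping; the keys are illustrative, the relative paths come from the tree.
ARTIFACTS = {
    "vectorizer": Path("data/vectorizers/tfidf_vectorizer.joblib"),
    "X_train": Path("data/vectorizers/X_train_tfidf.npz"),
    "X_valid": Path("data/vectorizers/X_valid_tfidf.npz"),
    "X_test": Path("data/vectorizers/X_test_tfidf.npz"),
    "y_train": Path("data/processed/y_train.csv"),
    "y_valid": Path("data/processed/y_valid.csv"),
    "y_test": Path("data/processed/y_test.csv"),
}


def resolve_artifacts(root=PROJECT_ROOT):
    """Return a dict mapping each artifact name to its path under `root`."""
    return {name: Path(root) / rel for name, rel in ARTIFACTS.items()}


def missing_artifacts(root=PROJECT_ROOT):
    """Return the sorted names of artifacts that do not exist under `root`."""
    paths = resolve_artifacts(root)
    return sorted(name for name, path in paths.items() if not path.exists())


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All expected artifacts are present.")
```

Once the check passes, the `.joblib` file would presumably be read with `joblib.load` and the `.npz` matrices with `scipy.sparse.load_npz`. That is an assumption based on the file extensions and the "sparse matrices" comment in the tree. The tree itself does not confirm it.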