# Project Definition - Project Structure (Where is the code?)
```text
sentiment-analysis-of-amazon-reviews-using-machine-learning/
├── data/
│   ├── models/                          # Saved model files (.joblib)
│   ├── processed/                       # Cleaned & feature-engineered datasets
│   │   ├── balanced_sample_train.csv
│   │   ├── feat_eng_train.csv
│   │   ├── processed_test.csv
│   │   ├── processed_train.csv
│   │   ├── processed_valid.csv
│   │   ├── y_test.csv
│   │   ├── y_train.csv
│   │   └── y_valid.csv
│   ├── raw/                             # Original immutable dataset
│   │   ├── readme.txt
│   │   ├── train.csv
│   │   └── test.csv
│   ├── samples/                         # Small sample files for quick testing
│   │   ├── sample_test.csv
│   │   ├── sample_train.csv
│   │   └── sample_valid.csv
│   └── vectorizers/                     # Saved vectorizers and sparse matrices (TF-IDF)
│       ├── tfidf_vectorizer.joblib
│       ├── X_test_tfidf.npz
│       ├── X_train_tfidf.npz
│       └── X_valid_tfidf.npz
│
├── docs/
│   ├── 00_research/
│   │   ├── datasets.md
│   │   ├── references.md
│   │   └── related_projects.md
│   ├── 01_project_definition/
│   │   ├── 00_quickstart.md
│   │   ├── 01_problem.md
│   │   ├── 02_goal.md
│   │   ├── 03_solution.md
│   │   ├── 04_stack.md
│   │   ├── 05_architecture.md
│   │   ├── 06_workflow.md
│   │   ├── 07_structure.md
│   │   └── 08_report.md
│   └── 02_results/                      # Model prediction outputs
│
├── notebooks/
│   ├── 00_quickstartt.ipynb
│   ├── 01_data_acquisition.ipynb
│   ├── 02_eda.ipynb
│   ├── 03_data_preprocessing.ipynb
│   ├── 04_feature_engineering.ipynb
│   ├── 05_logistic_regression.ipynb
│   ├── 06_naive_bayes.ipynb
│   ├── 07_support_vector_machine.ipynb
│   ├── 08_k_nearest_neighbors.ipynb
│   ├── 09_decision_trees.ipynb
│   ├── 10_random_forest.ipynb
│   ├── 11_stochastic_gradient_descent.ipynb
│   ├── 12_xgboost.ipynb
│   ├── 13_lightgbm.ipynb
│   └── 14_comparsion.ipynb
│
├── src/                                 # Production-style source code and helpers
│   ├── config/
│   │   ├── __init__.py
│   │   └── settings.py                  # Configuration values and constants
│   └── utils/
│       ├── __init__.py
│       └── helpers.py                   # Helper functions used by notebooks and app
│
├── .env                                 # Environment variables
├── .env.example                         # Example of environment variables
├── .gitattributes
├── .gitignore                           # List of files to ignore by git
├── app.py                               # App/runner for model inference or demo
├── README.md                            # Project overview and instructions to run
└── requirements.txt                     # List of dependencies (pandas, scikit-learn, etc.)
```
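
A layout like the one above is usually easier to work with when the paths are defined in one place. The following is a minimal sketch of how such paths could be derived with `pathlib`, assuming only the directory names shown in the tree. The `ProjectPaths` class and its `missing_dirs` method are hypothetical names used for illustration; they are not taken from `src/config/settings.py`, whose actual contents are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectPaths:
    """Hypothetical path container mirroring the data/ layout in the tree."""

    root: Path  # repository root; an assumption supplied by the caller

    @property
    def data(self) -> Path:
        return self.root / "data"

    @property
    def raw(self) -> Path:
        return self.data / "raw"

    @property
    def processed(self) -> Path:
        return self.data / "processed"

    @property
    def samples(self) -> Path:
        return self.data / "samples"

    @property
    def models(self) -> Path:
        return self.data / "models"

    @property
    def vectorizers(self) -> Path:
        return self.data / "vectorizers"

    def missing_dirs(self) -> list[Path]:
        """Return the expected data directories that do not exist yet."""
        expected = [
            self.raw,
            self.processed,
            self.samples,
            self.models,
            self.vectorizers,
        ]
        return [path for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # Report which data directories are absent under the current directory.
    paths = ProjectPaths(root=Path.cwd())
    for path in paths.missing_dirs():
        print(f"missing: {path}")
```

Run from the repository root, this prints one line per data directory that has not been created yet, which can be a quick sanity check before opening the notebooks.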