Ameya729 committed on
Commit
1a042a6
·
1 Parent(s): 2dd7fe7

Fix: Ensure models train during HF build + Fix RAG pipeline loading

- Created app.py entry point that trains models if they don't exist
- Updated README.md to use new app_file path
- Fixed RAG pipeline import paths for HF compatibility
- Added build.py and start_dashboard.py helper scripts
- Added comprehensive deployment documentation

This fixes the 'Models Not Loaded' and 'AI Q&A Inactive' errors on Hugging Face Spaces.

DEPLOYMENT.md ADDED
@@ -0,0 +1,99 @@
+ # Hugging Face Spaces Deployment Guide
+
+ ## Current Issue Diagnosis
+
+ Based on the error screenshot showing "Models: ❌ Not Loaded" and "AI Q&A: ❌ Inactive", the issue is that **models are not being trained during the Hugging Face build process**.
+
+ ## Solution
+
+ ### 1. **Updated Files**
+ The following files have been updated to fix the deployment:
+
+ - **`README.md`**: Changed `app_file` from `fmcg_genai/src/dashboard_app_enhanced.py` to `fmcg_genai/app.py`
+ - **`fmcg_genai/app.py`**: New entry point that ensures models are trained before launching the dashboard
+ - **`fmcg_genai/src/dashboard_app.py`**: Fixed RAG pipeline import paths for better compatibility
+
+ ### 2. **How It Works**
+
+ The new `app.py` entry point:
+ 1. Checks if models exist (`prophet.pkl`, `xgboost_sales.pkl`)
+ 2. Checks if vector store exists (`faiss_index.bin`)
+ 3. If any are missing, runs `run_pipeline.py` to train models (takes ~5-10 minutes on first build)
+ 4. Then launches the dashboard
+
+ ### 3. **Deployment Steps**
+
+ #### Option A: Push to Hugging Face (Recommended)
+ ```bash
+ # From the project root
+ cd c:\Users\91880\Downloads\archive\fmcg_demand_forecasting
+
+ # Add all changes
+ git add .
+
+ # Commit with a clear message
+ git commit -m "Fix: Ensure models are trained during HF build process"
+
+ # Push to Hugging Face
+ git push
+ ```
+
+ #### Option B: Manual Rebuild on Hugging Face
+ 1. Go to your Hugging Face Space settings
+ 2. Click "Factory reboot" to trigger a fresh build
+ 3. The new `app.py` will run and train models automatically
+
+ ### 4. **Expected Build Time**
+
+ - **First build**: ~10-15 minutes (includes model training)
+ - **Subsequent builds**: ~2-3 minutes (only when the trained models are still present on disk)
+
+ ### 5. **Verification**
+
+ After deployment, you should see:
+ - ✅ Models: Loaded
+ - ✅ AI Q&A: Active
+ - Dashboard loads without "Could not load data" error
+
+ ### 6. **Troubleshooting**
+
+ If the build fails:
+
+ 1. **Check build logs** on Hugging Face Spaces
+ 2. **Common issues**:
+    - Out of memory: Reduce batch size in `config.yaml`
+    - Timeout: Models take too long to train (HF has 1-hour build limit)
+    - Missing dependencies: Check `requirements.txt`
+
+ 3. **Quick fix**: If the build times out, you can:
+    - Train models locally
+    - Upload trained models to Hugging Face using Git LFS
+    - Skip training in `app.py`
+
+ ### 7. **Git LFS Setup (If Needed)**
+
+ If you want to commit trained models instead of training during build:
+
+ ```bash
+ # Install Git LFS
+ git lfs install
+
+ # Track large model files
+ git lfs track "fmcg_genai/models/*.pkl"
+ git lfs track "fmcg_genai/vector_store/*.bin"
+ git lfs track "fmcg_genai/vector_store/*.pkl"
+
+ # Add .gitattributes
+ git add .gitattributes
+
+ # Commit and push
+ git add fmcg_genai/models/* fmcg_genai/vector_store/*
+ git commit -m "Add pre-trained models via Git LFS"
+ git push
+ ```
+
+ No further change to `app.py` is needed: it already skips training when the model and vector store files exist.
+
+ ## Summary
+
+ The main fix is the new `app.py` entry point that ensures models are trained during the Hugging Face build process. Push the changes and rebuild your Space to fix the issue.
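One caveat with the Git LFS option in the guide above: if the LFS objects are not fetched, a tracked `.pkl` or `.bin` file exists on disk only as a small text pointer, so a plain `exists()` check would pass while loading the model fails. The sketch below shows one possible way to detect that case. It relies only on the documented first line of a Git LFS pointer file; the helper names `is_lfs_pointer` and `usable_artifact` are hypothetical and not part of this commit.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if `path` starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def usable_artifact(path: Path) -> bool:
    """Hypothetical helper: the file exists and is not an un-fetched LFS stub."""
    return path.is_file() and not is_lfs_pointer(path)
```

A check such as `usable_artifact(models_dir / "prophet.pkl")` could stand in for the bare `.exists()` test if pointer stubs turn out to be a problem in practice.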
FIX_SUMMARY.md ADDED
@@ -0,0 +1,169 @@
+ # FMCG Dashboard - Issue Diagnosis & Fix Summary
+
+ ## 🔍 **What Was Wrong**
+
+ ### Issue 1: Models Not Training During Build
+ **Problem**: The Hugging Face Space was trying to load pre-trained models that don't exist in the repository.
+
+ **Root Cause**:
+ - The `app_file` in README.md pointed directly to `dashboard_app_enhanced.py`
+ - This file expects models to already exist
+ - Models were never trained during the HF build process
+ - Result: "Models: ❌ Not Loaded" error
+
+ ### Issue 2: RAG Pipeline Not Loading
+ **Problem**: Even though vector store files exist locally, they weren't being created on HF deployment.
+
+ **Root Cause**:
+ - Vector store is generated by `run_pipeline.py`
+ - This script was never executed during HF build
+ - Import path issues in `dashboard_app.py` (hardcoded `from src.rag_pipeline`)
+ - Result: "AI Q&A: ❌ Inactive" error
+
+ ### Issue 3: Data Loading Error
+ **Problem**: "Could not load data. Please ensure data preprocessing has been completed."
+
+ **Root Cause**:
+ - Processed data files are generated by `run_pipeline.py`
+ - Without running the pipeline, `data/processed/cleaned.csv` doesn't exist on HF
+ - Dashboard can't load non-existent files
+
+ ## ✅ **What Was Fixed**
+
+ ### Fix 1: Created `app.py` Entry Point
+ **File**: `fmcg_genai/app.py`
+
+ **What it does**:
+ ```
+ 1. Checks if models exist (prophet.pkl, xgboost_sales.pkl)
+ 2. Checks if vector store exists (faiss_index.bin)
+ 3. If missing → runs run_pipeline.py to train everything
+ 4. Then launches the dashboard
+ ```
+
+ **Impact**: Models are now trained automatically during HF build
+
+ ### Fix 2: Updated README.md
+ **Change**: `app_file: fmcg_genai/src/dashboard_app_enhanced.py` → `app_file: fmcg_genai/app.py`
+
+ **Impact**: HF Spaces now uses the new entry point that handles model training
+
+ ### Fix 3: Fixed RAG Pipeline Import
+ **File**: `fmcg_genai/src/dashboard_app.py`
+
+ **Changes**:
+ - Added fallback import paths for RAG pipeline
+ - Better error logging
+ - Handles both local and HF deployment paths
+
+ **Impact**: RAG pipeline loads correctly regardless of deployment environment
+
+ ### Fix 4: Created Helper Scripts
+ **Files**:
+ - `build.py`: For local builds, checks if models need training
+ - `start_dashboard.py`: For local testing, validates environment
+ - `DEPLOYMENT.md`: Comprehensive deployment guide
+
+ ## 📋 **Deployment Checklist**
+
+ - [x] Created `app.py` entry point with model training logic
+ - [x] Updated `README.md` to use new app_file
+ - [x] Fixed RAG pipeline import paths
+ - [x] Added comprehensive error logging
+ - [x] Created deployment documentation
+
+ ## 🚀 **Next Steps**
+
+ ### To Deploy to Hugging Face:
+
+ ```bash
+ # 1. Navigate to project root
+ cd c:\Users\91880\Downloads\archive\fmcg_demand_forecasting
+
+ # 2. Stage all changes
+ git add .
+
+ # 3. Commit with descriptive message
+ git commit -m "Fix: Ensure models train during HF build + Fix RAG pipeline loading"
+
+ # 4. Push to Hugging Face
+ git push
+ ```
+
+ ### Expected Outcome:
+
+ After pushing and HF rebuilds:
+ 1. ✅ Build takes ~10-15 minutes (first time)
+ 2. ✅ Models are trained automatically
+ 3. ✅ Vector store is created
+ 4. ✅ Dashboard shows "Models: ✅ Loaded"
+ 5. ✅ Dashboard shows "AI Q&A: ✅ Active"
+ 6. ✅ No "Could not load data" error
+
+ ## 🔧 **Technical Details**
+
+ ### Build Process Flow (New):
+ ```
+ HF Starts Build
+ ↓
+ Runs app.py
+ ↓
+ Checks if models exist
+ ↓ (No)
+ Runs run_pipeline.py
+ ↓
+ 1. Data Preprocessing
+ 2. Feature Engineering
+ 3. Model Training (Prophet + XGBoost)
+ 4. Model Evaluation
+ 5. SHAP Explainability
+ 6. RAG Pipeline Setup
+ ↓
+ Models + Vector Store Created
+ ↓
+ Launches dashboard_app.py
+ ↓
+ ✅ Dashboard Ready
+ ```
+
+ ### Build Process Flow (Old - BROKEN):
+ ```
+ HF Starts Build
+ ↓
+ Runs dashboard_app_enhanced.py directly
+ ↓
+ Tries to load models
+ ↓ (Not found)
+ ❌ Models: Not Loaded
+ ❌ AI Q&A: Inactive
+ ❌ Could not load data
+ ```
+
+ ## 📊 **File Changes Summary**
+
+ | File | Change Type | Purpose |
+ |------|-------------|---------|
+ | `README.md` | Modified | Updated app_file path |
+ | `fmcg_genai/app.py` | Created | New entry point with model training |
+ | `fmcg_genai/src/dashboard_app.py` | Modified | Fixed RAG import paths |
+ | `fmcg_genai/build.py` | Created | Local build helper |
+ | `fmcg_genai/start_dashboard.py` | Created | Local startup validator |
+ | `DEPLOYMENT.md` | Created | Deployment guide |
+
+ ## ⚠️ **Important Notes**
+
+ 1. **First build will be slow**: Training models takes time (~10-15 min)
+ 2. **Subsequent builds are fast**: Training is skipped when the model files are still on disk
+ 3. **Memory requirements**: Ensure HF Space has enough RAM (recommend 16GB tier)
+ 4. **Alternative approach**: Use Git LFS to commit pre-trained models (see DEPLOYMENT.md)
+
+ ## 🎯 **Success Criteria**
+
+ The deployment is successful when:
+ - [ ] HF build completes without errors
+ - [ ] Dashboard loads without "Could not load data" error
+ - [ ] System Status shows "Models: ✅ Loaded"
+ - [ ] System Status shows "AI Q&A: ✅ Active"
+ - [ ] Forecasting tab works and shows predictions
+ - [ ] AI Q&A Portal responds to queries
+ - [ ] All visualizations render correctly
README.md CHANGED
@@ -5,7 +5,7 @@ colorFrom: blue
  colorTo: purple
  sdk: streamlit
  sdk_version: "1.25.0"
- app_file: fmcg_genai/src/dashboard_app_enhanced.py
+ app_file: fmcg_genai/app.py
  pinned: false
  license: mit
  python_version: "3.10"
fmcg_genai/app.py ADDED
@@ -0,0 +1,67 @@
+ """
+ Hugging Face Spaces Entry Point
+ This script ensures models are trained before launching the dashboard
+ """
+
+ import os
+ import sys
+ from pathlib import Path
+ import subprocess
+ import logging
+
+ # Setup logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ def ensure_models_trained():
+     """Ensure models and vector store are created"""
+     project_root = Path(__file__).parent
+     models_dir = project_root / "models"
+     vector_store_dir = project_root / "vector_store"
+
+     # Check if models exist
+     prophet_exists = (models_dir / "prophet.pkl").exists()
+     xgboost_exists = (models_dir / "xgboost_sales.pkl").exists()
+     vector_store_exists = (vector_store_dir / "faiss_index.bin").exists()
+
+     if prophet_exists and xgboost_exists and vector_store_exists:
+         logger.info("Models and vector store already exist. Skipping training.")
+         return True
+
+     logger.info("Models not found. Running pipeline to train models...")
+     logger.info("This will take several minutes on first deployment...")
+
+     try:
+         # Run the pipeline
+         result = subprocess.run(
+             [sys.executable, "run_pipeline.py"],
+             cwd=project_root,
+             check=True,
+             capture_output=True,
+             text=True
+         )
+         logger.info("Pipeline completed successfully!")
+         logger.info(result.stdout)
+         return True
+     except subprocess.CalledProcessError as e:
+         logger.error(f"Pipeline failed: {e}")
+         logger.error(f"STDOUT: {e.stdout}")
+         logger.error(f"STDERR: {e.stderr}")
+         return False
+     except Exception as e:
+         logger.error(f"Unexpected error: {e}")
+         return False
+
+ if __name__ == "__main__":
+     logger.info("Starting FMCG Analytics Dashboard...")
+
+     # Ensure models are trained
+     if not ensure_models_trained():
+         logger.error("Failed to train models. Dashboard may not work correctly.")
+
+     # Import and run the dashboard
+     logger.info("Launching dashboard...")
+     sys.path.insert(0, str(Path(__file__).parent / "src"))
+
+     from dashboard_app import main
+     main()
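Because `app.py` calls `subprocess.run` with `capture_output=True` and no `timeout`, the pipeline's output only reaches the logs after the process exits, and a hung training run would block startup indefinitely. A minimal sketch of a bounded, live-logging variant follows. The helper name `run_with_timeout` and the choice of timeout are assumptions for illustration; this is not the commit's implementation.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float,
                     cwd: Optional[str] = None) -> bool:
    """Run `cmd`; return True on exit code 0, False on failure or timeout."""
    try:
        # stdout/stderr are inherited rather than captured, so the child's
        # output appears in the log stream while it is still running.
        subprocess.run(list(cmd), cwd=cwd, check=True, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        print(f"Command exceeded {timeout_s}s and was stopped", file=sys.stderr)
        return False
    except subprocess.CalledProcessError as exc:
        print(f"Command failed with exit code {exc.returncode}", file=sys.stderr)
        return False
```

Under these assumptions, the pipeline call could become something like `run_with_timeout([sys.executable, "run_pipeline.py"], timeout_s=1500, cwd=str(project_root))`, with the limit chosen to stay below whatever startup or build limit applies to the Space.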
fmcg_genai/build.py ADDED
@@ -0,0 +1,124 @@
+ """
+ Build script for FMCG Dashboard
+ Ensures models are trained and all components are ready before deployment
+ """
+
+ import os
+ import sys
+ from pathlib import Path
+ import logging
+ import subprocess
+
+ # Setup logging
+ logging.basicConfig(
+     level=logging.INFO,
+     format='%(asctime)s - %(levelname)s - %(message)s'
+ )
+ logger = logging.getLogger(__name__)
+
+ def check_files_exist():
+     """Check if required files exist"""
+     project_root = Path(__file__).parent
+
+     # Check data
+     data_file = project_root / "data" / "raw" / "FMCG_2022_2024.csv"
+     if not data_file.exists():
+         logger.error(f"❌ Raw data file not found: {data_file}")
+         return False
+
+     logger.info(f"✅ Raw data file found: {data_file}")
+     return True
+
+ def check_models_trained():
+     """Check if models are already trained"""
+     project_root = Path(__file__).parent
+     models_dir = project_root / "models"
+
+     required_models = ["prophet.pkl", "xgboost_sales.pkl"]
+     all_exist = all((models_dir / model).exists() for model in required_models)
+
+     if all_exist:
+         logger.info("✅ All models already trained")
+         return True
+     else:
+         logger.warning("⚠️ Models not found or incomplete")
+         return False
+
+ def check_vector_store():
+     """Check if vector store exists"""
+     project_root = Path(__file__).parent
+     vector_store_dir = project_root / "vector_store"
+
+     required_files = ["faiss_index.bin", "documents.pkl", "embeddings.pkl"]
+     all_exist = all((vector_store_dir / file).exists() for file in required_files)
+
+     if all_exist:
+         logger.info("✅ Vector store already exists")
+         return True
+     else:
+         logger.warning("⚠️ Vector store not found or incomplete")
+         return False
+
+ def run_pipeline():
+     """Run the full pipeline to train models and create vector store"""
+     logger.info("=" * 80)
+     logger.info("Running FMCG Pipeline - This may take several minutes...")
+     logger.info("=" * 80)
+
+     try:
+         result = subprocess.run(
+             [sys.executable, "run_pipeline.py"],
+             check=True,
+             capture_output=False,
+             text=True
+         )
+         logger.info("✅ Pipeline completed successfully")
+         return True
+     except subprocess.CalledProcessError as e:
+         logger.error(f"❌ Pipeline failed with error: {e}")
+         return False
+     except Exception as e:
+         logger.error(f"❌ Unexpected error running pipeline: {e}")
+         return False
+
+ def main():
+     """Main build function"""
+     print("\n" + "=" * 80)
+     print("FMCG Analytics Dashboard - Build Script")
+     print("=" * 80 + "\n")
+
+     # Check if data exists
+     if not check_files_exist():
+         print("\n❌ BUILD FAILED: Required data files not found")
+         sys.exit(1)
+
+     # Check if models are trained
+     models_exist = check_models_trained()
+     vector_store_exists = check_vector_store()
+
+     if models_exist and vector_store_exists:
+         print("\n✅ All components already built!")
+         print("\n💡 To rebuild from scratch, delete the 'models' and 'vector_store' directories")
+         print(" Then run this script again.")
+         return
+
+     # Need to run pipeline
+     print("\n🔄 Building models and vector store...")
+     print(" This will take several minutes. Please wait...\n")
+
+     if not run_pipeline():
+         print("\n❌ BUILD FAILED: Pipeline execution failed")
+         print("\n💡 Check the logs above for error details")
+         sys.exit(1)
+
+     print("\n" + "=" * 80)
+     print("✅ BUILD SUCCESSFUL!")
+     print("=" * 80)
+     print("\nYou can now start the dashboard with:")
+     print(" python start_dashboard.py")
+     print(" OR")
+     print(" streamlit run src/dashboard_app.py")
+     print()
+
+ if __name__ == "__main__":
+     main()
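The artifact checks in this commit are not identical across scripts: `app.py` treats the vector store as present when `faiss_index.bin` exists, while `build.py` also requires `documents.pkl` and `embeddings.pkl`. One way to keep them consistent would be a single shared list of required files, sketched below. The file names come from the scripts in this commit; the names `REQUIRED_ARTIFACTS` and `missing_artifacts` are hypothetical.

```python
from pathlib import Path
from typing import List

# Relative paths collected from the checks in app.py and build.py.
REQUIRED_ARTIFACTS = (
    Path("models") / "prophet.pkl",
    Path("models") / "xgboost_sales.pkl",
    Path("vector_store") / "faiss_index.bin",
    Path("vector_store") / "documents.pkl",
    Path("vector_store") / "embeddings.pkl",
)


def missing_artifacts(project_root: Path) -> List[Path]:
    """Return the required relative paths that are not files under project_root."""
    return [rel for rel in REQUIRED_ARTIFACTS if not (project_root / rel).is_file()]
```

With such a helper, both entry points could decide to run the pipeline on `if missing_artifacts(project_root):` and log exactly which files triggered the run.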
fmcg_genai/src/dashboard_app.py CHANGED
@@ -106,7 +106,19 @@ def get_models(config):
  def get_rag_pipeline():
      """Load and setup RAG pipeline with caching"""
      try:
-         from src.rag_pipeline import FMCGRAGPipeline
+         # Try different import paths
+         try:
+             from src.rag_pipeline import FMCGRAGPipeline
+         except ImportError:
+             try:
+                 from rag_pipeline import FMCGRAGPipeline
+             except ImportError:
+                 # Add src to path and try again
+                 import sys
+                 src_path = project_root / "src"
+                 if str(src_path) not in sys.path:
+                     sys.path.insert(0, str(src_path))
+                 from rag_pipeline import FMCGRAGPipeline
  
          config_path = project_root / "config.yaml"
          if not config_path.exists():
@@ -119,8 +131,10 @@ def get_rag_pipeline():
          vector_store_path = project_root / "vector_store" / "faiss_index.bin"
          if not vector_store_path.exists():
              logger.warning(f"Vector store not found at {vector_store_path}")
+             logger.warning("Run 'python run_pipeline.py' to create the vector store")
              return None
  
+         logger.info(f"Loading vector store from {vector_store_path}")
          if rag_pipeline.load_vector_store():
              logger.info("RAG pipeline loaded successfully")
              return rag_pipeline
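The nested `try`/`except ImportError` chain above tries each module path in turn. As an alternative shape for the same idea, the sketch below walks a list of candidate module names with `importlib.import_module` and reports every failure if none succeeds. The helper name `import_first` is hypothetical, and this is an illustration rather than the code in the commit.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Import and return the first module in `candidates` that imports cleanly."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported; " + "; ".join(errors))
```

In the dashboard this might be used as `FMCGRAGPipeline = import_first(["src.rag_pipeline", "rag_pipeline"]).FMCGRAGPipeline`, after `src` has been added to `sys.path` as the diff does.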
fmcg_genai/start_dashboard.py ADDED
@@ -0,0 +1,110 @@
+ """
+ Startup script for FMCG Dashboard
+ Ensures all components are properly initialized before launching
+ """
+
+ import os
+ import sys
+ from pathlib import Path
+ import logging
+
+ # Setup logging
+ logging.basicConfig(
+     level=logging.INFO,
+     format='%(asctime)s - %(levelname)s - %(message)s'
+ )
+ logger = logging.getLogger(__name__)
+
+ def check_environment():
+     """Check if all required files and directories exist"""
+     logger.info("Checking environment...")
+
+     project_root = Path(__file__).parent
+     issues = []
+
+     # Check config
+     config_path = project_root / "config.yaml"
+     if not config_path.exists():
+         issues.append(f"❌ Config file not found: {config_path}")
+     else:
+         logger.info(f"✅ Config file found: {config_path}")
+
+     # Check data
+     data_dir = project_root / "data" / "processed"
+     required_files = ["cleaned.csv", "test_features.csv"]
+     for file in required_files:
+         file_path = data_dir / file
+         if not file_path.exists():
+             issues.append(f"❌ Data file not found: {file_path}")
+         else:
+             logger.info(f"✅ Data file found: {file_path}")
+
+     # Check models
+     models_dir = project_root / "models"
+     model_files = ["prophet.pkl", "xgboost_sales.pkl"]
+     for file in model_files:
+         file_path = models_dir / file
+         if not file_path.exists():
+             issues.append(f"❌ Model file not found: {file_path}")
+         else:
+             logger.info(f"✅ Model file found: {file_path}")
+
+     # Check vector store
+     vector_store_dir = project_root / "vector_store"
+     vector_files = ["faiss_index.bin", "documents.pkl", "embeddings.pkl"]
+     for file in vector_files:
+         file_path = vector_store_dir / file
+         if not file_path.exists():
+             issues.append(f"❌ Vector store file not found: {file_path}")
+         else:
+             logger.info(f"✅ Vector store file found: {file_path}")
+
+     # Check dashboard app
+     dashboard_path = project_root / "src" / "dashboard_app.py"
+     if not dashboard_path.exists():
+         issues.append(f"❌ Dashboard app not found: {dashboard_path}")
+     else:
+         logger.info(f"✅ Dashboard app found: {dashboard_path}")
+
+     return issues
+
+ def main():
+     """Main startup function"""
+     print("=" * 80)
+     print("FMCG Analytics Dashboard - Startup Check")
+     print("=" * 80)
+
+     # Check environment
+     issues = check_environment()
+
+     if issues:
+         print("\n❌ STARTUP FAILED - Issues detected:\n")
+         for issue in issues:
+             print(f" {issue}")
+         print("\n💡 Solution:")
+         print(" Run the pipeline first to generate all required files:")
+         print(" python run_pipeline.py")
+         sys.exit(1)
+
+     print("\n✅ All checks passed! Starting dashboard...\n")
+     print("=" * 80)
+
+     # Launch dashboard
+     import subprocess
+     dashboard_path = Path(__file__).parent / "src" / "dashboard_app.py"
+
+     try:
+         subprocess.run([
+             sys.executable, "-m", "streamlit", "run",
+             str(dashboard_path),
+             "--server.port=8501",
+             "--server.address=localhost"
+         ], check=True)
+     except KeyboardInterrupt:
+         print("\n\n👋 Dashboard stopped by user")
+     except Exception as e:
+         print(f"\n❌ Error starting dashboard: {e}")
+         sys.exit(1)
+
+ if __name__ == "__main__":
+     main()