Add comprehensive README and model documentation
README.md
CHANGED
|
@@ -1,69 +1,197 @@
|
|
| 1 |
-
|
| 2 |
-
title: Snow Predictor Basel
|
| 3 |
-
emoji: 🌨️
|
| 4 |
-
colorFrom: blue
|
| 5 |
-
colorTo: white
|
| 6 |
-
sdk: gradio
|
| 7 |
-
sdk_version: 3.50.2
|
| 8 |
-
app_file: app.py
|
| 9 |
-
pinned: false
|
| 10 |
-
---
|
| 11 |
|
| 12 |
-
|
| 13 |
|
| 14 |
-
|
| 15 |
|
| 16 |
-
|
| 17 |
|
| 18 |
-
|
| 19 |
-
- **Recall:** 84.0% (catches most snow events)
|
| 20 |
-
- **Precision:** 16.4% (prioritizes safety over false alarms)
|
| 21 |
-
- **ROC AUC:** 89.4%
|
| 22 |
|
| 23 |
-
|
| 24 |
|
| 25 |
-
-
|
| 26 |
-
-
|
| 27 |
-
-
|
| 28 |
-
-
|
| 29 |
|
| 30 |
-
##
|
| 31 |
|
| 32 |
-
- **
|
| 33 |
-
- **
|
| 34 |
-
- **
|
| 35 |
-
- **
|
| 36 |
|
| 37 |
-
##
|
| 38 |
|
| 39 |
```python
|
| 40 |
import joblib
|
| 41 |
|
| 42 |
-
# Load the model
|
| 43 |
model_data = joblib.load('snow_predictor.joblib')
|
| 44 |
model = model_data['model']
|
| 45 |
scaler = model_data['scaler']
|
| 46 |
feature_names = model_data['feature_names']
|
| 47 |
|
| 48 |
-
#
|
| 49 |
-
|
| 50 |
```
|
| 51 |
|
| 52 |
-
##
|
| 53 |
|
| 54 |
-
|
| 55 |
-
- **
|
| 56 |
-
- **
|
| 57 |
-
- **
|
| 58 |
|
| 59 |
-
|
| 60 |
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
- **
|
| 64 |
-
- **
|
| 65 |
-
- **
|
| 66 |
|
| 67 |
## 📝 License
|
| 68 |
|
| 69 |
-
|
| 1 |
+
# 🌨️ Snow Predictor Basel - My First ML Model! 🚀
|
| 2 |
|
| 3 |
+
Welcome to my first machine learning project! This repository contains a **7-day ahead snow prediction model** for Basel, Switzerland that I built from scratch during my Python learning journey.
|
| 4 |
|
| 5 |
+
## 🎯 What This Model Does
|
| 6 |
|
| 7 |
+
**Predicts snow in Basel 7 days in advance** using weather data patterns. Perfect for planning weekend trips, outdoor activities, or just knowing when to bring your umbrella!
|
| 8 |
|
| 9 |
+
## 🏆 Model Performance
|
| 10 |
|
| 11 |
+
After training on **25 years of Basel weather data**, here's how well it performs:
|
| 12 |
|
| 13 |
+
- **🎯 Accuracy:** 77.4% - Overall prediction accuracy
|
| 14 |
+
- **❄️ Recall:** 84.0% - Catches most snow events (prioritizes safety!)
|
| 15 |
+
- **⚠️ Precision:** 16.4% - Roughly five of every six snow alerts are false alarms, accepted as the cost of rarely missing snow
|
| 16 |
+
- **ROC AUC:** 89.4% - Excellent model discrimination
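
For readers new to these metrics, here is a minimal sketch of how accuracy, recall, and precision follow from confusion-matrix counts. The counts are made up for illustration and are not this model's results; `classification_metrics` is a hypothetical helper, not part of the repository.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute basic metrics from confusion-matrix counts.

    tp: snow predicted and snow happened
    fp: snow predicted but no snow (false alarm)
    fn: no snow predicted but snow happened (missed event)
    tn: no snow predicted and no snow happened
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all days predicted correctly
        "recall": tp / (tp + fn),        # share of real snow days that were caught
        "precision": tp / (tp + fp),     # share of snow alerts that were right
    }


# Illustrative counts only (assumption): 100 days, 10 of them with snow.
metrics = classification_metrics(tp=8, fp=30, fn=2, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With rare events like snow, a model can catch most snow days (high recall) while still raising many false alarms (low precision), which is the pattern in the numbers above.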
|
| 17 |
|
| 18 |
+
## Key Features
|
| 19 |
|
| 20 |
+
- **⏰ 7-day ahead prediction** - Plan your week with confidence
|
| 21 |
+
- **🌡️ 22 weather features** - Temperature trends, precipitation patterns, seasonal indicators
|
| 22 |
+
- **🛡️ High recall design** - Built to catch snow events rather than avoid false alarms
|
| 23 |
+
- **25 years of data** - Trained on comprehensive Basel weather history (2000-2025)
|
| 24 |
|
| 25 |
+
## 🏗️ How I Built This
|
| 26 |
|
| 27 |
+
### **Data Collection & Processing**
|
| 28 |
+
- **Source:** Meteostat API for real Basel weather data
|
| 29 |
+
- **Location:** Basel, Switzerland (47.5584° N, 7.5733° E)
|
| 30 |
+
- **Processing:** Handled missing values, temperature inconsistencies, and date gaps
|
| 31 |
+
- **Features:** Engineered rolling weather patterns, seasonal indicators, and volatility measures
|
| 32 |
+
|
| 33 |
+
### **Model Architecture**
|
| 34 |
+
- **Algorithm:** Logistic Regression (chosen for interpretability and reliability)
|
| 35 |
+
- **Training:** 80% of data for training, 20% for testing
|
| 36 |
+
- **Class Balancing:** Used balanced class weights to handle snow/no-snow imbalance
|
| 37 |
+
- **Feature Scaling:** Standardized all features for optimal performance
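
The four choices above could be wired together roughly as follows. This is a sketch on synthetic data from `make_classification`, not the actual training script: the random, stratified split, the `random_state` values, and `max_iter=1000` are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the weather table: 22 features, about 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=22, n_informative=8,
    weights=[0.95, 0.05], random_state=0,
)

# 80/20 split; stratify keeps the rare class present in both parts (assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training data only, then reuse it for the test data.
scaler = StandardScaler().fit(X_train)

# Balanced class weights make errors on the rare class count more in training.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(scaler.transform(X_train), y_train)

X_test_scaled = scaler.transform(X_test)
predictions = model.predict(X_test_scaled)
probabilities = model.predict_proba(X_test_scaled)[:, 1]

print(f"Recall:    {recall_score(y_test, predictions):.1%}")
print(f"Precision: {precision_score(y_test, predictions):.1%}")
print(f"ROC AUC:   {roc_auc_score(y_test, probabilities):.1%}")
```

For real time-series weather data, a chronological split (train on earlier years, test on later ones) may be a better fit than a random one, because the rolling features overlap between neighbouring days.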
|
| 38 |
+
|
| 39 |
+
### **Feature Engineering**
|
| 40 |
+
The model uses sophisticated weather patterns:
|
| 41 |
+
- **Temperature trends** over 7-day windows
|
| 42 |
+
- **Precipitation accumulation** patterns
|
| 43 |
+
- **Atmospheric pressure** changes
|
| 44 |
+
- **Seasonal indicators** and day-of-year patterns
|
| 45 |
+
- **Weather volatility** measures
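
As a rough illustration of the rolling patterns listed above, the sketch below derives a few 7-day features with pandas on a tiny hand-made table. The exact definitions used in this project are not shown in this README, so the formulas here (for example, a trend as the change across the 7-day window) are assumptions.

```python
import pandas as pd

# Tiny hand-made daily table (illustrative values, not real Basel data).
days = pd.date_range("2024-01-01", periods=10, freq="D")
weather = pd.DataFrame(
    {
        "tavg": [5.0, 4.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -4.0],
        "prcp": [1.0] * 10,
        "pres": [1015.0 - i for i in range(10)],
    },
    index=days,
)
weather["tmin"] = weather["tavg"] - 2.0

features = pd.DataFrame(index=weather.index)
# Trend: change between the first and last day of a 7-day window (assumed definition).
features["temp_trend_7d"] = weather["tavg"].diff(6)
features["pressure_trend_7d"] = weather["pres"].diff(6)
# Spread and accumulation over the same 7-day window.
features["temp_std_7d"] = weather["tavg"].rolling(7).std()
features["precip_sum_7d"] = weather["prcp"].rolling(7).sum()
# Number of days in the window with a minimum temperature below freezing.
features["cold_days_7d"] = (weather["tmin"] < 0).astype(int).rolling(7).sum()

# The first six rows have no full window yet, so they contain NaN.
print(features.dropna())
```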
|
| 46 |
+
|
| 47 |
+
## 🔧 How to Use This Model
|
| 48 |
+
|
| 49 |
+
### **Quick Start**
|
| 50 |
```python
|
| 51 |
import joblib
|
| 52 |
+
import numpy as np
|
| 53 |
|
| 54 |
+
# Load the trained model
|
| 55 |
model_data = joblib.load('snow_predictor.joblib')
|
| 56 |
model = model_data['model']
|
| 57 |
scaler = model_data['scaler']
|
| 58 |
feature_names = model_data['feature_names']
|
| 59 |
|
| 60 |
+
# Prepare your weather data: 22 values in the same order as feature_names
|
| 61 |
+
weather_features = np.array([your_weather_data_here])
|
| 62 |
+
|
| 63 |
+
# Scale the features
|
| 64 |
+
weather_features_scaled = scaler.transform(weather_features.reshape(1, -1))
|
| 65 |
+
|
| 66 |
+
# Make prediction
|
| 67 |
+
snow_probability = model.predict_proba(weather_features_scaled)[0][1]
|
| 68 |
+
will_snow = model.predict(weather_features_scaled)[0]
|
| 69 |
+
|
| 70 |
+
print(f"❄️ Snow probability: {snow_probability:.1%}")
|
| 71 |
+
print(f"🌨️ Will it snow? {'Yes' if will_snow else 'No'}")
|
| 72 |
+
```
|
| 73 |
+
|
| 74 |
+
### **Required Features (in order)**
|
| 75 |
+
Your weather data must include these 22 features:
|
| 76 |
+
1. `tavg` - Average temperature
|
| 77 |
+
2. `tmin` - Minimum temperature
|
| 78 |
+
3. `tmax` - Maximum temperature
|
| 79 |
+
4. `prcp` - Precipitation
|
| 80 |
+
5. `wspd` - Wind speed
|
| 81 |
+
6. `wpgt` - Wind gust
|
| 82 |
+
7. `pres` - Pressure
|
| 83 |
+
8. `temp_range` - Temperature range
|
| 84 |
+
9. `temp_below_freezing` - Below freezing indicator
|
| 85 |
+
10. `high_precipitation` - High precipitation indicator
|
| 86 |
+
11. `windy_day` - Windy day indicator
|
| 87 |
+
12. `month` - Month of year
|
| 88 |
+
13. `day_of_year` - Day of year
|
| 89 |
+
14. `is_winter_season` - Winter season indicator
|
| 90 |
+
15. `temp_trend_7d` - 7-day temperature trend
|
| 91 |
+
16. `temp_std_7d` - 7-day temperature standard deviation
|
| 92 |
+
17. `precip_sum_7d` - 7-day precipitation sum
|
| 93 |
+
18. `pressure_trend_7d` - 7-day pressure trend
|
| 94 |
+
19. `cold_days_7d` - 7-day cold days count
|
| 95 |
+
20. `temp_volatility` - Temperature volatility
|
| 96 |
+
21. `pressure_change` - Pressure change rate
|
| 97 |
+
22. `temp_drop_rate` - Temperature drop rate
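
Because the order matters, one way to avoid mixing up columns is to keep observations in a dictionary and build the row from the saved `feature_names` list. The sketch below uses a shortened three-name list as a stand-in, and `to_feature_row` is a hypothetical helper, not part of the repository.

```python
def to_feature_row(observation, feature_names):
    """Return the observation's values as a list ordered like feature_names."""
    missing = [name for name in feature_names if name not in observation]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return [float(observation[name]) for name in feature_names]


# Shortened stand-in; in practice use model_data['feature_names'] (22 names).
feature_names = ["tavg", "tmin", "tmax"]

# Keys can be given in any order; the helper puts them in model order.
observation = {"tmax": 4.0, "tavg": 1.5, "tmin": -2.0}

row = to_feature_row(observation, feature_names)
print(row)
```

The resulting list can then be wrapped with `np.array(row).reshape(1, -1)` before calling `scaler.transform`, as in the Quick Start.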
|
| 98 |
+
|
| 99 |
+
## 🌍 Real-World Applications
|
| 100 |
+
|
| 101 |
+
**Perfect for:**
|
| 102 |
+
- **🏠 Personal planning** - Weekend trips, outdoor activities, daily commutes
|
| 103 |
+
- **🏢 Business operations** - Logistics, event planning, supply chain management
|
| 104 |
+
- **🌤️ Weather enthusiasts** - Understanding Basel's weather patterns
|
| 105 |
+
- **📚 Students & researchers** - Learning about weather prediction and ML
|
| 106 |
+
|
| 107 |
+
## 🎓 My Learning Journey
|
| 108 |
+
|
| 109 |
+
This project represents my transition from **Python beginner to machine learning practitioner**. I started with basic Python concepts and gradually built up to:
|
| 110 |
+
|
| 111 |
+
- **Data collection and API integration**
|
| 112 |
+
- **Data cleaning and feature engineering**
|
| 113 |
+
- **Machine learning model development**
|
| 114 |
+
- **Model evaluation and performance analysis**
|
| 115 |
+
- **Deployment and sharing**
|
| 116 |
+
|
| 117 |
+
## Technical Details
|
| 118 |
+
|
| 119 |
+
### **Dependencies**
|
| 120 |
+
- Python 3.8+
|
| 121 |
+
- scikit-learn
|
| 122 |
+
- pandas
|
| 123 |
+
- numpy
|
| 124 |
+
- meteostat (for weather data)
|
| 125 |
+
|
| 126 |
+
### **Installation**
|
| 127 |
+
```bash
|
| 128 |
+
# Clone the repository
|
| 129 |
+
git clone https://github.com/Tuminha/snow-predictor-basel.git
|
| 130 |
+
cd snow-predictor-basel
|
| 131 |
+
|
| 132 |
+
# Install dependencies
|
| 133 |
+
pip install -r requirements.txt
|
| 134 |
+
|
| 135 |
+
# Load and use the model
|
| 136 |
+
python -c "import joblib; model_data = joblib.load('snow_predictor.joblib'); print('Model loaded successfully!')"
|
| 137 |
```
|
| 138 |
|
| 139 |
+
## 📊 Training Data Insights
|
| 140 |
+
|
| 141 |
+
- **Total data points:** 9,278 days of weather data
|
| 142 |
+
- **Date range:** January 2000 to August 2025
|
| 143 |
+
- **Data quality:** Cleaned and validated for temperature consistency
|
| 144 |
+
- **Missing data:** Only 106 days (1.2%) - handled with forward-fill
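
A minimal sketch of what forward-filling can look like with pandas, assuming a daily table indexed by date. The values are invented for the example, and the actual cleaning code in this project may differ.

```python
import pandas as pd

# Illustrative table with one missing date (2024-01-03) and one missing value.
raw = pd.DataFrame(
    {"tavg": [1.0, 2.0, None, 4.0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"]),
)

# Reindex to a complete daily calendar so date gaps become explicit NaN rows.
full_index = pd.date_range(raw.index.min(), raw.index.max(), freq="D")
complete = raw.reindex(full_index)

# Forward-fill: each gap takes the last known value before it.
filled = complete.ffill()
print(filled)
```

Forward-fill is a simple choice that works best for short gaps; longer gaps carry an old value forward for many days, which can flatten the rolling features.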
|
| 145 |
+
|
| 146 |
+
## 🎯 Why This Model Works
|
| 147 |
|
| 148 |
+
**The high recall (84%) means:**
|
| 149 |
+
- **You'll rarely be caught unprepared** for snow
|
| 150 |
+
- **Some false alarms** (better safe than sorry!)
|
| 151 |
+
- **Perfect for planning** when snow is a possibility
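
The trade-off between catching snow and raising false alarms can be seen by moving the decision threshold applied to predicted probabilities. The sketch below uses invented probabilities and labels. It illustrates the general idea only: the model here relies on balanced class weights, and this README does not say that a custom threshold was tuned.

```python
def recall_and_precision(probabilities, labels, threshold):
    """Score predictions made by flagging snow when probability >= threshold."""
    predicted = [p >= threshold for p in probabilities]
    tp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 1)
    fp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 0)
    fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Invented example: 8 days, 3 of them with snow (label 1).
probabilities = [0.9, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

for threshold in (0.5, 0.3):
    recall, precision = recall_and_precision(probabilities, labels, threshold)
    print(f"threshold={threshold}: recall={recall:.0%}, precision={precision:.0%}")
```

Lowering the threshold from 0.5 to 0.3 catches the third snow day but adds another false alarm, so recall rises while precision falls.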
|
| 152 |
|
| 153 |
+
**The 77.4% accuracy means:**
|
| 154 |
+
- **About three out of four predictions are correct**
|
| 155 |
+
- **Reliable for 7-day planning**
|
| 156 |
+
- **Excellent for a first ML model!**
|
| 157 |
|
| 158 |
+
## Acknowledgements
|
| 159 |
+
|
| 160 |
+
- **Meteostat API** for providing comprehensive weather data
|
| 161 |
+
- **scikit-learn** for the machine learning framework
|
| 162 |
+
- **The Python community** for excellent documentation and tutorials
|
| 163 |
+
- **My learning journey** that made this project possible
|
| 164 |
|
| 165 |
## 📝 License
|
| 166 |
|
| 167 |
+
This project is open source and available under the [MIT License](LICENSE).
|
| 168 |
+
|
| 169 |
+
## Let's Connect!
|
| 170 |
+
|
| 171 |
+
**This is my first machine learning model, and I'm excited to share it with the world!**
|
| 172 |
+
|
| 173 |
+
### **Contact Information**
|
| 174 |
+
- **Name:** Francisco Teixeira Barbosa
|
| 175 |
+
- **Email:** cisco@periospot.com
|
| 176 |
+
- **Personal Portfolio:** [https://franciscodds.framer.ai/](https://franciscodds.framer.ai/)
|
| 177 |
+
- **GitHub:** [https://github.com/Tuminha](https://github.com/Tuminha)
|
| 178 |
+
- **Twitter/X:** [@Cisco_research](https://x.com/Cisco_research)
|
| 179 |
+
|
| 180 |
+
### **Questions & Feedback**
|
| 181 |
+
- **Found a bug?** Open an issue!
|
| 182 |
+
- **Want to improve the model?** Submit a pull request!
|
| 183 |
+
- **Just want to chat?** Reach out on Twitter or GitHub!
|
| 184 |
+
|
| 185 |
+
## What's Next?
|
| 186 |
+
|
| 187 |
+
This is just the beginning! Future improvements could include:
|
| 188 |
+
- **Web application** for easy snow checking
|
| 189 |
+
- **Mobile app** for on-the-go predictions
|
| 190 |
+
- **More weather locations** across Switzerland
|
| 191 |
+
- **Advanced ML algorithms** (Random Forest, XGBoost, Neural Networks)
|
| 192 |
+
|
| 193 |
+
---
|
| 194 |
+
|
| 195 |
+
**Happy snow predicting! ❄️**
|
| 196 |
+
|
| 197 |
+
*Built with ❤️ during my Python learning journey*
|