Resources Checklist
Central to-do hub for all data, models, and content needed before the portfolio app is ready for public deployment. Check off items as they are completed.
Configuration & Infrastructure
- GEE Service Account — create a service account in Google Cloud Console, grant it Earth Engine access, download the JSON key, and add to:
  - Local: .streamlit/secrets.toml (not committed to git)
  - Deployed: Streamlit Cloud secrets dashboard
  - Format:
    [gee]
    service_account_email = "my-account@my-project.iam.gserviceaccount.com"
    private_key = "-----BEGIN RSA PRIVATE KEY-----\n..."
- GitHub repo URL — update the st.link_button on Page 4 with the real URL
- streamlit-lottie installed: pip install "streamlit-lottie>=0.0.5" (quote the spec so the shell does not treat >= as a redirect)
- Add streamlit-folium and streamlit-lottie to pyproject.toml
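For the GEE service account item, a quick sanity check on the secrets can catch a missing or malformed entry before the Page 3 map is tested. The sketch below is only an illustration: the function name check_gee_secrets is hypothetical, and it assumes the secrets object behaves like a nested dict with the [gee] keys shown in the format above (in the app, that would presumably be st.secrets).

```python
from typing import Any, List

REQUIRED_KEYS = ("service_account_email", "private_key")


def check_gee_secrets(secrets: Any) -> List[str]:
    """Return a list of problems; an empty list means the [gee] section looks usable.

    `secrets` is assumed to be dict-like (for example st.secrets or a parsed TOML dict).
    """
    gee = secrets.get("gee") if hasattr(secrets, "get") else None
    if gee is None or not hasattr(gee, "get"):
        return ["missing [gee] section"]

    problems = []
    for key in REQUIRED_KEYS:
        if not str(gee.get(key, "")).strip():
            problems.append(f"missing or empty gee.{key}")

    email = str(gee.get("service_account_email", ""))
    if email and not email.endswith(".iam.gserviceaccount.com"):
        problems.append("service_account_email does not look like a service account address")

    private_key = str(gee.get("private_key", ""))
    if private_key and not private_key.lstrip().startswith("-----BEGIN"):
        problems.append("private_key does not start with a PEM header")

    return problems
```

Calling check_gee_secrets(st.secrets) at startup and showing any returned messages with st.warning would be one way to surface a misconfigured deployment early.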
Hero Animation (Page 1)
- Find a bear/forest Lottie animation at https://lottiefiles.com
  - Search: "bear", "forest", "tree", "nature", "fire prevention"
  - Download as JSON → save to resources/smokey_animation.json
  - Then uncomment the streamlit-lottie block in app/streamlit_app.py (Page 1)
  - Import: from streamlit_lottie import st_lottie
- Alt: Download a high-quality animated GIF → resources/smokey.gif and embed via st.image
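The two options above (Lottie JSON first, GIF as the alternative) could be combined into one helper. This is a minimal sketch, not the commented-out block in app/streamlit_app.py: load_lottie and render_hero are hypothetical names, and the height and key values are arbitrary choices.

```python
import json
from pathlib import Path
from typing import Optional

LOTTIE_PATH = Path("resources/smokey_animation.json")  # path from the checklist
GIF_PATH = Path("resources/smokey.gif")                # fallback path from the checklist


def load_lottie(path: Path) -> Optional[dict]:
    """Return the parsed Lottie JSON, or None if the file is missing or not valid JSON."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    return data if isinstance(data, dict) else None


def render_hero() -> None:
    """Hypothetical Page 1 helper: show the Lottie animation, else the GIF, else nothing."""
    import streamlit as st  # imported lazily so load_lottie can be used without Streamlit

    animation = load_lottie(LOTTIE_PATH)
    if animation is not None:
        from streamlit_lottie import st_lottie

        st_lottie(animation, height=300, key="hero")
    elif GIF_PATH.exists():
        st.image(str(GIF_PATH))
```

Keeping the file read separate from the rendering call means a missing or corrupt download degrades to the GIF instead of raising on page load.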
Content To-Dos (author before public deploy)
Page 1 — "Why This Matters"
- Replace placeholder text with real framing:
  - International forest restoration commitments (e.g., scale of loss globally)
  - Why satellite-based detection is better than ground surveys
  - What makes this project novel or useful
Page 3 — "Why I Picked the Winner"
- Write real reasoning in the st.expander on Page 3:
  - Why PR-AUC over accuracy for 4.8% class imbalance
  - Brier score as a calibration check (not just ranking)
  - Recall vs. precision trade-off for environmental monitoring (false negatives = missed deforestation events = real-world consequence)
  - What the Optuna tuning added vs. the baseline XGBoost
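A small worked example may help when writing the first three points. It uses synthetic labels at the 4.8% positive rate mentioned above, not the project's data, and plain-Python metric helpers in place of any library implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Synthetic labels: 48 positives in 1,000 rows, i.e. a 4.8% positive rate.
y_true = [1] * 48 + [0] * 952

# A model that never flags deforestation still looks accurate...
always_negative = [0] * len(y_true)
acc = accuracy(y_true, always_negative)  # 952 / 1000
rec = recall(y_true, always_negative)    # ...but finds none of the 48 events

# Predicting the base rate everywhere gives a reference Brier score
# that a model with real signal should beat.
base_rate = sum(y_true) / len(y_true)
brier_reference = brier_score(y_true, [base_rate] * len(y_true))

print(f"accuracy={acc:.3f} recall={rec:.3f} brier_reference={brier_reference:.4f}")
```

The always-negative model scores 95.2% accuracy with zero recall, which is the core argument for a precision-recall metric here. For comparison, a no-skill ranker's PR-AUC sits near the positive rate (about 0.048), so gains above that level are easy to read.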
Page 4 — Architecture Node Descriptions
- Fill in the Sankey node descriptions in the nodes dict (app/streamlit_app.py, ~line 700):
  - Google Earth Engine
  - Parquet Export
  - Feature Engineering
  - Optuna Tuning
  - XGBoost Model
  - FastAPI
  - Streamlit Dashboard
Page 4 — Key Decisions & Lessons
- Write real content in the 4 st.expander blocks:
  - "Why cosine drift as the core signal"
  - "Handling severe class imbalance (4.8%)"
  - "Why geo-stratified train/test split"
  - "Optuna over grid search"
AOI Fun Facts Research
Each AOI rectangle on the map (Pages 2 + 3) shows a fun fact on click. Replace the PLACEHOLDER entries in AOI_FUN_FACTS in app/streamlit_app.py.
- Indonesia-Malaysia — highest drift in dataset; research: peat fires, palm oil, logging in Borneo/Sumatra. Add source URL.
  Suggested sources: Global Forest Watch, Mongabay, Nature journal
- Amazon Basin — add a compelling fact + source
- Congo Basin DRC — add a compelling fact + source
- Guinea — add a compelling fact + source
- Canada — add a compelling fact + source
- Mekong Region — add a compelling fact + source
- Cerrado Brazil — add a compelling fact + source
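To track which of the regions above still need research, a small helper could list the entries that are unfinished. The real shape of AOI_FUN_FACTS is not shown in this checklist, so the sketch assumes a dict mapping region name to {"fact": ..., "source": ...}; both that layout and the name unfinished_aois are hypothetical.

```python
from typing import Dict, List


def unfinished_aois(facts: Dict[str, dict], marker: str = "PLACEHOLDER") -> List[str]:
    """Return region names whose fact is empty or a placeholder, or whose source is not a URL."""
    pending = []
    for name, entry in facts.items():
        fact = str(entry.get("fact", ""))
        source = str(entry.get("source", ""))
        if not fact or marker in fact or not source.startswith("http"):
            pending.append(name)
    return pending


# Illustrative data only; the fact text and URL are stand-ins, not researched content.
example = {
    "Amazon Basin": {"fact": "PLACEHOLDER", "source": ""},
    "Canada": {"fact": "(researched fact goes here)", "source": "https://example.org/source"},
}
print(unfinished_aois(example))
```

Running this against the real dict before deploy would flag any region that still has no fact or no source URL.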
Pre-Deploy Checklist
- All content to-dos above are filled in
- All pre-computed artifact boxes (listed under Pre-computed Data Artifacts below) are checked
- GEE service account is set up and tested (click a point on Page 3 map)
- GitHub link on Page 4 is updated
- Hero animation or image is in place
- App runs locally with streamlit run app/streamlit_app.py without errors
- All 4 pages load in < 2 seconds (no 800k-row parquet loads at runtime)
✅ Done
Pre-computed Data Artifacts
Run python resources/precompute.py to regenerate all artifacts.
- resources/kpi_summary.json — KPI cards on Page 1
- resources/mean_embedding_profile.parquet — embedding profile chart on Page 2
- resources/drift_by_area_year.parquet — drift analysis chart + violin on Page 2
- resources/target_distribution.json — class distribution charts on Page 2
- resources/timelapse_sample.parquet — timelapse animation + map dots on Page 2
- resources/canada_best_trial.json + amazon_basin_best_trial.json — model metrics on Page 3
- resources/canada_feature_importance.json + amazon_basin_feature_importance.json — Page 3 side-by-side
- resources/canada_confusion_matrix.json + amazon_basin_confusion_matrix.json — Page 3 side-by-side
- resources/canada_data_sample.parquet + amazon_basin_data_sample.parquet — Page 2 area toggle
- resources/comparison_kpis.json — "Tale of two forests" on Page 1
- resources/aoi_stats.json — sidebar Other Regions stats table
- resources/evaluations_clean.parquet — superseded by per-area artifacts; no longer loaded by app
- resources/feature_importance.json — superseded by per-area feature importance; no longer loaded
- resources/confusion_matrix.json — superseded by per-area confusion matrices; no longer loaded
- resources/pca_sample.parquet — PCA scatter removed from app; no longer loaded
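A short script can confirm that every artifact still loaded by the app is present after running the precompute step. This is a sketch: missing_artifacts is a hypothetical helper, and it assumes the amazon_basin_* files sit in resources/ next to their canada_* counterparts, which the list above implies but does not state.

```python
from pathlib import Path
from typing import Iterable, List

# Artifacts the list above describes as still in use (superseded files are left out).
ACTIVE_ARTIFACTS = [
    "kpi_summary.json",
    "mean_embedding_profile.parquet",
    "drift_by_area_year.parquet",
    "target_distribution.json",
    "timelapse_sample.parquet",
    "canada_best_trial.json",
    "amazon_basin_best_trial.json",
    "canada_feature_importance.json",
    "amazon_basin_feature_importance.json",
    "canada_confusion_matrix.json",
    "amazon_basin_confusion_matrix.json",
    "canada_data_sample.parquet",
    "amazon_basin_data_sample.parquet",
    "comparison_kpis.json",
    "aoi_stats.json",
]


def missing_artifacts(resources_dir, names: Iterable[str] = ACTIVE_ARTIFACTS) -> List[str]:
    """Return the artifact file names that do not exist under resources_dir."""
    root = Path(resources_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts("resources")
    print("all artifacts present" if not missing else f"missing: {missing}")
```

The check only tests for file presence; it does not verify that the contents are current.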
Model Files
- models/xgboost_canada_optuna.pkl and models/xgboost_amazon_basin_optuna.pkl — final tuned winners — copied to resources/ for HF Space
- models/rf_tuned.pkl, models/sgd_optuna.pkl, models/baseline_logistic_regression_*.pkl — were used in old 4-model comparison table; that page has been replaced by the Canada vs Amazon Basin framing; no longer needed by the app
streamlit-folium installed
- streamlit-folium>=0.25.0 installed and in use
Nav Bar Icons (optional polish)
- SVG icons confirmed working — implementation commented out in app/streamlit_app.py. To re-enable:
  - Source per-page SVG icons (e.g., Lordicon, Heroicons, Phosphor) and save to resources/
  - Add import base64 to top-level imports in app/streamlit_app.py
  - Uncomment _load_nav_icon() and the button loop in the comment block above _render_nav()
  - Optionally remap SVG stroke colors to theme values (#40916c, #1b4332) via string replace before base64 encoding
  - Current placeholder: resources/wired-outline-955-demand-hover-click.svg (Lordicon "demand" icon, tested ✓)
- Lottie animations in nav — not practical inside nav buttons (renders as full-height component block)
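The stroke remap and base64 step described above could look roughly like the sketch below. It is not the commented-out _load_nav_icon() implementation, whose body is not shown here: svg_to_data_uri is a hypothetical helper, and the source color #121331 is only an assumed example of what an icon file might contain.

```python
import base64
from typing import Dict


def svg_to_data_uri(svg_text: str, color_map: Dict[str, str]) -> str:
    """Replace colors via plain string substitution, then base64-encode as a data URI."""
    for old, new in color_map.items():
        svg_text = svg_text.replace(old, new)
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return f"data:image/svg+xml;base64,{encoded}"


# Illustrative SVG; "#121331" stands in for whatever stroke color the icon ships with.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">'
    '<path stroke="#121331" fill="none" d="M4 12h16"/></svg>'
)
icon_uri = svg_to_data_uri(sample_svg, {"#121331": "#40916c"})
```

The resulting string can be used as the src of an img tag in custom HTML. String replacement is case-sensitive, so an icon that writes its colors in uppercase hex would need a matching key in color_map.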
Theme Selection
- Theme selected and applied: Canopy Light + Manrope — THEME dict set in app/streamlit_app.py