Resources Checklist

Central to-do hub for all data, models, and content needed before the portfolio app is ready for public deployment. Check off items as they are completed.


Configuration & Infrastructure

  • GEE Service Account — create a service account in Google Cloud Console, grant it Earth Engine access, download the JSON key, and add to:
    • Local: .streamlit/secrets.toml (not committed to git)
    • Deployed: Streamlit Cloud secrets dashboard
    • Format:
      [gee]
      service_account_email = "my-account@my-project.iam.gserviceaccount.com"
      private_key = "-----BEGIN PRIVATE KEY-----\n..."
      
  • GitHub repo URL — update the st.link_button on Page 4 with the real URL
  • streamlit-lottie installed: pip install "streamlit-lottie>=0.0.5"
  • Add streamlit-folium and streamlit-lottie to pyproject.toml
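
The [gee] secrets table above could be wired into Earth Engine roughly as follows. This is a minimal sketch, not the app's actual code: read_gee_secrets and init_earth_engine are hypothetical helper names, and it assumes the earthengine-api package's ee.ServiceAccountCredentials accepts the PEM key through its key_data argument.

```python
from typing import Mapping, Tuple


def read_gee_secrets(secrets: Mapping) -> Tuple[str, str]:
    """Return (email, private_key) from a [gee] table shaped like the one above."""
    gee = secrets["gee"]
    email = gee["service_account_email"]
    # A key pasted with literal backslash-n sequences needs real newlines.
    private_key = gee["private_key"].replace("\\n", "\n")
    return email, private_key


def init_earth_engine(secrets: Mapping) -> None:
    """Authenticate with the service account (hypothetical helper)."""
    # Imported lazily so read_gee_secrets works without earthengine-api installed.
    import ee

    email, private_key = read_gee_secrets(secrets)
    credentials = ee.ServiceAccountCredentials(email, key_data=private_key)
    ee.Initialize(credentials)
```

In the app this might be called once at startup as init_earth_engine(st.secrets), since st.secrets supports the same dictionary-style lookups.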

Hero Animation (Page 1)

  • Find a bear/forest Lottie animation at https://lottiefiles.com
    • Search: "bear", "forest", "tree", "nature", "fire prevention"
    • Download as JSON → save to resources/smokey_animation.json
    • Then uncomment the streamlit-lottie block in app/streamlit_app.py (Page 1)
    • Import: from streamlit_lottie import st_lottie
  • Alt: Download a high-quality animated GIF → resources/smokey.gif and embed via st.image
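
The two options above (Lottie JSON first, GIF as the alternative) could be combined into one loader. This is a sketch under assumptions: load_lottie and render_hero are hypothetical names, the height and key values are arbitrary, and the commented-out block in app/streamlit_app.py may look different.

```python
import json
from pathlib import Path
from typing import Optional


def load_lottie(path: str) -> Optional[dict]:
    """Return the parsed Lottie JSON, or None if the file is not there yet."""
    file = Path(path)
    if not file.exists():
        return None
    return json.loads(file.read_text(encoding="utf-8"))


def render_hero(json_path: str = "resources/smokey_animation.json",
                gif_path: str = "resources/smokey.gif") -> None:
    """Show the Lottie animation, falling back to the GIF (hypothetical helper)."""
    import streamlit as st

    animation = load_lottie(json_path)
    if animation is not None:
        # Imported only on this branch so the GIF fallback works without the package.
        from streamlit_lottie import st_lottie

        st_lottie(animation, height=300, key="hero")
    elif Path(gif_path).exists():
        st.image(gif_path)
```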

Content To-Dos (author before public deploy)

Page 1 — "Why This Matters"

  • Replace placeholder text with real framing:
    • International forest restoration commitments (e.g., scale of loss globally)
    • Why satellite-based detection is better than ground surveys
    • What makes this project novel or useful

Page 3 — "Why I Picked the Winner"

  • Write real reasoning in the st.expander on Page 3:
    • Why PR-AUC over accuracy for 4.8% class imbalance
    • Brier score as a calibration check (not just ranking)
    • Recall vs. precision trade-off for environmental monitoring (false negatives = missed deforestation events = real-world consequence)
    • What the Optuna tuning added vs. the baseline XGBoost
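
The first two bullets can be backed by quick arithmetic in the expander. The sketch below assumes 4.8% is the positive (deforestation) rate and uses a synthetic label list, not the project's data. It shows that a model which never flags anything still scores about 95% accuracy, and what Brier score a constant "always predict the base rate" forecast earns, which is the number a calibrated model has to beat.

```python
# Synthetic labels with an assumed 4.8% positive rate (not project data).
n_total = 10_000
n_positive = 480
labels = [1] * n_positive + [0] * (n_total - n_positive)
prevalence = n_positive / n_total

# A useless model that predicts "no deforestation" everywhere.
always_negative = [0] * n_total
accuracy = sum(p == y for p, y in zip(always_negative, labels)) / n_total
recall = sum(p == 1 and y == 1 for p, y in zip(always_negative, labels)) / n_positive

# Brier score of a constant forecast equal to the base rate: p * (1 - p).
brier_baseline = sum((prevalence - y) ** 2 for y in labels) / n_total

print(f"accuracy of always-negative model: {accuracy:.3f}")   # 0.952
print(f"recall of always-negative model:   {recall:.3f}")     # 0.000
print(f"Brier score of base-rate forecast: {brier_baseline:.6f}")
# For a random ranking, precision sits at the prevalence for every recall level,
# so the no-skill PR-AUC reference is roughly the positive rate itself.
print(f"no-skill PR-AUC reference:         {prevalence:.3f}")
```

Accuracy hides the failure entirely, while recall and the PR-AUC reference line expose it, which is the argument the expander needs to make.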

Page 4 — Architecture Node Descriptions

  • Fill in the Sankey node descriptions in the nodes dict (app/streamlit_app.py, ~line 700):
    • Google Earth Engine
    • Parquet Export
    • Feature Engineering
    • Optuna Tuning
    • XGBoost Model
    • FastAPI
    • Streamlit Dashboard

Page 4 — Key Decisions & Lessons

  • Write real content in the 4 st.expander blocks:
    • "Why cosine drift as the core signal"
    • "Handling severe class imbalance (4.8%)"
    • "Why geo-stratified train/test split"
    • "Optuna over grid search"
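
For the first expander, it may help to show what the signal computes. The sketch below assumes "cosine drift" means the cosine distance (1 minus cosine similarity) between one location's embedding vectors in two different years. The project's actual definition may differ, and cosine_drift is a hypothetical name.

```python
import math
from typing import Sequence


def cosine_drift(earlier: Sequence[float], later: Sequence[float]) -> float:
    """Cosine distance between two embeddings: 0 = same direction, 1 = orthogonal."""
    if len(earlier) != len(later):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(earlier, later))
    norm_earlier = math.sqrt(sum(x * x for x in earlier))
    norm_later = math.sqrt(sum(y * y for y in later))
    if norm_earlier == 0 or norm_later == 0:
        raise ValueError("cosine drift is undefined for a zero vector")
    return 1.0 - dot / (norm_earlier * norm_later)
```

Because the measure ignores vector length, a uniformly brighter or darker scene leaves the drift unchanged; only a change in the embedding's direction registers.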

AOI Fun Facts Research

Each AOI rectangle on the map (Pages 2 + 3) shows a fun fact on click. Replace the PLACEHOLDER entries in AOI_FUN_FACTS in app/streamlit_app.py.

  • Indonesia-Malaysia — highest drift in dataset; research: peat fires, palm oil, logging in Borneo/Sumatra. Add source URL.

    Suggested sources: Global Forest Watch, Mongabay, Nature journal

  • Amazon Basin — add a compelling fact + source
  • Congo Basin DRC — add a compelling fact + source
  • Guinea — add a compelling fact + source
  • Canada — add a compelling fact + source
  • Mekong Region — add a compelling fact + source
  • Cerrado Brazil — add a compelling fact + source
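
A small check can list which AOIs still need research. The shape of AOI_FUN_FACTS is not shown in this checklist, so the sketch assumes a dict mapping each AOI name to a {"fact", "source"} record; unfinished_aois is a hypothetical helper and the sample entries are invented placeholders.

```python
from typing import Dict, List


def unfinished_aois(facts: Dict[str, Dict[str, str]]) -> List[str]:
    """Return AOI names whose fact is still a placeholder or whose source URL is missing."""
    pending = []
    for name, entry in facts.items():
        fact = entry.get("fact", "")
        source = entry.get("source", "")
        if not fact or "PLACEHOLDER" in fact or not source.startswith("http"):
            pending.append(name)
    return pending


# Assumed structure, for illustration only.
sample_facts = {
    "Indonesia-Malaysia": {"fact": "PLACEHOLDER", "source": ""},
    "Guinea": {"fact": "PLACEHOLDER", "source": ""},
}
print(unfinished_aois(sample_facts))
```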

Pre-Deploy Checklist

  • All content to-dos above are filled in
  • All pre-computed artifact boxes (listed under Done below) are checked
  • GEE service account is set up and tested (click a point on Page 3 map)
  • GitHub link on Page 4 is updated
  • Hero animation or image is in place
  • App runs locally with streamlit run app/streamlit_app.py without errors
  • All 4 pages load in < 2 seconds (no 800k-row parquet loads at runtime)


✅ Done

Pre-computed Data Artifacts

Run python resources/precompute.py to regenerate all artifacts.

  • resources/kpi_summary.json — KPI cards on Page 1
  • resources/mean_embedding_profile.parquet — embedding profile chart on Page 2
  • resources/drift_by_area_year.parquet — drift analysis chart + violin on Page 2
  • resources/target_distribution.json — class distribution charts on Page 2
  • resources/timelapse_sample.parquet — timelapse animation + map dots on Page 2
  • resources/canada_best_trial.json + amazon_basin_best_trial.json — model metrics on Page 3
  • resources/canada_feature_importance.json + amazon_basin_feature_importance.json — Page 3 side-by-side
  • resources/canada_confusion_matrix.json + amazon_basin_confusion_matrix.json — Page 3 side-by-side
  • resources/canada_data_sample.parquet + amazon_basin_data_sample.parquet — Page 2 area toggle
  • resources/comparison_kpis.json — "Tale of two forests" on Page 1
  • resources/aoi_stats.json — sidebar Other Regions stats table
  • resources/evaluations_clean.parquet — superseded by per-area artifacts; no longer loaded by app
  • resources/feature_importance.json — superseded by per-area feature importance; no longer loaded
  • resources/confusion_matrix.json — superseded by per-area confusion matrices; no longer loaded
  • resources/pca_sample.parquet — PCA scatter removed from app; no longer loaded
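
Before deploying, the list above can be verified with a short script rather than by eye. This is a sketch: missing_artifacts is a hypothetical helper, only the artifacts the app still loads are included, and it assumes the second file named on each "+" line also lives in resources/.

```python
from pathlib import Path
from typing import Iterable, List

# Artifacts still loaded by the app, taken from the list above.
ACTIVE_ARTIFACTS = [
    "kpi_summary.json",
    "mean_embedding_profile.parquet",
    "drift_by_area_year.parquet",
    "target_distribution.json",
    "timelapse_sample.parquet",
    "canada_best_trial.json",
    "amazon_basin_best_trial.json",
    "canada_feature_importance.json",
    "amazon_basin_feature_importance.json",
    "canada_confusion_matrix.json",
    "amazon_basin_confusion_matrix.json",
    "canada_data_sample.parquet",
    "amazon_basin_data_sample.parquet",
    "comparison_kpis.json",
    "aoi_stats.json",
]


def missing_artifacts(names: Iterable[str], root: str = "resources") -> List[str]:
    """Return the artifact file names that do not exist under root."""
    return [name for name in names if not (Path(root) / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts(ACTIVE_ARTIFACTS)
    print("All artifacts present" if not missing else f"Missing: {missing}")
```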

Model Files

  • models/xgboost_canada_optuna.pkl and models/xgboost_amazon_basin_optuna.pkl (final tuned winners) — copied to resources/ for HF Space
  • models/rf_tuned.pkl, models/sgd_optuna.pkl, models/baseline_logistic_regression_*.pkl — used in the old 4-model comparison table; that page has been replaced by the Canada vs Amazon Basin framing, so the app no longer needs them

streamlit-folium installed

  • streamlit-folium>=0.25.0 installed and in use

Nav Bar Icons (optional polish)

  • SVG icons confirmed working — implementation commented out in app/streamlit_app.py. To re-enable:
    1. Source per-page SVG icons (e.g., Lordicon, Heroicons, Phosphor) and save to resources/
    2. Add import base64 to top-level imports in app/streamlit_app.py
    3. Uncomment _load_nav_icon() and the button loop in the comment block above _render_nav()
    4. Optionally remap SVG stroke colors to theme values (#40916c, #1b4332) via string replace before base64 encoding
    • Current placeholder: resources/wired-outline-955-demand-hover-click.svg (Lordicon "demand" icon, tested ✓)
    • Lottie animations in nav — not practical inside nav buttons (renders as full-height component block)
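
Steps 2 and 4 above (base64 encoding plus the optional stroke-color remap) could look roughly like this. It is a sketch, not the commented-out _load_nav_icon(): svg_to_data_uri is a hypothetical name, and the original stroke color in the example is an assumption, since the real value depends on the downloaded SVG.

```python
import base64
from typing import Mapping, Optional


def svg_to_data_uri(svg_text: str, color_map: Optional[Mapping[str, str]] = None) -> str:
    """Remap colors by plain string replace, then return a base64 data URI."""
    for old_color, new_color in (color_map or {}).items():
        svg_text = svg_text.replace(old_color, new_color)
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return f"data:image/svg+xml;base64,{encoded}"


# Example: recolor an assumed black stroke to the theme green before encoding.
icon_uri = svg_to_data_uri('<svg stroke="#000000"/>', {"#000000": "#40916c"})
```

The resulting string can be used as the src of an img tag wherever the nav markup is built.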

Theme Selection

  • Theme selected and applied: Canopy Light + Manrope; THEME dict set in app/streamlit_app.py
