| # Resources Checklist | |
| Central to-do hub for all data, models, and content needed before the portfolio app | |
| is ready for public deployment. Check off items as they are completed. | |
| --- | |
| ## Configuration & Infrastructure | |
| - [ ] **GEE Service Account** — create a service account in Google Cloud Console, | |
| grant it Earth Engine access, download the JSON key, and add to: | |
| - Local: `.streamlit/secrets.toml` (not committed to git) | |
| - Deployed: Streamlit Cloud secrets dashboard | |
| - Format: | |
| ```toml | |
| [gee] | |
| service_account_email = "my-account@my-project.iam.gserviceaccount.com" | |
| private_key = "-----BEGIN PRIVATE KEY-----\n..." | |
| ``` | |
| - [ ] **GitHub repo URL** — update the `st.link_button` on Page 4 with the real URL | |
| - [ ] **streamlit-lottie** installed: `pip install "streamlit-lottie>=0.0.5"` (quote the requirement so the shell does not treat `>` as a redirect) | |
| - [ ] Add `streamlit-folium` and `streamlit-lottie` to `pyproject.toml` | |
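
Before wiring the key into the app, it can help to sanity-check the `[gee]` section. The sketch below is only an illustration: `validate_gee_secrets` is a hypothetical helper, not part of the app, and it assumes the section behaves like a plain mapping (as `st.secrets["gee"]` does). It checks the two keys shown in the format above and flags a common pasting mistake where the key contains literal `\n` text instead of real newlines.

```python
from typing import List, Mapping

# Keys taken from the [gee] format shown in this checklist.
REQUIRED_KEYS = ("service_account_email", "private_key")


def validate_gee_secrets(section: Mapping[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means the section looks usable.

    Hypothetical helper for illustration only; it never contacts Earth Engine.
    """
    problems = []
    for key in REQUIRED_KEYS:
        if not section.get(key):
            problems.append(f"missing or empty key: {key}")

    email = section.get("service_account_email", "")
    if email and not email.endswith(".iam.gserviceaccount.com"):
        problems.append("service_account_email does not look like a service account address")

    key_text = section.get("private_key", "")
    if key_text and "\\n" in key_text:
        # A two-character backslash-n survived parsing, so the PEM has no real line breaks.
        problems.append("private_key contains literal backslash-n instead of newlines")
    if key_text and not key_text.lstrip().startswith("-----BEGIN"):
        problems.append("private_key does not start with a PEM header")
    return problems
```

In the app this might be called as `validate_gee_secrets(st.secrets["gee"])` at startup, showing any returned messages with `st.warning` before attempting to authenticate.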
| --- | |
| ## Hero Animation (Page 1) | |
| - [ ] Find a bear/forest Lottie animation at https://lottiefiles.com | |
| - Search: "bear", "forest", "tree", "nature", "fire prevention" | |
| - Download as JSON → save to `resources/smokey_animation.json` | |
| - Then uncomment the `streamlit-lottie` block in `app/streamlit_app.py` (Page 1) | |
| - Import: `from streamlit_lottie import st_lottie` | |
| - [ ] Alt: Download a high-quality animated GIF → `resources/smokey.gif` and embed via `st.image` | |
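
The two options above (Lottie JSON, with a GIF as the alternative) can be combined into a simple fallback. This is a minimal sketch, not the commented-out block in `app/streamlit_app.py`: `load_lottie` is a hypothetical helper, and the height and key values in the usage comment are placeholders.

```python
import json
from pathlib import Path
from typing import Optional


def load_lottie(path: str) -> Optional[dict]:
    """Load a Lottie animation exported as JSON; return None if missing or invalid."""
    file = Path(path)
    if not file.is_file():
        return None
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    # Lottie exports are JSON objects; reject anything else (e.g. a list).
    return data if isinstance(data, dict) else None


# In the app (not executed here), the result could be used along these lines:
#   from streamlit_lottie import st_lottie
#   animation = load_lottie("resources/smokey_animation.json")
#   if animation is not None:
#       st_lottie(animation, height=300, key="hero")
#   else:
#       st.image("resources/smokey.gif")
```

Returning `None` rather than raising keeps Page 1 rendering even if the animation file has not been added yet.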
| --- | |
| ## Content To-Dos (author before public deploy) | |
| ### Page 1 — "Why This Matters" | |
| - [ ] Replace placeholder text with real framing: | |
| - International forest restoration commitments (e.g., scale of loss globally) | |
| - Why satellite-based detection is better than ground surveys | |
| - What makes this project novel or useful | |
| ### Page 3 — "Why I Picked the Winner" | |
| - [ ] Write real reasoning in the `st.expander` on Page 3: | |
| - Why PR-AUC over accuracy for 4.8% class imbalance | |
| - Brier score as a calibration check (not just ranking) | |
| - Recall vs. precision trade-off for environmental monitoring | |
| (false negatives = missed deforestation events = real-world consequence) | |
| - What the Optuna tuning added vs. the baseline XGBoost | |
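
A small worked example may help when writing the reasoning above. It uses synthetic labels only, and it assumes the 4.8% figure is the share of positive samples. The functions are plain-Python illustrations rather than the project's evaluation code: `average_precision` is a step-wise PR-AUC that assumes no tied scores, and `brier_score` is the mean squared error between predicted probabilities and labels.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / hits if hits else 0.0


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 label."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Synthetic labels: 48 positives out of 1000 (4.8%).
y_true = [1] * 48 + [0] * 952
always_negative = [0] * 1000

print(accuracy(y_true, always_negative))  # 0.952
print(recall(y_true, always_negative))    # 0.0
```

A model that never flags deforestation scores 95.2% accuracy here while missing every event, which is the gap that PR-AUC and recall are meant to expose.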
| ### Page 4 — Architecture Node Descriptions | |
| - [ ] Fill in the Sankey node descriptions in the `nodes` dict (`app/streamlit_app.py`, ~line 700): | |
| - Google Earth Engine | |
| - Parquet Export | |
| - Feature Engineering | |
| - Optuna Tuning | |
| - XGBoost Model | |
| - FastAPI | |
| - Streamlit Dashboard | |
| ### Page 4 — Key Decisions & Lessons | |
| - [ ] Write real content in the 4 `st.expander` blocks: | |
| - "Why cosine drift as the core signal" | |
| - "Handling severe class imbalance (4.8%)" | |
| - "Why geo-stratified train/test split" | |
| - "Optuna over grid search" | |
| --- | |
| ## AOI Fun Facts Research | |
| > Each AOI rectangle on the map (Pages 2 + 3) shows a fun fact on click. | |
| > Replace the PLACEHOLDER entries in `AOI_FUN_FACTS` in `app/streamlit_app.py`. | |
| - [ ] **Indonesia-Malaysia** — highest drift in dataset; research: peat fires, palm oil, | |
| logging in Borneo/Sumatra. Add source URL. | |
| > Suggested sources: Global Forest Watch, Mongabay, Nature journal | |
| - [ ] **Amazon Basin** — add a compelling fact + source | |
| - [ ] **Congo Basin DRC** — add a compelling fact + source | |
| - [ ] **Guinea** — add a compelling fact + source | |
| - [ ] **Canada** — add a compelling fact + source | |
| - [ ] **Mekong Region** — add a compelling fact + source | |
| - [ ] **Cerrado Brazil** — add a compelling fact + source | |
| --- | |
| ## Pre-Deploy Checklist | |
| - [ ] All content to-dos above are filled in | |
| - [ ] All pre-computed artifact boxes in the Done section below are checked | |
| - [ ] GEE service account is set up and tested (click a point on the Page 3 map to confirm) | |
| - [ ] GitHub link on Page 4 is updated | |
| - [ ] Hero animation or image is in place | |
| - [ ] App runs locally with `streamlit run app/streamlit_app.py` without errors | |
| - [ ] All 4 pages load in < 2 seconds (no 800k-row parquet loads at runtime) | |
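
The artifact item in this checklist could also be verified mechanically rather than by eye. The sketch below is an assumption-laden illustration: `missing_artifacts` is a hypothetical helper, and `EXPECTED_ARTIFACTS` lists only a subset of the file names mentioned in this document, so extend it to match what the app actually loads.

```python
from pathlib import Path
from typing import Iterable, List

# A subset of the file names listed in this checklist; extend as needed.
EXPECTED_ARTIFACTS = [
    "kpi_summary.json",
    "mean_embedding_profile.parquet",
    "drift_by_area_year.parquet",
    "target_distribution.json",
    "timelapse_sample.parquet",
    "comparison_kpis.json",
    "aoi_stats.json",
]


def missing_artifacts(root: str, names: Iterable[str] = EXPECTED_ARTIFACTS) -> List[str]:
    """Return the expected file names that are not present under `root`."""
    base = Path(root)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts("resources")
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All expected artifacts are present.")
```

Running this before deployment gives a quick signal on whether `python resources/precompute.py` needs to be rerun.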
| --- | |
| ## ✅ Done | |
| ### ~~Pre-computed Data Artifacts~~ | |
| > Run `python resources/precompute.py` to regenerate all artifacts. | |
| - [x] ~~`resources/kpi_summary.json` — KPI cards on Page 1~~ | |
| - [x] ~~`resources/mean_embedding_profile.parquet` — embedding profile chart on Page 2~~ | |
| - [x] ~~`resources/drift_by_area_year.parquet` — drift analysis chart + violin on Page 2~~ | |
| - [x] ~~`resources/target_distribution.json` — class distribution charts on Page 2~~ | |
| - [x] ~~`resources/timelapse_sample.parquet` — timelapse animation + map dots on Page 2~~ | |
| - [x] ~~`resources/canada_best_trial.json` + `amazon_basin_best_trial.json` — model metrics on Page 3~~ | |
| - [x] ~~`resources/canada_feature_importance.json` + `amazon_basin_feature_importance.json` — Page 3 side-by-side~~ | |
| - [x] ~~`resources/canada_confusion_matrix.json` + `amazon_basin_confusion_matrix.json` — Page 3 side-by-side~~ | |
| - [x] ~~`resources/canada_data_sample.parquet` + `amazon_basin_data_sample.parquet` — Page 2 area toggle~~ | |
| - [x] ~~`resources/comparison_kpis.json` — "Tale of two forests" on Page 1~~ | |
| - [x] ~~`resources/aoi_stats.json` — sidebar Other Regions stats table~~ | |
| - [x] ~~`resources/evaluations_clean.parquet` — superseded by per-area artifacts; no longer loaded by app~~ | |
| - [x] ~~`resources/feature_importance.json` — superseded by per-area feature importance; no longer loaded~~ | |
| - [x] ~~`resources/confusion_matrix.json` — superseded by per-area confusion matrices; no longer loaded~~ | |
| - [x] ~~`resources/pca_sample.parquet` — PCA scatter removed from app; no longer loaded~~ | |
| ### ~~Model Files~~ | |
| - [x] ~~`models/xgboost_canada_optuna.pkl`~~ and ~~`models/xgboost_amazon_basin_optuna.pkl`~~ — **final tuned winners** — copied to `resources/` for HF Space | |
| - [x] ~~`models/rf_tuned.pkl`, `models/sgd_optuna.pkl`, `models/baseline_logistic_regression_*.pkl`~~ — were used in old 4-model comparison table; that page has been replaced by the Canada vs Amazon Basin framing; no longer needed by the app | |
| ### ~~streamlit-folium installed~~ | |
| - [x] ~~`streamlit-folium>=0.25.0` installed and in use~~ | |
| ### ~~Nav Bar Icons (optional polish)~~ | |
| - [x] ~~SVG icons confirmed working — implementation commented out in `app/streamlit_app.py`~~. To re-enable: | |
| 1. Source per-page SVG icons (e.g., [Lordicon](https://lordicon.com), [Heroicons](https://heroicons.com), [Phosphor](https://phosphoricons.com)) and save to `resources/` | |
| 2. Add `import base64` to top-level imports in `app/streamlit_app.py` | |
| 3. Uncomment `_load_nav_icon()` and the button loop in the comment block above `_render_nav()` | |
| 4. Optionally remap SVG stroke colors to theme values (`#40916c`, `#1b4332`) via string replace before base64 encoding | |
| - Current placeholder: `resources/wired-outline-955-demand-hover-click.svg` (Lordicon "demand" icon, tested ✓) | |
| - [x] ~~**Lottie animations in nav** — not practical inside nav buttons (renders as full-height component block)~~ | |
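
Step 4 of the nav icon re-enable steps above (remap stroke colors, then base64-encode) might look roughly like the following. This is a sketch and not the commented-out `_load_nav_icon()` itself: `svg_to_data_uri` is a hypothetical name, and the source color `#000000` in the usage note is an assumption about what the downloaded SVG contains.

```python
import base64
from typing import Mapping


def svg_to_data_uri(svg_text: str, color_map: Mapping[str, str]) -> str:
    """Swap colors by plain string replacement, then return a base64 data URI."""
    for old, new in color_map.items():
        svg_text = svg_text.replace(old, new)
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return f"data:image/svg+xml;base64,{encoded}"
```

For example, `svg_to_data_uri(svg_text, {"#000000": "#40916c"})` would recolor black strokes to the theme green before the URI is placed in an `<img>` tag. String replacement is case-sensitive, so check how the SVG spells its hex colors first.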
| ### ~~Theme Selection~~ | |
| - [x] ~~Theme selected and applied: **Canopy Light** + **Manrope** — `THEME` dict set in `app/streamlit_app.py`~~ | |