---
title: Algae Yield Predictor
emoji: 🌱
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
license: cc-by-nc-nd-4.0
---
# 🌱 Algae Yield Predictor
This Space provides an interactive interface to **predict algal biomass, lipid, protein, and carbohydrate yields** under different culture conditions.
It uses augmented datasets (200k synthetic rows per target) combined with ensemble ML models (CatBoost, XGBoost, LightGBM, ExtraTrees) and a meta-stacking approach.
---
## ✨ Features
- **Targets:** biomass, lipid, protein, carbohydrate
- **Species–Media aware:** dropdowns restrict valid species–medium combinations
- **Curated suggestions:** shows recommended conditions for each species/target
- **Uncertainty estimates:** KNN-based local intervals (10–90%) from augmented data
- **Response plots:** sweep one variable (light, days, pH, etc.) and visualize prediction curve + uncertainty band
- **DOI references:** retrieves closest experimental setups from `doi.csv` (if provided)
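
The KNN-based interval listed above can be illustrated with a minimal sketch. It assumes that "10–90%" means the 10th and 90th percentiles of the target values among the nearest augmented rows; the function name `knn_interval`, the neighbor count `k`, and the unscaled Euclidean distance are illustrative assumptions, not the code in `app.py`.

```python
import math

def knn_interval(query, rows, targets, k=5, lo=0.10, hi=0.90):
    """Return (lo, hi) percentiles of the targets of the k rows nearest to query."""
    # Plain Euclidean distance; a real app would likely scale features first.
    order = sorted(range(len(rows)), key=lambda i: math.dist(query, rows[i]))
    nearest = sorted(targets[i] for i in order[:k])

    def percentile(p):
        # Linear interpolation between neighboring order statistics.
        pos = p * (len(nearest) - 1)
        low = math.floor(pos)
        high = min(low + 1, len(nearest) - 1)
        return nearest[low] + (pos - low) * (nearest[high] - nearest[low])

    return percentile(lo), percentile(hi)
```

A wide interval then signals that similar culture conditions in the augmented data produced very different yields, so the point prediction deserves less trust there.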
---
## 🚀 How to Use
1. Select a **target** (biomass, lipid, protein, or carbohydrate).
2. Choose a **species** and a valid **growth medium**.
3. Adjust the culture conditions:
   - Light intensity
   - Day/Night exposure
   - Temperature
   - pH
   - Days of culture
4. Click **Predict + Plot** to:
   - Get a yield prediction with an uncertainty band
   - See the response curve for the chosen variable
5. Optionally click **Find Closest DOI Matches** to explore related literature.
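
The response curve in step 4 amounts to sweeping one input while holding the others fixed. The sketch below shows that idea only; `sweep`, `toy_predict`, and the dictionary keys are hypothetical stand-ins, and the toy function replaces the trained ensemble.

```python
def sweep(predict, base, variable, values):
    """Evaluate predict on copies of base with one variable replaced by each value."""
    curve = []
    for v in values:
        conditions = dict(base, **{variable: v})  # copy; other inputs stay fixed
        curve.append((v, predict(conditions)))
    return curve

# Hypothetical stand-in for the trained model: a toy response peaking at pH 7.5.
def toy_predict(c):
    return 2.0 - 0.1 * (c["ph"] - 7.5) ** 2

base = {"light": 100, "temperature": 25, "ph": 7.0, "days": 10}
curve = sweep(toy_predict, base, "ph", [6.5, 7.0, 7.5, 8.0])
```

Each `(value, prediction)` pair in `curve` is one point on the plotted line; the uncertainty band would come from computing an interval at each swept value.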
---
## 🧩 Models & Data
- **Training data:** real experimental CSV (`ai_al.csv`) + augmented synthetic sets (20k/200k).
- **Models:** CatBoost, XGBoost, LightGBM, ExtraTrees → stacked with RidgeCV.
- **Uncertainty:** derived from nearest neighbors in the augmented dataset.

If `doi.csv` is provided with experimental metadata + DOI links, the app will display the closest literature matches.
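
One plausible way to find the closest literature matches is a range-scaled nearest-neighbor search over the culture conditions. The sketch below assumes a `doi.csv` layout with the columns `doi`, `light`, `temperature`, `ph`, and `days`; the actual schema is not documented here, and the DOIs in the sample are placeholders, not real references.

```python
import csv
import io
import math

# Hypothetical doi.csv contents; column names and DOIs are placeholders.
SAMPLE = """doi,light,temperature,ph,days
10.0000/example-a,100,25,7.0,10
10.0000/example-b,200,30,8.0,14
10.0000/example-c,120,26,7.2,12
"""

FEATURES = ["light", "temperature", "ph", "days"]

def closest_dois(query, csv_text, n=2):
    """Return the n DOIs whose conditions are nearest to query."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Scale each feature by its range so no single unit dominates the distance.
    spans = {}
    for f in FEATURES:
        vals = [float(r[f]) for r in rows]
        spans[f] = (max(vals) - min(vals)) or 1.0

    def distance(r):
        return math.sqrt(sum(((float(r[f]) - query[f]) / spans[f]) ** 2 for f in FEATURES))

    return [r["doi"] for r in sorted(rows, key=distance)[:n]]

matches = closest_dois({"light": 110, "temperature": 25, "ph": 7.1, "days": 11}, SAMPLE)
```

Scaling by range matters because light intensity and pH live on very different numeric scales; without it, light would dominate the match.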
---
## 📂 File Structure