| title: Algae Yield Predictor | |
| emoji: 🌱 | |
| colorFrom: green | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: 5.46.1 | |
| app_file: app.py | |
| pinned: false | |
| license: cc-by-nc-nd-4.0 | |
| # 🌱 Algae Yield Predictor | |
| This Space provides an interactive interface to **predict algal biomass, lipid, protein, and carbohydrate yields** under different culture conditions. | |
| It uses augmented datasets (200k synthetic rows per target) combined with ensemble ML models (CatBoost, XGBoost, LightGBM, ExtraTrees) and a meta-stacking approach. | |
| --- | |
| ## ✨ Features | |
| - **Targets:** biomass, lipid, protein, carbohydrate | |
| - **Species–Media aware:** dropdowns restrict valid species–medium combinations | |
| - **Curated suggestions:** shows recommended conditions for each species/target | |
| - **Uncertainty estimates:** k-nearest-neighbor (KNN) local intervals (10th–90th percentile) computed from the augmented data | |
| - **Response plots:** sweep one variable (light, days, pH, etc.) and visualize prediction curve + uncertainty band | |
| - **DOI references:** retrieves closest experimental setups from `doi.csv` (if provided) | |
| --- | |
| ## 🚀 How to Use | |
| 1. Select a **target** (biomass, lipid, protein, or carbohydrate). | |
| 2. Choose **species** and valid **growth medium**. | |
| 3. Adjust culture conditions: | |
|    - Light intensity | |
|    - Day/Night exposure | |
|    - Temperature | |
|    - pH | |
|    - Days of culture | |
| 4. Click **Predict + Plot** to: | |
|    - Get a yield prediction with an uncertainty band | |
|    - See the response curve for the chosen variable | |
| 5. Optionally click **Find Closest DOI Matches** to explore related literature. | |
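
The response curve in step 4 can be pictured as a one-variable sweep: hold every condition at its selected value, vary one of them, and record the prediction at each point. The sketch below is only an illustration of that idea, not the code in `app.py`. The names `sweep`, `toy_predict`, and the condition keys are hypothetical, and `toy_predict` is a made-up linear stand-in for the trained model.

```python
def sweep(predict, base, name, values):
    """Vary one condition over `values`, holding the others at `base`."""
    curve = []
    for v in values:
        # Copy the base conditions and override only the swept variable.
        conditions = dict(base, **{name: v})
        curve.append((v, predict(conditions)))
    return curve


def toy_predict(c):
    # Hypothetical stand-in for the stacked model; NOT the real predictor.
    return 0.01 * c["light"] + 0.1 * c["days"]


# Hypothetical baseline conditions chosen in the UI.
base = {"light": 100.0, "days": 10, "ph": 7.0, "temperature": 25.0}

# Sweep light intensity while the other conditions stay fixed.
curve = sweep(toy_predict, base, "light", [50.0, 100.0, 150.0])
for value, prediction in curve:
    print(f"light={value:6.1f} -> predicted yield {prediction:.2f}")
```

Plotting the `(value, prediction)` pairs gives the curve; an uncertainty band would come from computing an interval at each swept point.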
| --- | |
| ## 🧩 Models & Data | |
| - **Training data:** real experimental CSV (`ai_al.csv`) + augmented synthetic sets (20k/200k). | |
| - **Models:** CatBoost, XGBoost, LightGBM, ExtraTrees → stacked with RidgeCV. | |
| - **Uncertainty:** derived from nearest neighbors in augmented dataset. | |
| If `doi.csv` is provided with experimental metadata + DOI links, the app will display closest literature matches. | |
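
As a rough illustration of the nearest-neighbor uncertainty idea, the sketch below finds the `k` augmented rows closest to a query and reports the 10th and 90th percentiles of their target values. It is a minimal sketch under stated assumptions, not the Space's implementation: the function names, the choice of Euclidean distance, `k=5`, the interpolation rule, and the toy data are all assumptions made for this example. Real features would normally be scaled before computing distances, which this sketch skips.

```python
import math


def _quantile(sorted_vals, q):
    # Linear interpolation between the two closest ranks.
    pos = q * (len(sorted_vals) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac


def knn_interval(query, rows, targets, k=5, lo=0.10, hi=0.90):
    """Return (low, high) quantiles of the targets of the k rows nearest to query."""
    # Rank all rows by Euclidean distance to the query (assumed metric).
    order = sorted(range(len(rows)), key=lambda i: math.dist(query, rows[i]))
    nearest = sorted(targets[i] for i in order[:k])
    return _quantile(nearest, lo), _quantile(nearest, hi)


# Toy "augmented" rows of (light, pH) with made-up yields; illustrative only.
rows = [(100.0, 7.0), (110.0, 7.2), (120.0, 7.1),
        (300.0, 8.5), (90.0, 6.9), (105.0, 7.0)]
targets = [1.0, 1.2, 1.4, 3.0, 0.9, 1.1]

low, high = knn_interval((104.0, 7.0), rows, targets, k=5)
print(f"10-90% local interval: [{low:.2f}, {high:.2f}]")
```

The distant row at `(300.0, 8.5)` falls outside the five nearest neighbors, so its high yield does not widen the interval. That locality is what makes the band reflect conditions similar to the query.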
| --- | |
| ## 📂 File Structure |