---
title: Algae Yield Predictor
emoji: 🌱
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
license: cc-by-nc-nd-4.0
---

# 🌱 Algae Yield Predictor

This Space provides an interactive interface to **predict algal biomass, lipid, protein, and carbohydrate yields** under different culture conditions.
It uses augmented datasets (200k synthetic rows per target) combined with ensemble ML models (CatBoost, XGBoost, LightGBM, ExtraTrees) and a meta-stacking approach.

---

## ✨ Features

- **Targets:** biomass, lipid, protein, carbohydrate
- **Species–Media aware:** dropdowns restrict valid species–medium combinations
- **Curated suggestions:** shows recommended conditions for each species/target
- **Uncertainty estimates:** KNN-based local intervals (10–90%) from augmented data
- **Response plots:** sweep one variable (light, days, pH, etc.) and visualize prediction curve + uncertainty band
- **DOI references:** retrieves closest experimental setups from `doi.csv` (if provided)

---

## 🚀 How to Use

1. Select a **target** (biomass, lipid, protein, carb).
2. Choose **species** and valid **growth medium**.
3. Adjust culture conditions:
   - Light intensity
   - Day/Night exposure
   - Temperature
   - pH
   - Days of culture
4. Click **Predict + Plot** to:
   - Get yield prediction with uncertainty band
   - See response curve for chosen variable
5. Optionally click **Find Closest DOI Matches** to explore related literature.

---

## 🧩 Models & Data

- **Training data:** real experimental CSV (`ai_al.csv`) + augmented synthetic sets (20k/200k).
- **Models:** CatBoost, XGBoost, LightGBM, ExtraTrees → stacked with RidgeCV.
- **Uncertainty:** derived from nearest neighbors in augmented dataset.

If `doi.csv` is provided with experimental metadata + DOI links, the app will display closest literature matches.

---

## 📂 File Structure