metadata
title: Algae Yield Predictor
emoji: 🌱
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
license: cc-by-nc-nd-4.0

🌱 Algae Yield Predictor

This Space provides an interactive interface to predict algal biomass, lipid, protein, and carbohydrate yields under different culture conditions.
The models are trained on augmented datasets (200k synthetic rows per target) and combine four ensemble learners (CatBoost, XGBoost, LightGBM, ExtraTrees) through a stacked meta-model.


✨ Features

  • Targets: biomass, lipid, protein, carbohydrate
  • Species–Media aware: dropdowns restrict valid species–medium combinations
  • Curated suggestions: shows recommended conditions for each species/target
  • Uncertainty estimates: KNN-based local intervals (10th–90th percentile) from augmented data
  • Response plots: sweep one variable (light, days, pH, etc.) and visualize prediction curve + uncertainty band
  • DOI references: retrieves closest experimental setups from doi.csv (if provided)
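The KNN-based local interval can be illustrated with a small, dependency-free sketch. It is not the code in app.py: the function name `knn_interval`, the neighbour count `k=50`, the plain Euclidean distance, and the toy data are all assumptions made for illustration, and the features are assumed to be scaled to comparable ranges already.

```python
import math
import random
from statistics import quantiles


def knn_interval(query, rows, targets, k=50, lo=10, hi=90):
    """Return the (lo, hi) percentiles of the targets of the k rows nearest to query.

    Hypothetical helper: the Space may use a different distance, k, or scaling.
    """
    # Rank all rows by Euclidean distance to the query point.
    order = sorted(range(len(rows)), key=lambda i: math.dist(query, rows[i]))
    nearest = [targets[i] for i in order[:k]]
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(nearest, n=100, method="inclusive")
    return cuts[lo - 1], cuts[hi - 1]


# Toy stand-in for the augmented dataset: two scaled features, one yield column.
random.seed(0)
rows = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(500)]
targets = [2 * a + b + random.gauss(0, 0.1) for a, b in rows]

low, high = knn_interval((0.5, 0.5), rows, targets, k=50)
print(f"10th-90th percentile band: {low:.2f} to {high:.2f}")
```

The band is purely local: it reflects how much the yield varies among similar rows in the augmented data, so it widens where neighbouring conditions give scattered yields.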

🚀 How to Use

  1. Select a target (biomass, lipid, protein, or carbohydrate).
  2. Choose species and valid growth medium.
  3. Adjust culture conditions:
    • Light intensity
    • Day/Night exposure
    • Temperature
    • pH
    • Days of culture
  4. Click Predict + Plot to:
    • Get yield prediction with uncertainty band
    • See response curve for chosen variable
  5. Optionally click Find Closest DOI Matches to explore related literature.
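The response curve in step 4 amounts to a one-variable sweep: every condition is held at its chosen value while a single variable is stepped across a range. A minimal sketch follows, assuming a hypothetical `predict` callable that takes a dict of conditions; `toy_predict` and the key names are stand-ins, not the Space's model or its input names.

```python
def sweep(predict, base, variable, start, stop, steps=20):
    """Vary one condition over [start, stop] while holding the others at base."""
    curve = []
    for i in range(steps):
        value = start + (stop - start) * i / (steps - 1)
        # Copy the base conditions so the caller's dict is never modified.
        conditions = {**base, variable: value}
        curve.append((value, predict(conditions)))
    return curve


def toy_predict(conditions):
    """Stand-in model (assumption): yield grows with days and peaks at pH 7.5."""
    return conditions["days"] * 0.1 - 0.05 * (conditions["ph"] - 7.5) ** 2


base = {"light": 100.0, "temperature": 25.0, "ph": 7.5, "days": 10.0}
curve = sweep(toy_predict, base, "ph", 6.0, 9.0, steps=7)
best_ph, best_yield = max(curve, key=lambda point: point[1])
print(best_ph, best_yield)
```

Plotting the `(value, prediction)` pairs gives the curve; computing an interval at each step would give the uncertainty band drawn around it.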

🧩 Models & Data

  • Training data: real experimental CSV (ai_al.csv) + augmented synthetic sets (20k and 200k rows).
  • Models: CatBoost, XGBoost, LightGBM, ExtraTrees → stacked with RidgeCV.
  • Uncertainty: derived from nearest neighbors in augmented dataset.
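Stacking with a RidgeCV meta-learner can be sketched with scikit-learn's `StackingRegressor`. This is an illustration under assumptions, not the Space's training code: the data is synthetic, and `GradientBoostingRegressor` stands in for the boosted models so the example runs with scikit-learn alone. `CatBoostRegressor`, `XGBRegressor`, and `LGBMRegressor` could be added to the `estimators` list if those packages are installed.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV

# Synthetic stand-in for culture conditions (5 features) and a yield target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = 2 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.05, size=200)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

stack = StackingRegressor(
    estimators=[
        ("extra_trees", ExtraTreesRegressor(n_estimators=50, random_state=0)),
        # Stand-in for CatBoost / XGBoost / LightGBM (assumption).
        ("boosting", GradientBoostingRegressor(random_state=0)),
    ],
    # RidgeCV learns how to weight the base models' out-of-fold predictions.
    final_estimator=RidgeCV(),
    cv=5,
)
stack.fit(X_train, y_train)
preds = stack.predict(X_test)
print("held-out R^2:", stack.score(X_test, y_test))
```

The meta-learner is fitted on cross-validated predictions of the base models, which keeps it from simply rewarding whichever base model overfits the training rows most.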

If doi.csv is provided with experimental metadata and DOI links, the app displays the closest literature matches.
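One way such a lookup could work is a nearest-row search over the numeric conditions. The sketch below is an assumption throughout: the column names, the range-scaled Euclidean distance, and the function `closest_matches` are hypothetical, and the DOIs are placeholders rather than real references. The actual doi.csv layout may differ.

```python
import csv
import io
import math

# Hypothetical doi.csv content; real column names and rows may differ.
SAMPLE = """\
species,light,temperature,ph,days,doi
Chlorella vulgaris,100,25,7.0,10,10.0000/example-a
Chlorella vulgaris,200,30,8.0,14,10.0000/example-b
Chlorella vulgaris,50,20,6.5,7,10.0000/example-c
"""

FEATURES = ["light", "temperature", "ph", "days"]


def closest_matches(query, rows, top_n=2):
    """Return the top_n rows whose conditions are nearest to the query."""
    # Scale each feature by its range so that no single unit dominates.
    spans = {}
    for name in FEATURES:
        values = [float(row[name]) for row in rows]
        spans[name] = (max(values) - min(values)) or 1.0

    def distance(row):
        return math.sqrt(
            sum(((float(row[n]) - query[n]) / spans[n]) ** 2 for n in FEATURES)
        )

    return sorted(rows, key=distance)[:top_n]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
query = {"light": 110.0, "temperature": 26.0, "ph": 7.2, "days": 10.0}
matches = closest_matches(query, rows)
for row in matches:
    print(row["doi"], row["species"])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened doi.csv, and the rows could first be filtered to the selected species and medium.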


📂 File Structure