Miyu Horiuchi

External Tool Benchmark Status

This file tracks the apples-to-apples benchmark setup for external tools on the same held-out BacDive/MediaDive strains used by the dry-lab media recommender benchmark.

Held-Out Manifest

  • Manifest: artifacts/external_benchmark_manifest.parquet
  • Rows: 25,728
  • Unique genome accessions: 16,154
  • Media labels retained: 40
  • Fold counts: {"0": 5146, "1": 5146, "2": 5146, "3": 5145, "4": 5145}
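The fold counts above are what an even five-way split of 25,728 rows produces: 25,728 = 5 × 5,145 + 3, so three folds carry one extra row. A minimal sketch of that consistency check follows. `even_fold_sizes` is a hypothetical helper name, and the sketch makes no claim about how the folds were actually assigned.

```python
def even_fold_sizes(n_rows: int, n_folds: int) -> list[int]:
    """Sizes of an even split: the first `remainder` folds get one extra row."""
    base, remainder = divmod(n_rows, n_folds)
    return [base + 1 if i < remainder else base for i in range(n_folds)]


# Fold counts as reported in the manifest summary above.
reported = {"0": 5146, "1": 5146, "2": 5146, "3": 5145, "4": 5145}

expected = even_fold_sizes(25_728, 5)
# Compare sizes only; which fold index holds the extra rows is not checked.
matches = sorted(reported.values(), reverse=True) == expected
total = sum(reported.values())
```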

Label coverage:

| Target | Labeled rows |
| --- | --- |
| Temperature | 25,727 |
| pH | 2,984 |
| Salt | 2,486 |
| Oxygen | 9,283 |
| Medium | 21,050 |
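To put the label counts in proportion, the sketch below divides each count by the 25,728 manifest rows. The numbers are copied from the table above. `coverage` is a hypothetical helper, not a function from this repository.

```python
TOTAL_ROWS = 25_728

# Labeled-row counts copied from the label coverage table.
labeled_rows = {
    "Temperature": 25_727,
    "pH": 2_984,
    "Salt": 2_486,
    "Oxygen": 9_283,
    "Medium": 21_050,
}


def coverage(labeled: dict[str, int], total: int) -> dict[str, float]:
    """Fraction of manifest rows that carry a label, per target."""
    return {target: count / total for target, count in labeled.items()}


cov = coverage(labeled_rows, TOTAL_ROWS)
# Print targets from sparsest to densest.
for target, fraction in sorted(cov.items(), key=lambda kv: kv[1]):
    print(f"{target:<12}{fraction:7.1%}")
```

pH and salt are the sparsest targets, each covering roughly a tenth of the rows, so scores on those targets rest on far fewer strains than temperature or medium.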

Local Requirements

  • FASTA directory: data/external_benchmark_fastas
  • FASTAs present: 8 / 16,154 (0.05%)
  • FASTA download smoke run: {"attempted": 0, "downloaded": 0, "failed": 0}

| Tool | Local command | Status |
| --- | --- | --- |
| GenomeSPOT | `uv run python -m genome_spot.genome_spot` | available |
| CarveMe | `uv run --with carveme carve` | available |
| gapseq | (none) | missing |
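The "FASTAs present" figure can be reproduced with a short directory scan. This is a sketch under stated assumptions: the suffix list is a guess at what the download step writes, and `count_fastas` and `fasta_coverage` are hypothetical names rather than functions from `scripts/42_prepare_external_benchmarks.py`.

```python
from pathlib import Path

# Assumed FASTA suffixes; adjust to whatever the download step actually writes.
FASTA_SUFFIXES = (".fasta", ".fa", ".fna", ".fasta.gz", ".fa.gz", ".fna.gz")


def count_fastas(fasta_dir: Path) -> int:
    """Count files in fasta_dir whose name ends in a known FASTA suffix."""
    if not fasta_dir.is_dir():
        return 0
    return sum(
        1
        for path in fasta_dir.iterdir()
        if path.is_file() and path.name.lower().endswith(FASTA_SUFFIXES)
    )


def fasta_coverage(present: int, expected: int) -> str:
    """Format coverage the way the status line reports it."""
    pct = 100 * present / expected if expected else 0.0
    return f"{present:,} / {expected:,} ({pct:.2f}%)"


# Example: count_fastas(Path("data/external_benchmark_fastas")) would feed
# the first argument; 16,154 is the unique-accession count from the manifest.
status_line = fasta_coverage(8, 16_154)
```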

Verdict

External baseline execution is not ready on this machine yet: only 8 of the 16,154 held-out FASTAs are present, and one or more external tool binaries or databases are missing (gapseq has no local command).

Next Commands

Use the manifest to run each external tool against the same rows and folds. Score the medium-feasibility tools by whether at least one known MediaDive medium for a strain is feasible, or is the closest match, among the tool's predicted feasible media or metabolite environments.

PYTHONPATH=src uv run --python 3.11 python scripts/42_prepare_external_benchmarks.py \
  --download-fastas 10
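The scoring rule for medium-feasibility tools can be sketched as a per-strain hit test. This is one possible reading, not the benchmark's actual scorer. It assumes each tool's output has already been mapped to MediaDive medium identifiers, and that "closest" means the top entry of a tool-provided ranking. `medium_hit`, `hit_rate`, and the medium names are all hypothetical.

```python
from typing import Iterable, Optional, Sequence


def medium_hit(
    known: Iterable[str],
    feasible: Iterable[str],
    ranked: Optional[Sequence[str]] = None,
) -> bool:
    """True if a known medium is predicted feasible, or is the closest match."""
    known_set = set(known)
    if known_set & set(feasible):
        return True
    # Fallback (assumption): the top-ranked medium counts as "closest".
    return bool(ranked) and ranked[0] in known_set


def hit_rate(hits: Iterable[bool]) -> float:
    """Fraction of strains scored as hits; 0.0 for an empty input."""
    hits = list(hits)
    return sum(hits) / len(hits) if hits else 0.0


# Toy strains with placeholder medium names, not real MediaDive IDs.
strains = [
    {"known": {"medium_A", "medium_B"}, "feasible": {"medium_B"}, "ranked": ["medium_B"]},
    {"known": {"medium_C"}, "feasible": set(), "ranked": ["medium_C", "medium_A"]},
    {"known": {"medium_D"}, "feasible": {"medium_A"}, "ranked": ["medium_A"]},
]
hits = [medium_hit(s["known"], s["feasible"], s["ranked"]) for s in strains]
rate = hit_rate(hits)
```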

For the full benchmark, download the complete FASTA set into the FASTA directory above, install the external tools plus their databases, then run tool-specific inference using the bacdive_id, fold, and genome_accession columns from the manifest.
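Because the manifest has 25,728 rows but only 16,154 unique genome accessions, a genome-level tool only needs to run once per accession, with results joined back to rows by fold afterwards. The sketch below shows that planning step on plain dictionaries. `plan_inference` is a hypothetical helper and the accessions are placeholders. Loading the parquet manifest (for example with `pandas.read_parquet(...).to_dict("records")`) is left outside the sketch.

```python
from collections import defaultdict
from typing import Iterable, Mapping


def plan_inference(rows: Iterable[Mapping]) -> tuple[list[str], dict]:
    """Return the unique accessions to run, and (bacdive_id, accession) pairs per fold."""
    by_fold = defaultdict(list)
    accessions = set()
    for row in rows:
        accession = row["genome_accession"]
        accessions.add(accession)
        by_fold[row["fold"]].append((row["bacdive_id"], accession))
    return sorted(accessions), dict(by_fold)


# Toy rows using the manifest's column names; all values are placeholders.
rows = [
    {"bacdive_id": 1, "fold": 0, "genome_accession": "ACC_0001"},
    {"bacdive_id": 2, "fold": 0, "genome_accession": "ACC_0002"},
    {"bacdive_id": 3, "fold": 1, "genome_accession": "ACC_0001"},
]
to_run, per_fold = plan_inference(rows)
```

Here `to_run` holds two accessions for three rows, so the tool would be invoked twice and its outputs reused for the row in fold 1.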