MidAtBest committed on
Commit 2cd0864 · 1 Parent(s): 4c80381

fix: fix typo in description

Files changed (1)
  1. src/streamlit_app.py +2 -3
src/streamlit_app.py CHANGED
@@ -138,17 +138,16 @@ and functional-regulatory prediction, which includes diverse experimental tracks
 and translation (Ribo-seq).
 
 Data are drawn from a phylogenetically diverse set of species, including organisms seen during post-training
-(human, chicken, Arabidopsis, rice, maize) and entirely unseen species (cattle, tomato), with careful curation
+(human, chicken, arabidopsis, rice, maize) and entirely unseen species (cattle, tomato), with careful curation
 to avoid data leakage. This design allows the dataset to probe long-range sequence-to-function mapping,
 cross-species generalization, and transfer across heterogeneous regulatory modalities,
 including assays not present in prior multispecies training corpora. By standardizing sequence length,
-resolution, and evaluation metrics across all tracks, \brandbenchmark provides a controlled dataset
+resolution, and evaluation metrics across all tracks, the NTv3 Benchmark provides a controlled dataset
 for comparing representation quality across genomic foundation models.
 
 The metrics used are:
 - **Pearson correlations (multi-assay)**: per-dataset scores across species and models for functional tracks.
 - **MCC (bed tracks)**: per-track MCC values across species and models for gene annotation tracks.
-
 """
 
 HERE = os.path.dirname(os.path.abspath(__file__)) # /app/src
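The description touched by this diff names the two benchmark metrics: Pearson correlation for functional (signal) tracks and MCC for gene-annotation (bed) tracks. As a minimal illustrative sketch only, not code from this commit or from src/streamlit_app.py, these scores are conventionally computed with SciPy and scikit-learn as below; the function and variable names are hypothetical.

```python
# Hypothetical sketch of the two metrics named in the description above;
# not part of this commit or of the Streamlit app.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import matthews_corrcoef


def score_functional_track(predicted: np.ndarray, observed: np.ndarray) -> float:
    """Pearson correlation between predicted and observed track signal."""
    r, _p_value = pearsonr(predicted, observed)
    return float(r)


def score_annotation_track(predicted_labels: np.ndarray, true_labels: np.ndarray) -> float:
    """Matthews correlation coefficient for binary per-position annotations."""
    return float(matthews_corrcoef(true_labels, predicted_labels))


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic functional track: noisy prediction of a continuous signal.
    signal = rng.normal(size=1000)
    noisy_prediction = signal + rng.normal(scale=0.5, size=1000)
    print("Pearson r:", score_functional_track(noisy_prediction, signal))

    # Synthetic annotation track: binary labels with ~10% of positions flipped.
    labels = rng.integers(0, 2, size=1000)
    flipped = np.where(rng.random(1000) < 0.1, 1 - labels, labels)
    print("MCC:", score_annotation_track(flipped, labels))
```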