damlab/HIV_BERT

damlab committed on Feb 23, 2022

Commit 1801ae2

1 Parent(s): ad000e8

Update README.md

Files changed (1)

README.md

@@ -5,7 +5,6 @@ license: mit
 # Model Card for [HIV-BERT]
 ## Table of Contents
-- [Table of Contents](#table-of-contents)
 - [Summary](#model-summary)
 - [Model Description](#model-description)
 - [Intended Uses & Limitations](#intended-uses-&-limitations)
@@ -23,7 +22,7 @@ license: mit
 ## Model Description
-[Like the original ProtBert-BFD model, this model encodes each amino acid as an individual token. This model was trained using Masked Language Modeling: a process in which a random set of tokens are masked with the model trained on their prediction. This model was trained using the damlab/hiv_flt dataset with 256 amino acid chunks and a 15% mask rate.]
 ## Intended Uses & Limitations
@@ -35,7 +34,7 @@ license: mit
 ## Training Data
-[The dataset damlab/HIV_FLT was used to refine the original rostlab/Prot-bert-bfd. This dataset contains 1790 full HIV genomes from across the globe. When translated, these genomes contain approximately 3.9 million amino-acid tokens.]
 ## Training Procedure

 # Model Card for [HIV-BERT]
 ## Table of Contents
 - [Summary](#model-summary)
 - [Model Description](#model-description)
 - [Intended Uses & Limitations](#intended-uses-&-limitations)
 ## Model Description
+[Like the original ProtBert-BFD model, this model encodes each amino acid as an individual token. This model was trained using Masked Language Modeling: a process in which a random set of tokens are masked with the model trained on their prediction. This model was trained using the damlab/hiv-flt dataset with 256 amino acid chunks and a 15% mask rate.]
 ## Intended Uses & Limitations
 ## Training Data
+[The dataset damlab/HIV-FLT was used to refine the original rostlab/Prot-bert-bfd. This dataset contains 1790 full HIV genomes from across the globe. When translated, these genomes contain approximately 3.9 million amino-acid tokens.]
 ## Training Procedure
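
The Model Description touched by this commit mentions per-amino-acid tokens, 256 amino acid chunks, and a 15% mask rate. The following is a minimal sketch of what that preprocessing could look like, not the authors' training code. The helper names `chunk_sequence` and `mask_tokens`, the `[MASK]` string, and the toy sequence are assumptions made for illustration. Masking here selects a fixed 15% of positions; BERT-style training pipelines usually sample positions per token and sometimes substitute random tokens instead of the mask.

```python
import random

MASK_TOKEN = "[MASK]"  # assumption: BERT-style mask token string


def chunk_sequence(seq, size=256):
    """Split an amino-acid string into consecutive chunks of at most `size` residues."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def mask_tokens(tokens, rate=0.15, rng=None):
    """Mask roughly `rate` of the tokens.

    Returns the masked token list and a dict mapping each masked
    position to the original token the model would have to predict.
    """
    rng = rng or random.Random()
    if not tokens:
        return [], {}
    n_mask = min(len(tokens), max(1, round(len(tokens) * rate)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_TOKEN if i in positions else t for i, t in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}
    return masked, labels


# Toy input (hypothetical, not taken from the dataset): a short peptide repeated.
seq = "MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERF" * 12

chunks = chunk_sequence(seq)       # 256-residue chunks, last one may be shorter
tokens = list(chunks[0])           # one token per amino acid
masked, labels = mask_tokens(tokens, rng=random.Random(0))

# ProtBert-style models typically expect residues separated by spaces.
model_input = " ".join(masked)
```

Passing a seeded `random.Random` keeps the masked positions reproducible, which is convenient when comparing runs.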