Update README.md
README.md
CHANGED
@@ -41,7 +41,8 @@ the `rescaling_factor` of the Rotary Embedding layer in the esm model `num_dna_
 (i.e 6669 for a sequence of 40008 base pairs) and `max_num_tokens_nt` is the max number of tokens on which the backbone nucleotide-transformer was trained on, i.e `2048`.
 
 [](https://colab.research.google.com/#fileId=https%3A//huggingface.co/InstaDeepAI/segment_nt/blob/main/inference_segment_nt.ipynb)
-The `./inference_segment_nt.ipynb` can be run in Google Colab by clicking on the icon and shows how to
+The `./inference_segment_nt.ipynb` can be run in Google Colab by clicking on the icon and shows how to handle inference on sequence lengths require changing
+the rescaling factor and sequence lengths that do not. One can run the notebook and reproduce Fig.1 and Fig.3 from the SegmentNT paper.
 
 ```python
 # Load model and tokenizer