Update README.md
README.md CHANGED
@@ -34,12 +34,6 @@ model-index:
 <div align="center">
 
 
-<p align="center" width="100%">
-<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
-</p>
-
-
-
 <table>
 
 <tr>
@@ -73,6 +67,7 @@ model-index:
 </div>
 
 
+
 ## TLDR
 This repository contains the research preview of **LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more**.
 
@@ -89,6 +84,10 @@ LongLLaMA Code is built upon the foundation of [Code Llama](https://huggingface.
 with three layers used for context extension. **Crucially, LongLLaMA is able to extrapolate much beyond the context length seen in training: 8k. E.g., in the passkey retrieval task, it can handle inputs of length 256k**.
 **LongLLaMA Code** is a [Code Llama](https://huggingface.co/codellama/CodeLlama-7b-hf) model finetuned with the FoT method.
 
+<p align="center" width="100%">
+<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
+</p>
+
 
 <div align="center">
 