Update README.md
README.md
CHANGED
@@ -34,6 +34,9 @@ model-index:
 <div align="center">
 
 
+<p align="center" width="100%">
+<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
+</p>
 
 
 
@@ -69,13 +72,13 @@ model-index:
 
 </div>
 
+
 ## TLDR
 This repository contains the research preview of **LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more**.
 
 LongLLaMA is built upon the foundation of [OpenLLaMA](https://github.com/openlm-research/open_llama) and fine-tuned using the [Focused Transformer (FoT)](https://arxiv.org/abs/2307.03170) method.
 LongLLaMA Code is built upon the foundation of [Code Llama](https://huggingface.co/codellama/CodeLlama-7b-hf).
 
-
 ## Overview
 
 ### Base models
@@ -98,6 +101,10 @@ with three layers used for context extension. **Crucially, LongLLaMA is able to
 
 </div>
 
+## Results
+
+
+
 
 ## Usage
 