Update README.md
README.md
CHANGED
@@ -17,6 +17,8 @@ model-index:
 
 # Suzume
 
+[[Paper](https://arxiv.org/abs/2405.12612)] [[Dataset](https://huggingface.co/datasets/lightblue/tagengo-gpt4)]
+
 This is Suzume 8B, a multilingual finetune of Llama 3 ([meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)).
 
 Llama 3 has exhibited excellent performance on many English language benchmarks.
@@ -262,6 +264,21 @@ The following hyperparameters were used during training:
 - Datasets 2.18.0
 - Tokenizers 0.15.0
 
+# How to cite
+
+Please cite [this paper](https://arxiv.org/abs/2405.12612) when referencing this model.
+
+```tex
+@misc{devine2024tagengo,
+      title={Tagengo: A Multilingual Chat Dataset},
+      author={Peter Devine},
+      year={2024},
+      eprint={2405.12612},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+
 # Developer
 
 Peter Devine - ([ptrdvn](https://huggingface.co/ptrdvn))