Update README.md
README.md CHANGED

@@ -2,39 +2,36 @@
 base_model: google/gemma-2-2B
 license: cc-by-nc-sa-4.0
 language:
-…
+- de
+- nl
+- is
+- es
+- fr
+- pt
+- uk
+- hi
+- zh
+- ru
+- cs
+- ko
+- ja
+- it
+- en
+- da
+- pl
+- hu
+- sv
+- 'no'
+- ro
+- fi
 library_name: transformers
-datasets:
-- Widn/TowerBlocks-v4-250205
-- Widn/TowerDPO-v4-anthill-250227
 ---
 
 # Model Description:
 
-**Tower…
+**Tower+ 2B** is built on top of Gemma 2 2B. The model goes through Continuous Pretraining (CPT), Instruction Tuning (IT), Weighted Preference Optimization (WPO), and GRPO with verifiable rewards. During all stages, we include parallel and multilingual data covering 22 languages.
 
-This approach makes Tower…
+This approach makes Tower+ 2B one of the best multilingual LLMs under 3B parameters.
 
 - **Developed by:** Widn
 - **Model type:** A 2B parameter model fine-tuned on a mix of _translation-related tasks_ as well as _general instruction-following_ datasets that include reasoning, code instructions, etc.

@@ -80,7 +77,7 @@ sampling_params = SamplingParams(
     temperature=0,
     max_tokens=8192,
 )
-llm = LLM(model="…
+llm = LLM(model="Unbabel/Tower-Plus-2B", tensor_parallel_size=1)
 messages = [{"role": "user", "content": "Translate: Hello, world! into Portuguese."}]
 outputs = llm.chat(messages, sampling_params)
 # Make sure your prompt_token_ids look like this
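
For reference, the changed vLLM snippet assembles into the following self-contained script. The model id, sampling settings, and prompt come from the hunk above; the rest is standard vLLM API, so treat this as a sketch rather than the card's canonical example.

```python
# Minimal sketch: greedy translation with Tower-Plus-2B via vLLM.
from vllm import LLM, SamplingParams

sampling_params = SamplingParams(
    temperature=0,   # greedy decoding, as in the README snippet
    max_tokens=8192,
)
llm = LLM(model="Unbabel/Tower-Plus-2B", tensor_parallel_size=1)

messages = [{"role": "user", "content": "Translate: Hello, world! into Portuguese."}]
outputs = llm.chat(messages, sampling_params)

# llm.chat applies the model's chat template; inspecting prompt_token_ids
# is how you verify the formatted prompt looks as the card expects.
print(outputs[0].prompt_token_ids)
print(outputs[0].outputs[0].text)
```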

@@ -97,10 +94,10 @@ print (outputs[0].outputs[0].text)
 import torch
 from transformers import pipeline
 
-pipe = pipeline("text-generation", model="…
+pipe = pipeline("text-generation", model="Unbabel/Tower-Plus-2B", device_map="auto")
 # We use the tokenizer’s chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
 messages = [{"role": "user", "content": "Translate: Hello, world! into Portuguese."}]
 input_ids = pipe.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
 outputs = pipe(messages, max_new_tokens=256, do_sample=False)
 print(outputs[0]["generated_text"])
-```
+```
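
Likewise, the changed transformers snippet condenses into the runnable sketch below. Note that in the hunk above, `input_ids` is computed but never passed anywhere; the pipeline applies the chat template itself when given a list of messages, so the sketch drops that line. The `torch_dtype` argument is an assumption, not part of the card.

```python
# Minimal sketch: greedy translation with Tower-Plus-2B via the transformers
# pipeline, which applies the model's chat template to `messages` on its own.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Unbabel/Tower-Plus-2B",
    device_map="auto",
    torch_dtype=torch.bfloat16,  # assumption: bf16 to cut memory; omit for full precision
)
messages = [{"role": "user", "content": "Translate: Hello, world! into Portuguese."}]
outputs = pipe(messages, max_new_tokens=256, do_sample=False)

# generated_text holds the whole conversation; the model's reply is the last turn.
print(outputs[0]["generated_text"][-1]["content"])
```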
|