davda54 committed on
Commit 130b9ab · verified · 1 Parent(s): e90a2b0

Update README.md

Files changed (1):
  1. README.md +17 -5
README.md CHANGED

@@ -20,6 +20,9 @@ license: apache-2.0
 <img src="https://huggingface.co/ltg/norbert3-base/resolve/main/norbert.png" width=12.5%>
 
 The fourth generation of NorBERT models mainly improves their efficiency, but also performance and flexibility.
+
+<img src="https://huggingface.co/ltg/norbert4-xlarge/resolve/main/model_performance.png" width=100%>
+
 - **Made to encode long texts**: these models were trained on 16384-token-long texts, the sliding-window attention can then generalize to even longer sequences.
 - **Fast and memory-efficient training and inference**: using FlashAttention2 with unpadding, the new generation of NorBERT models can process the long texts with ease.
 - **Better performance**: better quality of training corpora and carefully tuned training settings leads to an improved performance over NorBERT 3.
@@ -30,7 +33,6 @@ The fourth generation of NorBERT models mainly improves their efficiency, but al
 > [!TIP]
 > We recommend installing Flash Attention 2 and `torch.compile`-ing your models to get the highest training and inference efficiency.
 
-<img src="https://huggingface.co/ltg/norbert4-xlarge/resolve/main/model_performance.png" width=100%>
 
 
 ## All sizes of the NorBERT4 family:
@@ -50,8 +52,13 @@ import torch
 from transformers import AutoTokenizer, AutoModelForMaskedLM
 
 # Import model
-tokenizer = AutoTokenizer.from_pretrained("ltg/norbert4-xlarge")
-model = AutoModelForMaskedLM.from_pretrained("ltg/norbert4-xlarge", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(
+    "ltg/norbert4-xlarge"
+)
+model = AutoModelForMaskedLM.from_pretrained(
+    "ltg/norbert4-xlarge",
+    trust_remote_code=True
+)
 
 # Tokenize text (with a mask token inside)
 input_text = tokenizer(
@@ -83,8 +90,13 @@ import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
 # Import model
-tokenizer = AutoTokenizer.from_pretrained("ltg/norbert4-xlarge")
-model = AutoModelForCausalLM.from_pretrained("ltg/norbert4-xlarge", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(
+    "ltg/norbert4-xlarge"
+)
+model = AutoModelForCausalLM.from_pretrained(
+    "ltg/norbert4-xlarge",
+    trust_remote_code=True
+)
 
 # Define zero-shot translation prompt template
 prompt = """Engelsk: {0}
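The "made to encode long texts" bullet in this README relies on sliding-window attention generalizing past the 16384-token training length. A minimal sketch of how such a banded attention mask is typically built (the window size here is illustrative, not NorBERT4's actual configuration):

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a bidirectional sliding-window attention mask.

    mask[i][j] is True when token i may attend to token j, i.e. when
    |i - j| is within half the window. Hypothetical helper for
    illustration; not NorBERT4's actual implementation.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)]
            for i in range(seq_len)]

# Each token sees only a fixed-size neighbourhood, so per-token cost stays
# constant as the sequence grows -- which is why the pattern can generalize
# beyond the training length.
mask = sliding_window_mask(seq_len=8, window=4)
print(sum(mask[0]))  # → 3 (token 0 sees tokens 0..2, clipped at the edge)
print(sum(mask[4]))  # → 5 (token 4 sees tokens 2..6, a full window)
```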
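The masked-LM snippet in the diff is cut off after `input_text = tokenizer(`. The usual readout after the forward pass is to take the argmax of the logits at the mask position; the sketch below shows that step with a mocked toy vocabulary and mocked scores in place of a real model call (all names and numbers are made up for illustration):

```python
# Toy vocabulary and scores standing in for a real tokenizer and model.
vocab = ["fjord", "fjell", "hovedstaden", "[MASK]"]

def fill_mask(token_ids: list[int], mask_id: int,
              logits: list[list[float]]) -> int:
    """Return the vocabulary index with the highest score at the mask position."""
    mask_pos = token_ids.index(mask_id)
    scores = logits[mask_pos]
    return max(range(len(scores)), key=scores.__getitem__)

token_ids = [0, 3, 1]          # "fjord [MASK] fjell" (toy sequence)
mock_logits = [
    [0.1, 0.2, 0.1, 0.0],
    [0.3, 0.1, 2.5, 0.0],      # mask position: "hovedstaden" scores highest
    [0.2, 0.4, 0.1, 0.0],
]
best = fill_mask(token_ids, mask_id=3, logits=mock_logits)
print(vocab[best])             # → hovedstaden
```

With the real model, the same argmax would be taken over `model(**input_text).logits` at the mask token's position.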
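The zero-shot translation prompt in the diff is truncated after its first line (`Engelsk:` is Norwegian for "English"). The sketch below shows how such a template is filled with `str.format`; the second line of the template is an assumption added for illustration, not the actual NorBERT4 prompt:

```python
# The template's continuation below ("Bokmål:") is hypothetical -- the diff
# only shows the first line of the real prompt.
prompt = """Engelsk: {0}
Bokmål:"""

source = "The weather in Oslo is nice today."
filled = prompt.format(source)
print(filled)
# Engelsk: The weather in Oslo is nice today.
# Bokmål:
```

The model would then be asked to continue generating after the final label, producing the translation.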