Summarization
Transformers
TensorBoard
Safetensors
Arabic
m2m_100
text2text-generation
Dialects Conversion
Text Correction
Punctuating
Diacritization
En-Ar Translation
Instructions to use HamzaNaser/Dialects-to-MSA-Transformer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use HamzaNaser/Dialects-to-MSA-Transformer with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("summarization", model="HamzaNaser/Dialects-to-MSA-Transformer")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("HamzaNaser/Dialects-to-MSA-Transformer")
model = AutoModelForSeq2SeqLM.from_pretrained("HamzaNaser/Dialects-to-MSA-Transformer")
- Notebooks
- Google Colab
- Kaggle
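The direct-loading snippet stops once the tokenizer and model exist. As a minimal sketch of the next step, the helper below runs one string through a seq2seq model with the standard `generate` / `batch_decode` calls. The names `load_model` and `convert` and the `max_new_tokens=128` default are illustrative assumptions, not part of the repository, and the model may expect preprocessing or language settings that this sketch does not cover.

```python
def load_model(repo_id="HamzaNaser/Dialects-to-MSA-Transformer"):
    """Hypothetical helper: load the tokenizer and model from the Hub."""
    # Imported lazily so convert() can be used or tested without transformers.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)
    return tokenizer, model


def convert(text, tokenizer, model, max_new_tokens=128):
    """Hypothetical helper: generate one output string for one input string."""
    # Tokenize into PyTorch tensors (input_ids, attention_mask).
    inputs = tokenizer(text, return_tensors="pt")
    # max_new_tokens is an assumed limit; adjust it to the expected text length.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop special tokens such as <s>, </s> and <pad> from the decoded text.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Typical use would be `tokenizer, model = load_model()` followed by `convert(some_text, tokenizer, model)`; the first call downloads the weights from the Hub.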
Upload tokenizer
- special_tokens_map.json +35 -5
special_tokens_map.json
CHANGED
@@ -101,9 +101,39 @@
     "__zh__",
     "__zu__"
   ],
-  "bos_token":
-
-
-
-
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
 }
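The entries added in this commit store each special token as an object with a `content` field plus stripping and normalization flags. The sketch below shows one way to read such a map and list the token strings. It assumes an entry may be either a plain string or an object of this shape, the `additional_special_tokens` list is left out of the fragment, and the helper names `token_content` and `summarize` are hypothetical.

```python
import json

# Fragment mirroring the five entries added in this commit.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": {"content": "<pad>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "sep_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
""")


def token_content(entry):
    """Return the token string for a plain-string or object-style entry."""
    if isinstance(entry, str):
        return entry
    return entry["content"]


def summarize(token_map):
    """Map each named special token to its string, skipping list entries."""
    return {
        name: token_content(entry)
        for name, entry in token_map.items()
        if not isinstance(entry, list)
    }


contents = summarize(SPECIAL_TOKENS_MAP)
for name, token in contents.items():
    print(f"{name}: {token}")
```

In this fragment `eos_token` and `sep_token` both resolve to `</s>`, which matches the diff.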