lbourdois committed on
Commit 60462d4 · verified · 1 Parent(s): c994f81

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.

Files changed (1)
  1. README.md +40 -28
README.md CHANGED
@@ -1,29 +1,41 @@
- ---
- license: apache-2.0
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-0.5B
- datasets:
- - alamios/DeepSeek-R1-Distill-Qwen-32B-Conversations
- pipeline_tag: text-generation
- library_name: transformers
- tags:
- - qwen
- - qwen2.5
- - deepseek
- ---
-
- # DeepSeek-R1-DRAFT-Qwen2.5-0.5B
-
- **Updated to v1**
-
- This model is trained on outputs of <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B">deepseek-ai/DeepSeek-R1-Distill-Qwen-32B</a> and is meant to be used only as a draft model for speculative decoding.
-
- It's specifically intended for users of 3090/4090 GPUs, allowing you to run the DeepSeek-R1-Distill-Qwen-32B-Q4_K_M GGUF version with 16k context and speed up generation without sacrificing more context length or model quality.
-
- # Data info
-
- The data consists of code, math, reasoning and general knowledge tasks collected from various datasets. The model has been trained for 2 epochs on 7k unique examples, for a total of 26 million tokens per epoch.
-
+ ---
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-0.5B
+ datasets:
+ - alamios/DeepSeek-R1-Distill-Qwen-32B-Conversations
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - qwen
+ - qwen2.5
+ - deepseek
+ ---
+
+ # DeepSeek-R1-DRAFT-Qwen2.5-0.5B
+
+ **Updated to v1**
+
+ This model is trained on outputs of <a href="https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B">deepseek-ai/DeepSeek-R1-Distill-Qwen-32B</a> and is meant to be used only as a draft model for speculative decoding.
+
+ It's specifically intended for users of 3090/4090 GPUs, allowing you to run the DeepSeek-R1-Distill-Qwen-32B-Q4_K_M GGUF version with 16k context and speed up generation without sacrificing more context length or model quality.
+
+ # Data info
+
+ The data consists of code, math, reasoning and general knowledge tasks collected from various datasets. The model has been trained for 2 epochs on 7k unique examples, for a total of 26 million tokens per epoch.
+
  Since data generation was done using spare GPU time, I may publish a further trained version later.
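
For reference, a minimal sketch (not part of this PR's diff) of how a draft model like this is typically paired with its 32B target via transformers' assisted generation. The draft repo id, dtype settings, and prompt below are assumptions for illustration:

```python
# Minimal sketch of speculative (assisted) decoding in transformers:
# the 0.5B draft proposes tokens that the 32B target verifies in parallel.
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
draft_id = "alamios/DeepSeek-R1-DRAFT-Qwen2.5-0.5B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(
    target_id, torch_dtype="auto", device_map="auto"
)
# The draft (assistant) model must share the target's tokenizer/vocabulary,
# which holds here since both are Qwen2.5-based.
draft = AutoModelForCausalLM.from_pretrained(
    draft_id, torch_dtype="auto", device_map="auto"
)

inputs = tokenizer(
    "Explain speculative decoding in one paragraph.", return_tensors="pt"
).to(target.device)

# Passing assistant_model switches generate() to assisted generation.
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```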