Add pipeline tag, library name, and paper link to model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -1,7 +1,15 @@
 ---
+datasets:
+- turkish-nlp-suite/Havadis
+- turkish-nlp-suite/temiz-OSCAR
+- wikimedia/wikipedia
 language:
 - tr
 license: apache-2.0
+metrics:
+- perplexity
+library_name: transformers
+pipeline_tag: text-generation
 tags:
 - turkish
 - diffusion
@@ -9,30 +17,24 @@ tags:
 - non-autoregressive
 - foundation-model
 - dllm
-datasets:
-- turkish-nlp-suite/Havadis
-- turkish-nlp-suite/temiz-OSCAR
-- wikimedia/wikipedia
-metrics:
-- perplexity
 ---
 
 # DiffutronLM-0.3B-Base
 
 **DiffutronLM-0.3B-Base** is the foundational Masked Diffusion Language Model (MDLM) of the Diffutron series, tailored specifically for the Turkish language.
 
-This model represents the completion of the **Continual Pre-training (CPT)** phase. It has successfully adapted the multilingual representations of its backbone to the agglutinative complexity and morphological nuances of Turkish.
+This model is presented in the paper [Diffutron: A Masked Diffusion Language Model for Turkish Language](https://huggingface.co/papers/2603.20466). It represents the completion of the **Continual Pre-training (CPT)** phase, where it successfully adapted the multilingual representations of its backbone to the agglutinative complexity and morphological nuances of Turkish.
 
 ⚠️ **Note:** This is a base foundation model. It has **not** been instruction-tuned or aligned for chat capabilities. If you are looking for a model that follows prompts and answers questions, please use `DiffutronLM-0.3B-Instruct`.
 
 ## 📌 Model Details
 
 * **Model Type:** Masked Diffusion Language Model (MDLM) Base
-* **Base Architecture:** `jhu-clsp/mmBERT-base` (Multilingual Encoder)
+* **Base Architecture:** `jhu-clsp/mmBERT-base` (ModernBERT-based architecture)
 * **Language:** Turkish
 * **Parameter Count:** 307M (0.3B)
 * **Context Length:** 512 tokens
-* **Training Libraries:** `dllm`, PyTorch
+* **Training Libraries:** `dllm`, PyTorch, `transformers`
 * **Status:** Foundation / Base Model (Post-CPT)
 
 ## 🚀 Architecture & Continual Pre-training (CPT)
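For context on the `non-autoregressive` / `dllm` tags in the metadata above: a masked diffusion language model generates text by starting from an all-`[MASK]` sequence and iteratively revealing the positions it is most confident about, rather than decoding left to right. The sketch below illustrates that confidence-based unmasking schedule with a dummy scorer in place of the real model; the function and token names are illustrative only, not part of the Diffutron or `dllm` APIs.

```python
import random

MASK = "[MASK]"

def toy_denoise(length, steps, score_fn):
    """Toy masked-diffusion decoding loop: at each step, ask the scorer
    for a (token, confidence) proposal at every still-masked position,
    then reveal the highest-confidence positions first."""
    seq = [MASK] * length
    per_step = max(1, length // steps)  # positions revealed per step
    for _ in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        proposals = {i: score_fn(seq, i) for i in masked}
        # Confidence-based schedule: commit the most certain positions.
        for i in sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:per_step]:
            seq[i] = proposals[i][0]
    return seq

# Dummy scorer: proposes a placeholder token with random confidence.
rng = random.Random(0)
out = toy_denoise(8, steps=4, score_fn=lambda seq, i: (f"tok{i}", rng.random()))
print(out)  # every position is unmasked after the final step
```

In a real MDLM, `score_fn` would be one forward pass of the model over the partially masked sequence, so all masked positions are scored in parallel; that parallelism is what makes the decoding non-autoregressive.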