serda-dev committed on
Commit 9bee494 · verified · 1 Parent(s): 98057ae

Update README.md

Files changed (1): README.md (+20 −16)

README.md CHANGED
@@ -16,27 +16,31 @@ tags:
  - continued-pretraining
  ---

- ## Notice / Announcement (13.02.2026 11PM (UTC+3))

- Please read this notice regarding a few known issues affecting **both** `mamba-130m-hf-turkish` and `mamba-370m-hf-turkish`.
- This announcement is **identical for both repositories**, so you do **not** need to check the other model’s repo separately.

- ### Known issue: text generation behavior (130M)
- - Due to an **embedding-related incompatibility** in `mamba-130m-hf-turkish`, the current **text generation** functionality may behave **buggily** (unstable or incorrect outputs).
- - This issue has been **fixed** in `mamba-370m-hf-turkish`, and we will apply the same fix to `mamba-130m-hf-turkish` **as soon as possible**.

- ### Current model quality and roadmap
- - Despite the dataset limitations, both models generally produce **good Turkish surface form** (usage patterns, grammar alignment, and fluency).
- - However, there are still **logical / contextual consistency** issues (reasoning coherence, long-range consistency, factual reliability, etc.).
- - We will **keep the current approach** and continue improving the dataset pipeline (ongoing web scraping + cleaning).
- - When `mamba-2.8b-hf-turkish` is ready, we plan to **retrain and re-release** the full set of Turkish checkpoints together using the improved dataset.

- ### In the meantime
- - Until that release, we recommend using these models primarily by **fine-tuning** them for your specific tasks.
- - Please don’t hesitate to **report additional issues** (generation bugs, tokenizer/embedding mismatches, edge cases, reproducibility problems, etc.).

- ---
- ---
 
+ ## ⚠️ Notice / Duyuru (applies to both repos / iki repo için geçerli)
+ #### (13.02.2026 11PM (UTC+3))

+ > **EN:** This announcement is identical for **`mamba-130m-hf-turkish`** and **`mamba-370m-hf-turkish`** — you don’t need to check the other repository separately.
+ > **TR:** Bu duyuru **`mamba-130m-hf-turkish`** ve **`mamba-370m-hf-turkish`** için aynıdır; diğer repoyu ayrıca kontrol etmenize gerek yoktur.

+ <details>
+ <summary><b>EN Details</b></summary>
+
+ Due to an **embedding-related incompatibility** in `mamba-130m-hf-turkish`, the current **text generation** behavior may be **buggy** (unstable or inconsistent outputs). This issue has been **fixed** in `mamba-370m-hf-turkish`, and we will port the same fix to `mamba-130m-hf-turkish` **as soon as possible**.
+
+ Overall, Turkish fluency and grammar are generally solid, but **logical/contextual consistency** issues remain because of current dataset limitations. We are continuing to improve the dataset pipeline (ongoing web scraping and cleaning). When `mamba-2.8b-hf-turkish` is ready, we plan to **retrain and re-release** the Turkish checkpoints together using the improved dataset. Until then, we recommend using these models mainly via **fine-tuning**, and we appreciate any additional bug reports.
+
+ </details>

+ <details>
+ <summary><b>TR — Detaylar</b></summary>
+
+ `mamba-130m-hf-turkish` modelinde **embedding tarafındaki bir uyumsuzluk** nedeniyle mevcut **text generation** davranışı zaman zaman **buglu** çalışabiliyor (çıktılar tutarsız/kararsız olabiliyor). Bu sorun `mamba-370m-hf-turkish` modelinde **çözüldü** ve aynı düzeltmeyi `mamba-130m-hf-turkish` reposuna da **en kısa sürede** aktaracağız.
+
+ Genel olarak Türkçe akıcılık ve gramer tarafı iyi; ancak mevcut dataset kısıtları nedeniyle **mantıksal bağlam ve tutarlılık** problemleri hâlâ görülebilir. Dataset hattını (web scrape + temizlik) iyileştirmeye devam ediyoruz. `mamba-2.8b-hf-turkish` hazır olduğunda, geliştirilmiş dataset ile Türkçe checkpoint’leri **birlikte yeniden eğitip yeniden yayınlamayı** planlıyoruz. O zamana kadar modelleri ağırlıklı olarak **fine-tune ederek** kullanmanızı öneririz; ek hataları bildirmekten çekinmeyin.
+
+ </details>

  ---

  # Turkish Continued Pretraining of `mamba-130m-hf`
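For context on the "embedding-related incompatibility" described in the notice: a common cause of that symptom is a tokenizer whose vocabulary outgrew the model's embedding matrix, so generation indexes past the embedding table or samples from untrained rows. The sketch below illustrates only the general mechanics, using a tiny randomly initialized GPT-2 model as a stand-in (the actual cause and fix in these repositories are not documented on this page, so every name and number here is illustrative):

```python
# Illustrative only: a tiny randomly initialized GPT-2 model stands in for the
# Mamba checkpoints; the real repos' cause and fix are not specified in the notice.
from transformers import GPT2Config, GPT2LMHeadModel

# A model whose embedding matrix covers exactly 100 token ids.
config = GPT2Config(vocab_size=100, n_embd=32, n_layer=1, n_head=2)
model = GPT2LMHeadModel(config)
assert model.get_input_embeddings().weight.shape[0] == 100

# If the tokenizer were extended to 120 ids without touching the model,
# ids 100..119 would fall outside the embedding table. The standard repair
# is to resize the model's embeddings to match the tokenizer:
model.resize_token_embeddings(120)
assert model.get_input_embeddings().weight.shape[0] == 120
```

Note that the newly added embedding rows are freshly initialized, which is one reason generation can remain unstable until those rows are trained — consistent with the notice's advice to use the checkpoints mainly via fine-tuning.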