Instructions to use dataslab/DSLM-LST-9B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use dataslab/DSLM-LST-9B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="dataslab/DSLM-LST-9B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("dataslab/DSLM-LST-9B")
model = AutoModelForImageTextToText.from_pretrained("dataslab/DSLM-LST-9B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use dataslab/DSLM-LST-9B with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "dataslab/DSLM-LST-9B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "dataslab/DSLM-LST-9B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker
docker model run hf.co/dataslab/DSLM-LST-9B
- SGLang
How to use dataslab/DSLM-LST-9B with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dataslab/DSLM-LST-9B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "dataslab/DSLM-LST-9B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dataslab/DSLM-LST-9B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "dataslab/DSLM-LST-9B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
- Docker Model Runner
How to use dataslab/DSLM-LST-9B with Docker Model Runner:
docker model run hf.co/dataslab/DSLM-LST-9B
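The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_chat_request` and `send_chat_request` are hypothetical, the default port 8000 is taken from the vLLM example, and passing `enable_thinking` through a `chat_template_kwargs` field is an assumption about how the serving stack forwards chat-template options; check your server's documentation before relying on it.

```python
import json
import urllib.request


def build_chat_request(prompt, model="dataslab/DSLM-LST-9B",
                       base_url="http://localhost:8000", enable_thinking=True):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: the server forwards these kwargs to the chat template.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server already running, `send_chat_request(build_chat_request("What is the capital of France?"))` would return the reply text. Building and sending are kept separate so the payload can be inspected without a live server.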
Update README.md
README.md
CHANGED
|
@@ -33,8 +33,8 @@ tags:
|
|
| 33 |
- composite-vision-language
|
| 34 |
---
|
| 35 |
|
| 36 |
-
#
|
| 37 |
-
**
|
| 38 |
The goal is to suppress unwanted Chinese-character generation when the model is used to serve non-Chinese (English / Korean / Japanese etc.) users.
|
| 39 |
The adjustment is intentionally minimal in scope; most of the network — including vision and multimodal components — is preserved bit-identical to the base model.
|
| 40 |
Vision and multimodal capabilities are preserved unchanged.
|
|
@@ -59,7 +59,7 @@ so the effect tends to **persist through downstream full-parameter SFT / RLHF st
|
|
| 59 |
The recommended serving path is **vLLM**, which is also what we used in our evaluation pipeline.
|
| 60 |
|
| 61 |
```bash
|
| 62 |
-
vllm serve dataslab/
|
| 63 |
--port 8000 \
|
| 64 |
--dtype bfloat16 \
|
| 65 |
--gpu-memory-utilization 0.90 \
|
|
@@ -78,7 +78,7 @@ vllm serve dataslab/DLM-LST-9B \
|
|
| 78 |
import torch
|
| 79 |
from transformers import AutoTokenizer, AutoModelForImageTextToText
|
| 80 |
|
| 81 |
-
REPO = "dataslab/
|
| 82 |
|
| 83 |
tokenizer = AutoTokenizer.from_pretrained(REPO)
|
| 84 |
model = AutoModelForImageTextToText.from_pretrained(
|
|
@@ -261,7 +261,7 @@ Korean prompts → Korean answers expected; any Chinese token leaked into the an
|
|
| 261 |
|
| 262 |
### Chinese Suppression (**Thinking mode**)
|
| 263 |
|
| 264 |
-
Evaluated with `enable_thinking=True`. The
|
| 265 |
|
| 266 |
<table style="table-layout: fixed; width: 100%;">
|
| 267 |
<colgroup>
|
|
@@ -277,7 +277,7 @@ Evaluated with `enable_thinking=True`. The DLM-LST-9B column is calibrated with
|
|
| 277 |
<th>Qwen3.5-9B (base)</th>
|
| 278 |
<th>LST-L1</th>
|
| 279 |
<th>LST-L2</th>
|
| 280 |
-
<th style="color:#EAB308;"><b>
|
| 281 |
</tr>
|
| 282 |
</thead>
|
| 283 |
<tbody>
|
|
@@ -296,14 +296,14 @@ Evaluated with `enable_thinking=True`. The DLM-LST-9B column is calibrated with
|
|
| 296 |
</tbody>
|
| 297 |
</table>
|
| 298 |
|
| 299 |
-
**
|
| 300 |
while still cutting unintended Chinese leakage to the level of `chin_total ≈ 0.99`.
|
| 301 |
Downstream reasoning (`acc_*`, HumanEval, GSM8K) is comparable to, or in some cases even better than, the base model.
|
| 302 |
|
| 303 |
|
| 304 |
### Chinese Suppression (**Non-Thinking mode**)
|
| 305 |
|
| 306 |
-
Evaluated with `enable_thinking=False`. The
|
| 307 |
|
| 308 |
<table style="table-layout: fixed; width: 100%;">
|
| 309 |
<colgroup>
|
|
@@ -319,7 +319,7 @@ Evaluated with `enable_thinking=False`. The DLM-LST-9B column here is a **separa
|
|
| 319 |
<th>Qwen3.5-9B (base)</th>
|
| 320 |
<th>LST-L1</th>
|
| 321 |
<th>LST-L2</th>
|
| 322 |
-
<th style="color:#EAB308;"><b>
|
| 323 |
</tr>
|
| 324 |
</thead>
|
| 325 |
<tbody>
|
|
@@ -341,7 +341,7 @@ Evaluated with `enable_thinking=False`. The DLM-LST-9B column here is a **separa
|
|
| 341 |
### Suppression Persistence after SFT-stage (**Non-Thinking mode**)
|
| 342 |
|
| 343 |
Each pipeline was fine-tuned via full-parameter SFT (all weights trainable, no PEFT / LoRA) on the beomi/KoAlpaca-v1.1a dataset.
|
| 344 |
-
After the SFT stage,
|
| 345 |
|
| 346 |
<table style="table-layout: fixed; width: 100%;">
|
| 347 |
<colgroup>
|
|
@@ -353,7 +353,7 @@ After the SFT stage, DLM-LST-9B keeps both its Chinese-leak suppression (`SRR
|
|
| 353 |
<tr>
|
| 354 |
<th>Metric</th>
|
| 355 |
<th>Qwen3.5-9B → SFT</th>
|
| 356 |
-
<th style="color:#EAB308;"><b>
|
| 357 |
</tr>
|
| 358 |
</thead>
|
| 359 |
<tbody>
|
|
@@ -380,7 +380,7 @@ After the SFT stage, DLM-LST-9B keeps both its Chinese-leak suppression (`SRR
|
|
| 380 |
<th>Metric</th>
|
| 381 |
<th>Qwen3.5-9B (base)</th>
|
| 382 |
<th>Qwen3.5-9B → SFT</th>
|
| 383 |
-
<th style="color:#EAB308;"><b>
|
| 384 |
</tr>
|
| 385 |
</thead>
|
| 386 |
<tbody>
|
|
@@ -400,12 +400,12 @@ After the SFT stage, DLM-LST-9B keeps both its Chinese-leak suppression (`SRR
|
|
| 400 |
</table>
|
| 401 |
|
| 402 |
The base model's selectivity shifts substantially after full-parameter SFT (`chin_refusal` 0.037 → 0.128),
|
| 403 |
-
while
|
| 404 |
This shows that LST does not act as a thin surface patch — its effect is encoded in a way that **survives downstream fine-tuning**.
|
| 405 |
|
| 406 |
### English Suppression (**Non-Thinking mode**) — generalization check
|
| 407 |
To confirm LST is not tied to a specific language pair, we applied the same approach to `Llama-3.1-8B-Instruct` for *English* leakage suppression.
|
| 408 |
-
The
|
| 409 |
|
| 410 |
<table style="table-layout: fixed; width: 100%;">
|
| 411 |
<colgroup>
|
|
@@ -417,7 +417,7 @@ The DLM-LST configuration is the only variant that keeps coding (HumanEval) and
|
|
| 417 |
<tr>
|
| 418 |
<th>Metric</th>
|
| 419 |
<th>Llama-3.1-8B-Instruct (base)</th>
|
| 420 |
-
<th style="color:#EAB308;"><b>
|
| 421 |
</tr>
|
| 422 |
</thead>
|
| 423 |
<tbody>
|
|
@@ -440,12 +440,12 @@ The DLM-LST configuration is the only variant that keeps coding (HumanEval) and
|
|
| 440 |
## Example Outputs
|
| 441 |
|
| 442 |
<p align="center">
|
| 443 |
-
<img src="assets/banner.png" alt="
|
| 444 |
</p>
|
| 445 |
|
| 446 |
Asked in Korean about the most common clay mineral on the Korean
|
| 447 |
Peninsula, Qwen3.5-9B leaks 9 Chinese / mixed-script tokens (`伊利石`,
|
| 448 |
-
`кao린`, `的`) into its answer.
|
| 449 |
entirely in Korean (0 Chinese tokens).
|
| 450 |
|
| 451 |
|
|
@@ -464,7 +464,7 @@ entirely in Korean (0 Chinese tokens).
|
|
| 464 |
<thead>
|
| 465 |
<tr>
|
| 466 |
<th>Qwen3.5-9B (leaks <code>才开始</code>)</th>
|
| 467 |
-
<th style="color:#EAB308;">
|
| 468 |
</tr>
|
| 469 |
</thead>
|
| 470 |
<tbody>
|
|
@@ -505,7 +505,7 @@ entirely in Korean (0 Chinese tokens).
|
|
| 505 |
<thead>
|
| 506 |
<tr>
|
| 507 |
<th>Qwen3.5-9B (leaks <code>积压</code>)</th>
|
| 508 |
-
<th style="color:#EAB308;">
|
| 509 |
</tr>
|
| 510 |
</thead>
|
| 511 |
<tbody>
|
|
@@ -545,7 +545,7 @@ entirely in Korean (0 Chinese tokens).
|
|
| 545 |
<thead>
|
| 546 |
<tr>
|
| 547 |
<th>Qwen3.5-9B (leaks <code>享有的</code>)</th>
|
| 548 |
-
<th style="color:#EAB308;">
|
| 549 |
</tr>
|
| 550 |
</thead>
|
| 551 |
<tbody>
|
|
@@ -570,14 +570,14 @@ entirely in Korean (0 Chinese tokens).
|
|
| 570 |
|
| 571 |
### Cross-lingual Selectivity
|
| 572 |
|
| 573 |
-
When the user **explicitly asks for Chinese**,
|
| 574 |
produces it. The previous examples showed the model *avoiding* unwanted
|
| 575 |
Chinese inside an otherwise-Korean answer; the example below shows it
|
| 576 |
emitting Chinese fluently when the user's instruction calls for it.
|
| 577 |
|
| 578 |
**Prompt:** 피보나치 수열의 n번째 항을 반환하는 파이썬 함수를 작성해주세요. 설명은 중국어로 해주세요.
|
| 579 |
|
| 580 |
-
**
|
| 581 |
|
| 582 |
```
|
| 583 |
다음은 파이썬을 사용하여 피보나치 수열의 n 번째 항을 계산하는 함수입니다.
|
|
@@ -604,7 +604,7 @@ def fibonacci(n):
|
|
| 604 |
|
| 605 |
Qwen3.5-9B's `<think>` block leaks Chinese even more severely than its
|
| 606 |
final answer, often slipping into Chinese once the reasoning gets stuck.
|
| 607 |
-
|
| 608 |
|
| 609 |
**Prompt:** 업무 협조 요청을 받은 기관이 협조 요청 문서에 흠이 있음을 발견한 때에는 접수한 날부터 몇 일 이내에 보완을 요구하여야 하는가? (사무관리규정 개정으로 제외된 문제입니다. 정답은 3번 입니다.)
|
| 610 |
|
|
@@ -620,7 +620,7 @@ DLM-LST-9B suppresses that leakage inside the thinking block too.
|
|
| 620 |
<tr>
|
| 621 |
<th>Metric</th>
|
| 622 |
<th>Qwen3.5-9B</th>
|
| 623 |
-
<th style="color:#EAB308;">
|
| 624 |
</tr>
|
| 625 |
</thead>
|
| 626 |
<tbody>
|
|
@@ -643,9 +643,9 @@ DLM-LST-9B suppresses that leakage inside the thinking block too.
|
|
| 643 |
</table>
|
| 644 |
|
| 645 |
In the base model's trace, every cycle ends with `(Wait, I need to write in Korean). Okay, I will write in Korean.` — yet the very next token is Chinese again, and the trace slides right back into the same fragment.
|
| 646 |
-
This loop fires **484 times** before the token budget runs out.
|
| 647 |
Chinese tokens being chosen even right after the model says they should not be.
|
| 648 |
-
On the same prompt,
|
| 649 |
and the final user-facing answer is in clean Korean.
|
| 650 |
|
| 651 |
|
|
|
|
| 33 |
- composite-vision-language
|
| 34 |
---
|
| 35 |
|
| 36 |
+
# DSLM-LST-9B
|
| 37 |
+
**DSLM-LST-9B** is a `Qwen3.5-9B` derivative refined with our in-house **Language Selection Tuning (LST)** technique.
|
| 38 |
The goal is to suppress unwanted Chinese-character generation when the model is used to serve non-Chinese (English / Korean / Japanese etc.) users.
|
| 39 |
The adjustment is intentionally minimal in scope; most of the network — including vision and multimodal components — is preserved bit-identical to the base model.
|
| 40 |
Vision and multimodal capabilities are preserved unchanged.
|
|
|
|
| 59 |
The recommended serving path is **vLLM**, which is also what we used in our evaluation pipeline.
|
| 60 |
|
| 61 |
```bash
|
| 62 |
+
vllm serve dataslab/DSLM-LST-9B \
|
| 63 |
--port 8000 \
|
| 64 |
--dtype bfloat16 \
|
| 65 |
--gpu-memory-utilization 0.90 \
|
|
|
|
| 78 |
import torch
|
| 79 |
from transformers import AutoTokenizer, AutoModelForImageTextToText
|
| 80 |
|
| 81 |
+
REPO = "dataslab/DSLM-LST-9B"
|
| 82 |
|
| 83 |
tokenizer = AutoTokenizer.from_pretrained(REPO)
|
| 84 |
model = AutoModelForImageTextToText.from_pretrained(
|
|
|
|
| 261 |
|
| 262 |
### Chinese Suppression (**Thinking mode**)
|
| 263 |
|
| 264 |
+
Evaluated with `enable_thinking=True`. The DSLM-LST-9B column is calibrated with thinking enabled.
|
| 265 |
|
| 266 |
<table style="table-layout: fixed; width: 100%;">
|
| 267 |
<colgroup>
|
|
|
|
| 277 |
<th>Qwen3.5-9B (base)</th>
|
| 278 |
<th>LST-L1</th>
|
| 279 |
<th>LST-L2</th>
|
| 280 |
+
<th style="color:#EAB308;"><b>DSLM-LST-9B</b><br/></th>
|
| 281 |
</tr>
|
| 282 |
</thead>
|
| 283 |
<tbody>
|
|
|
|
| 296 |
</tbody>
|
| 297 |
</table>
|
| 298 |
|
| 299 |
+
**DSLM-LST-9B keeps `chin_refusal` at 0.065.** It preserves the ability to generate Chinese when the user explicitly asks for it,
|
| 300 |
while still cutting unintended Chinese leakage to the level of `chin_total ≈ 0.99`.
|
| 301 |
Downstream reasoning (`acc_*`, HumanEval, GSM8K) is comparable to, or in some cases even better than, the base model.
|
| 302 |
|
| 303 |
|
| 304 |
### Chinese Suppression (**Non-Thinking mode**)
|
| 305 |
|
| 306 |
+
Evaluated with `enable_thinking=False`. The DSLM-LST-9B column here is a **separate think-OFF-calibrated checkpoint** (not this release).
|
| 307 |
|
| 308 |
<table style="table-layout: fixed; width: 100%;">
|
| 309 |
<colgroup>
|
|
|
|
| 319 |
<th>Qwen3.5-9B (base)</th>
|
| 320 |
<th>LST-L1</th>
|
| 321 |
<th>LST-L2</th>
|
| 322 |
+
<th style="color:#EAB308;"><b>DSLM-LST-9B</b><br/></th>
|
| 323 |
</tr>
|
| 324 |
</thead>
|
| 325 |
<tbody>
|
|
|
|
| 341 |
### Suppression Persistence after SFT-stage (**Non-Thinking mode**)
|
| 342 |
|
| 343 |
Each pipeline was fine-tuned via full-parameter SFT (all weights trainable, no PEFT / LoRA) on the beomi/KoAlpaca-v1.1a dataset.
|
| 344 |
+
After the SFT stage, DSLM-LST-9B keeps both its Chinese-leak suppression (`SRR ≈ 1.000`) and its selectivity (`|Δ_selectivity| ≈ 0.08`) almost unchanged.
|
| 345 |
|
| 346 |
<table style="table-layout: fixed; width: 100%;">
|
| 347 |
<colgroup>
|
|
|
|
| 353 |
<tr>
|
| 354 |
<th>Metric</th>
|
| 355 |
<th>Qwen3.5-9B → SFT</th>
|
| 356 |
+
<th style="color:#EAB308;"><b>DSLM-LST-9B → SFT</b></th>
|
| 357 |
</tr>
|
| 358 |
</thead>
|
| 359 |
<tbody>
|
|
|
|
| 380 |
<th>Metric</th>
|
| 381 |
<th>Qwen3.5-9B (base)</th>
|
| 382 |
<th>Qwen3.5-9B → SFT</th>
|
| 383 |
+
<th style="color:#EAB308;"><b>DSLM-LST-9B → SFT</b></th>
|
| 384 |
</tr>
|
| 385 |
</thead>
|
| 386 |
<tbody>
|
|
|
|
| 400 |
</table>
|
| 401 |
|
| 402 |
The base model's selectivity shifts substantially after full-parameter SFT (`chin_refusal` 0.037 → 0.128),
|
| 403 |
+
while DSLM-LST-9B's suppression behavior remains nearly invariant before and after full-parameter SFT.
|
| 404 |
This shows that LST does not act as a thin surface patch — its effect is encoded in a way that **survives downstream fine-tuning**.
|
| 405 |
|
| 406 |
### English Suppression (**Non-Thinking mode**) — generalization check
|
| 407 |
To confirm LST is not tied to a specific language pair, we applied the same approach to `Llama-3.1-8B-Instruct` for *English* leakage suppression.
|
| 408 |
+
The DSLM-LST configuration is the only variant that keeps coding (HumanEval) and math (GSM8K) usable while still meaningfully reducing leakage.
|
| 409 |
|
| 410 |
<table style="table-layout: fixed; width: 100%;">
|
| 411 |
<colgroup>
|
|
|
|
| 417 |
<tr>
|
| 418 |
<th>Metric</th>
|
| 419 |
<th>Llama-3.1-8B-Instruct (base)</th>
|
| 420 |
+
<th style="color:#EAB308;"><b>DSLM-LST (Llama-3.1-8B)</b></th>
|
| 421 |
</tr>
|
| 422 |
</thead>
|
| 423 |
<tbody>
|
|
|
|
| 440 |
## Example Outputs
|
| 441 |
|
| 442 |
<p align="center">
|
| 443 |
+
<img src="assets/banner.png" alt="DSLM-LST-9B vs Qwen3.5-9B on a Korean KMMLU prompt: base leaks 9 Chinese tokens (伊利石, кaо린, 的), DSLM-LST-9B emits 0 Chinese tokens." width="640" />
|
| 444 |
</p>
|
| 445 |
|
| 446 |
Asked in Korean about the most common clay mineral on the Korean
|
| 447 |
Peninsula, Qwen3.5-9B leaks 9 Chinese / mixed-script tokens (`伊利石`,
|
| 448 |
+
`кao린`, `的`) into its answer. DSLM-LST-9B answers the same prompt
|
| 449 |
entirely in Korean (0 Chinese tokens).
|
| 450 |
|
| 451 |
|
|
|
|
| 464 |
<thead>
|
| 465 |
<tr>
|
| 466 |
<th>Qwen3.5-9B (leaks <code>才开始</code>)</th>
|
| 467 |
+
<th style="color:#EAB308;">DSLM-LST-9B (clean Korean)</th>
|
| 468 |
</tr>
|
| 469 |
</thead>
|
| 470 |
<tbody>
|
|
|
|
| 505 |
<thead>
|
| 506 |
<tr>
|
| 507 |
<th>Qwen3.5-9B (leaks <code>积压</code>)</th>
|
| 508 |
+
<th style="color:#EAB308;">DSLM-LST-9B (clean Korean)</th>
|
| 509 |
</tr>
|
| 510 |
</thead>
|
| 511 |
<tbody>
|
|
|
|
| 545 |
<thead>
|
| 546 |
<tr>
|
| 547 |
<th>Qwen3.5-9B (leaks <code>享有的</code>)</th>
|
| 548 |
+
<th style="color:#EAB308;">DSLM-LST-9B (clean Korean)</th>
|
| 549 |
</tr>
|
| 550 |
</thead>
|
| 551 |
<tbody>
|
|
|
|
| 570 |
|
| 571 |
### Cross-lingual Selectivity
|
| 572 |
|
| 573 |
+
When the user **explicitly asks for Chinese**, DSLM-LST-9B readily
|
| 574 |
produces it. The previous examples showed the model *avoiding* unwanted
|
| 575 |
Chinese inside an otherwise-Korean answer; the example below shows it
|
| 576 |
emitting Chinese fluently when the user's instruction calls for it.
|
| 577 |
|
| 578 |
**Prompt:** 피보나치 수열의 n번째 항을 반환하는 파이썬 함수를 작성해주세요. 설명은 중국어로 해주세요.
|
| 579 |
|
| 580 |
+
**DSLM-LST-9B (code in Python, explanation in Chinese):**
|
| 581 |
|
| 582 |
```
|
| 583 |
다음은 파이썬을 사용하여 피보나치 수열의 n 번째 항을 계산하는 함수입니다.
|
|
|
|
| 604 |
|
| 605 |
Qwen3.5-9B's `<think>` block leaks Chinese even more severely than its
|
| 606 |
final answer, often slipping into Chinese once the reasoning gets stuck.
|
| 607 |
+
DSLM-LST-9B suppresses that leakage inside the thinking block too.
|
| 608 |
|
| 609 |
**Prompt:** 업무 협조 요청을 받은 기관이 협조 요청 문서에 흠이 있음을 발견한 때에는 접수한 날부터 몇 일 이내에 보완을 요구하여야 하는가? (사무관리규정 개정으로 제외된 문제입니다. 정답은 3번 입니다.)
|
| 610 |
|
|
|
|
| 620 |
<tr>
|
| 621 |
<th>Metric</th>
|
| 622 |
<th>Qwen3.5-9B</th>
|
| 623 |
+
<th style="color:#EAB308;">DSLM-LST-9B</th>
|
| 624 |
</tr>
|
| 625 |
</thead>
|
| 626 |
<tbody>
|
|
|
|
| 643 |
</table>
|
| 644 |
|
| 645 |
In the base model's trace, every cycle ends with `(Wait, I need to write in Korean). Okay, I will write in Korean.` — yet the very next token is Chinese again, and the trace slides right back into the same fragment.
|
| 646 |
+
This loop fires **484 times** before the token budget runs out. DSLM-LST-9B targets exactly this failure:
|
| 647 |
Chinese tokens being chosen even right after the model says they should not be.
|
| 648 |
+
On the same prompt, DSLM-LST-9B's `<think>` block contains **0 Chinese characters** and terminates naturally,
|
| 649 |
and the final user-facing answer is in clean Korean.
|
| 650 |
|
| 651 |
|
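The README text in this diff reports leakage as counts of Chinese tokens or characters in an answer or `<think>` block. As a rough illustration of how such a count could be made, here is a minimal sketch that counts Han-script characters by Unicode range. It is an assumption, not the authors' evaluation code: the README counts tokens in some places, whereas this counts characters, and the helper names (`is_han`, `count_han_chars`, `han_runs`) and the chosen ranges are hypothetical. Han ideographs are shared with Japanese kanji and Korean hanja, so this detects script, not language.

```python
# Unicode ranges treated as Han script in this sketch (an assumption;
# further extension blocks exist and could be added).
HAN_RANGES = (
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
)


def is_han(ch):
    """Return True if the single character falls in one of HAN_RANGES."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in HAN_RANGES)


def count_han_chars(text):
    """Count Han-script characters in text."""
    return sum(1 for ch in text if is_han(ch))


def han_runs(text):
    """Return maximal runs of consecutive Han characters, in order."""
    runs, current = [], []
    for ch in text:
        if is_han(ch):
            current.append(ch)
        elif current:
            runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))
    return runs


leaky = "가장 흔한 점토 광물은 伊利石 입니다 的"
print(count_han_chars(leaky))                  # 4
print(han_runs(leaky))                         # ['伊利石', '的']
print(count_han_chars("한국어로 답변합니다."))  # 0
```

Hangul syllables fall outside these ranges, so a fully Korean answer scores 0. A prompt that explicitly asks for Chinese would need to be excluded or scored separately.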