# Danetki MLC — Qwen3-0.6B for "Данетки" (WebGPU / in-browser)
This is the MLC/WebGPU quantized version of lifeart/danetki-qwen3-0.6b — a Qwen3-0.6B model fine-tuned with QLoRA for playing Данетки (Russian lateral thinking puzzles).
The model is compiled with MLC-LLM for in-browser inference via WebGPU using the @mlc-ai/web-llm library. It runs entirely client-side with no server required.
## Model Details
| Property | Value |
|---|---|
| Source model | lifeart/danetki-qwen3-0.6b |
| Base model | Qwen/Qwen3-0.6B |
| Fine-tuning method | QLoRA (r=16, alpha=32, dropout=0.15) |
| Quantization | q4f16_1 (4-bit weights, 16-bit activations) |
| Runtime | MLC-LLM WebGPU WASM |
| Context length | 4096 tokens |
| Language | Russian |
| Task | Classification (Да / Нет / Не важно) |
## Task
Given a puzzle context (condition + answer) and a player's question, respond with exactly one word:
- Да (Yes)
- Нет (No)
- Не важно (Irrelevant)
## System Prompt

```
Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.
```

(English: "You are the host of a Danetki game. You are given the context and the player's question. Answer with ONLY one word: Да, Нет, or Не важно.")
## Input Format

```
Контекст: {condition}
Разгадка: {answer}
Вопрос: {question}
```
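The input template can be assembled with a small helper. This is a sketch: the function name `formatDanetkiPrompt` and the object-field names are our own; the blank lines between fields follow the formatting used in the usage example in this card.

```javascript
// Sketch: build the user message in the model's expected input format.
// "condition" is the puzzle statement, "answer" the hidden solution,
// "question" the player's yes/no question (field names are ours).
function formatDanetkiPrompt({ condition, answer, question }) {
  return (
    `Контекст: ${condition}\n\n` +
    `Разгадка: ${answer}\n\n` +
    `Вопрос: ${question}`
  );
}
```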
## Usage with web-llm

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("danetki-qwen3-0.6B-q4f16_1", {
  appConfig: {
    model_list: [
      {
        model: "https://huggingface.co/lifeart/danetki-mlc/resolve/main/",
        model_id: "danetki-qwen3-0.6B-q4f16_1",
        model_lib:
          "https://huggingface.co/lifeart/danetki-mlc/resolve/main/Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm",
      },
    ],
  },
});

const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "system",
      content:
        "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.",
    },
    {
      role: "user",
      content:
        "Контекст: Человек зашёл в бар и попросил стакан воды. Бармен достал пистолет и направил на него. Человек поблагодарил и ушёл.\n\nРазгадка: У человека была икота, и бармен испугал его, чтобы она прошла.\n\nВопрос: Человек хотел пить воду?",
    },
  ],
  temperature: 0.1,
  max_tokens: 10,
});

console.log(reply.choices[0].message.content); // "Нет"
```
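Since the request allows up to 10 tokens, the raw reply may carry trailing punctuation, whitespace, or casing differences. A minimal post-processing sketch (the helper is our own, not part of web-llm) that maps raw output onto the three canonical labels:

```javascript
// Sketch: coerce raw model output to one of the three labels, or null
// if the output is unexpected (caller decides how to handle that case).
function normalizeAnswer(raw) {
  const text = raw.trim().toLowerCase();
  if (text.startsWith("да")) return "Да";
  if (text.startsWith("не важно")) return "Не важно";
  if (text.startsWith("нет")) return "Нет";
  return null;
}
```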
## Training Details
See the full training details on the source model card.
Key highlights:
- ~183K training examples from scraped Данетки Q&A, reformatted DaNetQA, and context-perturbation augmentation
- Rebalanced class distribution: ~36% Да / ~36% Нет / ~28% Не важно
- Cleaned "Не важно" labels: removed uncertainty responses ("не знаю" / "I don't know", "некорректно" / "invalid") that had been misclassified as irrelevant
- Completion-only loss with prompt/completion format — only assistant classification tokens contribute to the gradient
- Explicit ChatML template override (no `<think>` tags) to match MLC production inference
- Label smoothing (0.05) for regularization
- LR 2e-5, cosine schedule, bf16, effective batch size 16, early stopping (patience=5), best checkpoint by eval_loss
## Benchmark
190 test questions across 25 puzzles (variant B: with facts for long answers):
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Да | 71.8% | 91.4% | 80.4% |
| Нет | 77.8% | 59.3% | 67.3% |
| НВ | 83.3% | 70.0% | 76.1% |

**Overall:** accuracy 75.8%, macro-F1 74.6%
## Model Files
This repository contains the MLC-LLM compiled model:
- `params_shard_*.bin`: quantized model weight shards (q4f16_1)
- `Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm`: WebGPU WASM runtime
- `mlc-chat-config.json`: MLC-LLM configuration
- `tokenizer.json`, `tokenizer_config.json`: tokenizer files
## Limitations
- Only responds in Russian
- Limited to three response categories: Да / Нет / Не важно
- The model has a slight "Да" bias (recall 91.4% for Да vs 59.3% and 70.0% for the other two classes)
- Quality depends on how well the puzzle context covers the player's question
- May produce incorrect answers for ambiguous or edge-case questions
- Requires a WebGPU-capable browser for in-browser inference
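
Before loading the model, a page can check whether the environment actually exposes a usable WebGPU adapter via the standard `navigator.gpu` API. A minimal sketch (the helper name is ours):

```javascript
// Sketch: report whether this environment can run WebGPU inference.
// Returns false when navigator.gpu is absent (e.g. Node, older browsers)
// or when no GPU adapter is available.
async function webgpuSupported() {
  if (typeof navigator === "undefined" || !("gpu" in navigator)) return false;
  try {
    const adapter = await navigator.gpu.requestAdapter();
    return adapter !== null;
  } catch {
    return false;
  }
}
```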