Danetki MLC — Qwen3-0.6B for "Данетки" (WebGPU / in-browser)

This is the MLC/WebGPU quantized version of lifeart/danetki-qwen3-0.6b — a Qwen3-0.6B model fine-tuned with QLoRA for playing Данетки (Russian lateral thinking puzzles).

The model is compiled with MLC-LLM for in-browser inference via WebGPU using the @mlc-ai/web-llm library. It runs entirely client-side with no server required.

Model Details

Property            Value
------------------  -------------------------------------------
Source model        lifeart/danetki-qwen3-0.6b
Base model          Qwen/Qwen3-0.6B
Fine-tuning method  QLoRA (r=16, alpha=32, dropout=0.15)
Quantization        q4f16_1 (4-bit weights, 16-bit activations)
Runtime             MLC-LLM WebGPU WASM
Context length      4096 tokens
Language            Russian
Task                Classification (Да / Нет / Не важно)

Task

Given a puzzle context (condition + answer) and a player's question, the model responds with exactly one of three labels:

  • Да (Yes)
  • Нет (No)
  • Не важно (Irrelevant)

System Prompt

Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.

(English: "You are the host of a Данетки game. You are given the context and the player's question. Answer with ONLY one word: Да, Нет, or Не важно.")

Input Format

Контекст: {condition}

Разгадка: {answer}

Вопрос: {question}

(Контекст = the puzzle condition, Разгадка = the solution, Вопрос = the player's question.)
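The three fields can be assembled into the user-message content with a small template helper (a sketch; the function name is ours, not part of this repository):

```javascript
// Build the user message in the format the model was trained on:
// context, solution, and question, separated by blank lines.
function buildPrompt(condition, answer, question) {
  return `Контекст: ${condition}\n\nРазгадка: ${answer}\n\nВопрос: ${question}`;
}

const msg = buildPrompt(
  "Человек зашёл в бар и попросил стакан воды.",
  "У человека была икота.",
  "Человек хотел пить воду?"
);
// msg now matches the Input Format shown above
```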

Usage with web-llm

import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("danetki-qwen3-0.6B-q4f16_1", {
  appConfig: {
    model_list: [{
      model: "https://huggingface.co/lifeart/danetki-mlc/resolve/main/",
      model_id: "danetki-qwen3-0.6B-q4f16_1",
      model_lib:
        "https://huggingface.co/lifeart/danetki-mlc/resolve/main/Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm",
    }],
  },
});

const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "system",
      content:
        "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.",
    },
    {
      role: "user",
      // Context: a man walked into a bar and asked for a glass of water; the
      // barman pulled out a gun and pointed it at him; the man thanked him and left.
      // Solution: the man had hiccups, and the barman scared him to cure them.
      // Question: "Did the man want to drink water?"
      content:
        "Контекст: Человек зашёл в бар и попросил стакан воды. Бармен достал пистолет и направил на него. Человек поблагодарил и ушёл.\n\nРазгадка: У человека была икота, и бармен испугал его, чтобы она прошла.\n\nВопрос: Человек хотел пить воду?",
    },
  ],
  temperature: 0.1,
  max_tokens: 10,
});

console.log(reply.choices[0].message.content); // "Нет"
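Decoding is not constrained, so the reply can carry stray whitespace, casing differences, or trailing punctuation. A small normalization step (a hypothetical helper, not part of this repository) makes downstream handling robust:

```javascript
// Map a raw model reply onto one of the three expected labels.
// Returns null when the reply matches none of them.
function normalizeAnswer(raw) {
  const text = raw.trim().toLowerCase().replace(/[.!?]+$/, "");
  if (text.startsWith("не важно")) return "Не важно";
  if (text.startsWith("да")) return "Да";
  if (text.startsWith("нет")) return "Нет";
  return null;
}

console.log(normalizeAnswer("Нет."));       // "Нет"
console.log(normalizeAnswer(" не важно ")); // "Не важно"
```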

Training Details

See the full training details on the source model card.

Key highlights:

  • ~183K training examples from scraped Данетки Q&A, reformatted DaNetQA, and context-perturbation augmentation
  • Rebalanced class distribution: ~36% Да / ~36% Нет / ~28% Не важно
  • Cleaned Не важно labels: removed uncertainty responses ("не знаю" / "don't know", "некорректно" / "invalid") that had been misclassified as irrelevant
  • Completion-only loss with prompt/completion format — only assistant classification tokens contribute to the gradient
  • Explicit chatml template override (no <think> tags) to match MLC production inference
  • Label smoothing (0.05) for regularization
  • LR 2e-5, cosine schedule, bf16, effective batch size 16, early stopping (patience=5), best checkpoint by eval_loss

Benchmark

190 test questions across 25 puzzles (variant B — with facts for long answers):

Class      Precision  Recall  F1
---------  ---------  ------  -----
Да         71.8%      91.4%   80.4%
Нет        77.8%      59.3%   67.3%
Не важно   83.3%      70.0%   76.1%

Overall: 75.8% / 74.6%
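The 74.6% overall figure matches the unweighted macro average of the per-class F1 scores (assuming that is how it was computed); a quick arithmetic check:

```javascript
// Per-class F1 scores from the benchmark table above.
const f1 = { "Да": 80.4, "Нет": 67.3, "Не важно": 76.1 };

// Unweighted macro-average F1: the mean of the per-class F1 scores.
const values = Object.values(f1);
const macroF1 = values.reduce((a, b) => a + b, 0) / values.length;

console.log(macroF1.toFixed(1)); // "74.6"
```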

Model Files

This repository contains the MLC-LLM compiled model:

  • params_shard_*.bin — quantized model weight shards (q4f16_1)
  • Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm — WebGPU WASM runtime
  • mlc-chat-config.json — MLC-LLM configuration
  • tokenizer.json, tokenizer_config.json — tokenizer files

Limitations

  • Only responds in Russian
  • Limited to three response categories: Да / Нет / Не важно
  • The model has a noticeable "Да" bias (91.4% recall for Да vs ~64% average recall for the other two classes)
  • Quality depends on how well the puzzle context covers the player's question
  • May produce incorrect answers for ambiguous or edge-case questions
  • Requires a WebGPU-capable browser for in-browser inference
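
Since WebGPU availability varies across browsers, it is worth feature-detecting before loading the model. A minimal sketch using the standard navigator.gpu check (the function name and the injectable nav parameter are ours, added for testability):

```javascript
// Returns true when the WebGPU API is exposed and an adapter is available.
async function hasWebGPU(nav = globalThis.navigator) {
  if (!nav || !("gpu" in nav)) return false;
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}
```

Call this before CreateMLCEngine and show a fallback message (or route to a server-side endpoint) when it returns false.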