# Qwen3-0.6B for Burn

This repository contains Qwen3-0.6B weights in formats compatible with the Burn deep learning framework.
## Model Details
- Base Model: Qwen/Qwen3-0.6B
- Parameters: 0.6B
- Architecture: Qwen3 (decoder-only transformer)
- License: Apache-2.0
## Configuration
| Parameter | Value |
|---|---|
| hidden_size | 1024 |
| num_hidden_layers | 28 |
| num_attention_heads | 16 |
| num_key_value_heads | 8 |
| intermediate_size | 3072 |
| vocab_size | 151936 |
| max_position_embeddings | 40960 |
| rope_theta | 1000000 |
| rms_norm_eps | 1e-6 |
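As a sanity check, the 0.6B parameter count can be reproduced from the table above. Two details are assumed here because they do not appear in the table: the upstream Qwen3-0.6B configuration uses a per-head dimension of 128 (decoupled from `hidden_size / num_attention_heads`) and ties the input embedding with the LM head.

```python
# Rough parameter count for Qwen3-0.6B from the configuration above.
# Assumptions (not in the table): head_dim = 128, tied embeddings.
hidden, layers = 1024, 28
heads, kv_heads, head_dim = 16, 8, 128
inter, vocab = 3072, 151936

embed = vocab * hidden                      # input embedding (tied with LM head)
attn = (
    hidden * heads * head_dim               # q_proj
    + 2 * hidden * kv_heads * head_dim      # k_proj + v_proj
    + heads * head_dim * hidden             # o_proj
    + 2 * head_dim                          # q_norm + k_norm (per-head RMSNorm)
)
mlp = 3 * hidden * inter                    # gate_proj, up_proj, down_proj
norms = 2 * hidden                          # pre-attention + pre-MLP RMSNorm
total = embed + layers * (attn + mlp + norms) + hidden  # + final norm

print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

This lands at about 596M parameters, which is where the "0.6B" label comes from.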
## Available Formats

| File | Format | Size | Description |
|---|---|---|---|
| `model.safetensors` | HuggingFace SafeTensors | 1.4 GB | Original BF16 weights from Qwen |
| `model.bpk` | Burn Burnpack | 1.4 GB | Converted for Burn (BF16) |
| `tokenizer.json` | HuggingFace Tokenizers | 11 MB | Tokenizer file |
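The `.safetensors` container is simple enough to inspect without any framework: an 8-byte little-endian header length, then a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor data. A minimal sketch that builds and re-parses a tiny in-memory example (the tensor name is made up for illustration):

```python
import json
import struct

# Build a minimal safetensors-format byte string in memory:
# one fake BF16 tensor ("demo.weight", shape [2, 2], 8 bytes of data).
data = b"\x00" * 8
header = {"demo.weight": {"dtype": "BF16", "shape": [2, 2], "data_offsets": [0, len(data)]}}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Parse it back the way any safetensors reader does.
(n,) = struct.unpack("<Q", blob[:8])   # header length
parsed = json.loads(blob[8 : 8 + n])   # JSON header
print(parsed["demo.weight"]["shape"])  # [2, 2]
```

The same two-step read (length prefix, then JSON) works on `model.safetensors` itself if you want to list tensor names and shapes without loading the weights.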
## Usage with qwen3-burn

### Using `.bpk` format (recommended)
```rust
use qwen3_burn::{Qwen3Config, Qwen3ForCausalLM, Qwen3Tokenizer};
use burn::backend::candle::{Candle, CandleDevice};
use burn::tensor::Tensor;
use half::bf16;

type Backend = Candle<bf16, i64>;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let device = CandleDevice::metal(0); // or CandleDevice::Cpu

    // Load the tokenizer
    let tokenizer = Qwen3Tokenizer::from_file("tokenizer.json")?;

    // Initialize the model with the 0.6B config preset
    let model = Qwen3Config::qwen3_0_6b()
        .init_causal_lm::<Backend>(&device)
        .with_weights("model.bpk")?; // or "model.safetensors"

    // Generate text
    let (input_ids, _) = tokenizer.encode("Hello, world!")?;
    let input_tensor = Tensor::from_data(&input_ids, &device).unsqueeze();
    let output = model.generate_with_cache(
        input_tensor,
        50,  // max_new_tokens
        0.0, // temperature (0 = greedy, >0 = sampling)
        0.9, // top_p
        50,  // top_k
    );
    let text = tokenizer.decode(&output.to_data().to_vec())?;
    println!("{}", text);

    Ok(())
}
```
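The sampling arguments to `generate_with_cache` follow the usual convention. As a framework-agnostic illustration (plain Python, not the qwen3-burn implementation), here is roughly how temperature, top-k, and top-p interact when picking the next token:

```python
import math
import random

def sample_next(logits, temperature=0.0, top_p=0.9, top_k=50):
    """Pick a token id from raw logits using greedy or top-k/top-p sampling."""
    if temperature == 0.0:  # greedy: just take the argmax
        return max(range(len(logits)), key=lambda i: logits[i])

    # Keep the top_k highest logits, sorted descending.
    order = sorted(range(len(logits)), key=lambda i: -logits[i])[:top_k]

    # Softmax over the kept logits at the given temperature.
    scaled = [logits[i] / temperature for i in order]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    probs = [e / sum(exps) for e in exps]

    # Top-p (nucleus): truncate to the smallest prefix with mass >= top_p.
    kept, mass = [], 0.0
    for idx, p in zip(order, probs):
        kept.append((idx, p))
        mass += p
        if mass >= top_p:
            break

    # Renormalize over the nucleus and sample.
    total = sum(p for _, p in kept)
    r = random.random() * total
    for idx, p in kept:
        r -= p
        if r <= 0:
            return idx
    return kept[-1][0]
```

With `temperature=0.0` the other knobs are ignored and decoding is deterministic, which is why the greedy throughput number below is reproducible.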
## Performance
On Apple M-series with Metal backend:
- ~25 tokens/sec with greedy decoding (temperature=0)
- Model loading: ~2-3 seconds
## Acknowledgments
- Original model by Alibaba Qwen Team
- Burn framework by Tracel AI