Bielik 1.5B v3.0 Instruct β€” ExecuTorch (.pte)

ExecuTorch export of speakleash/Bielik-1.5B-v3.0-Instruct for on-device inference on mobile (iOS/Android) via react-native-executorch.

Model Details

Property          Value
Architecture      LlamaForCausalLM
Parameters        1.5B
Layers            32
Attention heads   12 (GQA with 2 KV heads)
Context length    8192 tokens
Vocabulary        32,000 (SentencePiece BPE)
Chat template     ChatML (<|im_start|>, <|im_end|>)
License           Apache 2.0
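
The ChatML template listed above renders each turn as `<|im_start|>{role}\n{content}<|im_end|>`. A minimal sketch of that formatting (for illustration only; the runtime applies the template for you):

```typescript
// Illustrative ChatML formatter matching the template this model expects.
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

function toChatML(messages: Message[]): string {
  const turns = messages
    .map((m) => `<|im_start|>${m.role}\n${m.content}<|im_end|>`)
    .join('\n');
  // A trailing assistant header cues the model to generate its reply.
  return `${turns}\n<|im_start|>assistant\n`;
}
```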

Quantization

Exported using optimum-executorch with the XNNPACK backend:

  • Scheme: 8da4w (8-bit dynamic activations, 4-bit weights)
  • Output size: ~1 GB .pte file
  • Backend: XNNPACK (CPU, cross-platform)
pip install optimum-executorch
optimum-cli export executorch \
  --model speakleash/Bielik-1.5B-v3.0-Instruct \
  --recipe xnnpack \
  --output_dir bielik_pte
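
As a sanity check on the output size, the 4-bit weights alone account for roughly three quarters of the file (back-of-the-envelope arithmetic; the exact .pte layout, including per-group scales and non-quantized tensors, is not broken down here):

```typescript
// Rough size estimate for the 8da4w scheme: weights stored at 4 bits each.
const params = 1.5e9;        // parameter count
const bitsPerWeight = 4;     // 4-bit quantized weights
const weightGB = (params * bitsPerWeight) / 8 / 1e9; // bytes -> GB
// ~0.75 GB of packed weights; quantization scales/zero-points and
// non-quantized tensors (embeddings, norms) make up the rest of the ~1 GB.
console.log(weightGB);
```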

Files

File                    Description
model.pte               ExecuTorch model (~1 GB)
tokenizer.json          HuggingFace tokenizer (Metaspace pre_tokenizer removed for ExecuTorch compatibility)
tokenizer_config.json   Tokenizer configuration

Tokenizer Note

The Metaspace pre_tokenizer was removed from tokenizer.json to ensure compatibility with ExecuTorch's HFTokenizer C++ implementation, which does not support Metaspace decoding. Without this change, the tokenizer fails to load at runtime.
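
The change amounts to nulling out the `pre_tokenizer` entry in the tokenizer JSON. A hypothetical sketch of that edit (field names follow the HuggingFace `tokenizers` JSON schema; reading and writing the file is left to the caller):

```typescript
// Strip a Metaspace pre_tokenizer from a parsed tokenizer.json object so
// ExecuTorch's C++ HFTokenizer can load it. Illustrative, not the exact
// script used for this export.
function stripMetaspace(tokenizer: Record<string, unknown>): Record<string, unknown> {
  const pre = tokenizer['pre_tokenizer'] as { type?: string } | null | undefined;
  if (pre && pre.type === 'Metaspace') {
    return { ...tokenizer, pre_tokenizer: null };
  }
  return tokenizer;
}
```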

Use Cases

  • Polish instruction following β€” Bielik is trained on Polish data and excels at Polish language tasks
  • Email summarization β€” compact enough for on-device summarization of email threads
  • On-device mobile AI β€” runs entirely on-device, no network required after download

Usage with react-native-executorch

import { useLLM } from 'react-native-executorch';

const bielikModel = {
  modelSource:
    'https://huggingface.co/jash90/bielik-1.5b-v3.0-instruct-executorch/resolve/main/model.pte',
  tokenizerSource:
    'https://huggingface.co/jash90/bielik-1.5b-v3.0-instruct-executorch/resolve/main/tokenizer.json',
  tokenizerConfigSource:
    'https://huggingface.co/jash90/bielik-1.5b-v3.0-instruct-executorch/resolve/main/tokenizer_config.json',
};

function MyComponent() {
  const llm = useLLM({
    model: bielikModel,
    chatConfig: {
      chatTemplate: 'chatml',
    },
  });

  const handleGenerate = async () => {
    const response = await llm.generate([
      { role: 'system', content: 'Jesteś pomocnym asystentem.' }, // "You are a helpful assistant."
      { role: 'user', content: 'Podsumuj ten email w 2 zdaniach.' }, // "Summarize this email in 2 sentences."
    ]);
    console.log(response);
  };
}

Base Model

This is an ExecuTorch conversion of speakleash/Bielik-1.5B-v3.0-Instruct, a Polish language model developed by the SpeakLeash project.
