NLLB-200 Core ML 128 (Mac, Palettized)

Core ML conversion of NLLB-200 (No Language Left Behind, ~600M parameters) for macOS, with palettized weights to reduce memory use and improve on-device performance. The sequence length is fixed at 128 tokens, suited to short text and chat.

  • Base: facebook/nllb-200-distilled-600M
  • Variant: 128-token encoder/decoder, palettized, Mac-optimized
  • Use case: Short sentences, chat, ~80–100 words per segment

Contents

  • NLLB_Encoder_128.mlpackage – NLLB encoder (input_ids, attention_mask → hidden states)
  • NLLB_Decoder_128.mlpackage – NLLB decoder (input_ids, encoder_hidden_states, encoder_attention_mask → logits)
  • tokenizer/ – Tokenizer (tokenizer.json, tokenizer_config.json, sentencepiece)
  • config.json – Model config

Device

Mac only. For iPhone use nllb200-coreml-128-iphone-palettized; for iPad use nllb200-coreml-128-ipad-palettized.

Usage (macOS / TranslateBlue)

  1. Download this repo (e.g. via Hugging Face Hub or TranslateBlue in-app download).
  2. Load NLLB_Encoder_128.mlpackage and NLLB_Decoder_128.mlpackage with Core ML; tokenize with the included tokenizer.
  3. Run the encoder once, then run the decoder in a loop (greedy decoding: argmax the next token until EOS or the 128-token maximum).
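The decoding loop in step 3 can be sketched as plain Python. The `run_decoder` function below is a stand-in for the actual Core ML decoder call (its name and toy behavior are illustrative, not part of this repo), so the loop itself is runnable:

```python
# Sketch of step 3: greedy (argmax) decoding against the 128-token decoder.
MAX_LENGTH = 128
EOS_ID = 2  # illustrative </s> id; the real id comes from the included tokenizer

def run_decoder(decoder_ids, encoder_hidden, encoder_mask):
    """Stand-in for the Core ML decoder: returns logits for the next token."""
    # Toy behavior for the sketch: emit token 5 twice, then EOS.
    logits = [0.0] * 10
    next_tok = 5 if len(decoder_ids) < 3 else EOS_ID
    logits[next_tok] = 1.0
    return logits

def greedy_decode(encoder_hidden, encoder_mask, start_ids):
    ids = list(start_ids)
    while len(ids) < MAX_LENGTH:
        logits = run_decoder(ids, encoder_hidden, encoder_mask)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(next_id)
        if next_id == EOS_ID:
            break
    return ids

tokens = greedy_decode(encoder_hidden=None, encoder_mask=None, start_ids=[EOS_ID])
```

In a real app, `run_decoder` would call the decoder `.mlpackage` with the growing `ids` sequence (padded to 128) plus the cached encoder hidden states and attention mask.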

Source and target languages are controlled by NLLB-200 language-code tokens (e.g. eng_Latn, jpn_Jpan): the source code is prepended to the encoder input, and the target code is forced as the first generated token.
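A minimal sketch of how the language-code tokens and fixed-length padding fit together. All token ids here (`ENG_LATN`, `JPN_JPAN`, `EOS`, `PAD`, the subword ids) are hypothetical placeholders; the real ids come from the included tokenizer:

```python
# Hypothetical token ids for illustration only.
ENG_LATN = 9001  # source language code token
JPN_JPAN = 9002  # target language code token
EOS = 2
PAD = 1

# Encoder input: source language code prepended, </s> appended.
source_tokens = [101, 102, 103]  # hypothetical subword ids for the source text
encoder_ids = [ENG_LATN] + source_tokens + [EOS]

# Decoder start: </s> followed by the forced target language code;
# the greedy loop then generates the translation after this prefix.
decoder_start = [EOS, JPN_JPAN]

# The models expect a fixed length of 128: build the attention mask
# first, then pad the ids to 128 positions.
attention_mask = [1] * len(encoder_ids) + [0] * (128 - len(encoder_ids))
encoder_ids = encoder_ids + [PAD] * (128 - len(encoder_ids))
```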

Related repos

  • nllb200-coreml-128-iphone-palettized – iPhone variant
  • nllb200-coreml-128-ipad-palettized – iPad variant

License

CC-BY-NC-4.0 (inherited from NLLB-200).
