BERT MLM Multilingual - CoreML

This is a CoreML conversion of bert-base-multilingual-cased with the Masked Language Model (MLM) head, optimized for iOS deployment.

Model Description

  • Base Model: bert-base-multilingual-cased by Google
  • Task: Masked Language Modeling (MLM)
  • Languages: 104 languages including English, Chinese, Cantonese
  • Format: CoreML (.mlmodelc)
  • Size: ~340MB compiled

Intended Use

This model is designed for grammar correction that preserves code-switching (mixed language text). Unlike sequence-to-sequence models that may normalize mixed language text to a single language, BERT MLM predicts masked tokens based on surrounding context, naturally preserving the original language mix.

Example Use Case

  • Input: "我今日去咗office開meeting" ("I went to the office for a meeting today", a Cantonese-English mix)
  • BERT MLM corrects grammar errors while keeping both languages intact

Model Files

  • vocab.txt - WordPiece vocabulary (119,547 tokens)
  • bert_mlm.mlmodelc/ - Compiled CoreML model for iOS
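The vocabulary file drives WordPiece tokenization: each word is split by greedy longest-match against the vocabulary, with continuation pieces prefixed by "##". A minimal sketch of that matching loop, using a toy inline vocabulary (real use would load all 119,547 entries from vocab.txt):

```swift
// Sketch: greedy longest-match WordPiece tokenization.
// Toy vocabulary for illustration; load vocab.txt in practice.
func wordPieceTokenize(_ word: String, vocab: Set<String>) -> [String] {
    var tokens: [String] = []
    var start = word.startIndex
    while start < word.endIndex {
        var end = word.endIndex
        var match: String? = nil
        // Try the longest remaining substring first, shrinking until it is in the vocab.
        while end > start {
            var piece = String(word[start..<end])
            if start > word.startIndex { piece = "##" + piece }  // continuation prefix
            if vocab.contains(piece) { match = piece; break }
            end = word.index(before: end)
        }
        guard let found = match else { return ["[UNK]"] }  // no piece matched
        tokens.append(found)
        start = end
    }
    return tokens
}

let toyVocab: Set<String> = ["meet", "##ing", "office", "[UNK]"]
let pieces = wordPieceTokenize("meeting", vocab: toyVocab)  // ["meet", "##ing"]
```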

Technical Details

  • Architecture: BERT-base (12 layers, 768 hidden, 12 attention heads)
  • Parameters: ~110M
  • Max Sequence Length: 128 tokens
  • Compute Units: CPU (for iOS background app compatibility)
  • Minimum iOS: 15.0
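Because the sequence length is fixed at 128, every input must be padded (or truncated) to exactly that length, with the attention mask marking which positions are real tokens. A sketch of that preparation step; the special-token IDs ([CLS] = 101, [SEP] = 102, [PAD] = 0) follow the standard BERT vocabulary layout, but verify them against this conversion's vocab.txt:

```swift
// Sketch: pad token IDs to the fixed 128-token window expected by the model.
// Special-token IDs assume the standard BERT vocab layout ([CLS]=101,
// [SEP]=102, [PAD]=0); check them against vocab.txt.
let maxLength = 128

func buildInputs(tokenIds: [Int32]) -> (inputIds: [Int32], attentionMask: [Int32]) {
    let clipped = Array(tokenIds.prefix(maxLength - 2))  // leave room for [CLS]/[SEP]
    var ids: [Int32] = [101] + clipped + [102]
    var mask = [Int32](repeating: 1, count: ids.count)   // 1 = real token
    let padCount = maxLength - ids.count
    ids += [Int32](repeating: 0, count: padCount)        // [PAD] = 0
    mask += [Int32](repeating: 0, count: padCount)       // 0 = padding, ignored
    return (ids, mask)
}
```

token_type_ids can be filled with zeros the same way, since single-sentence input uses segment 0 throughout.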

Usage on iOS

import CoreML

// Load model
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs as [1, 128] Int32 arrays
let inputIds = try MLMultiArray(shape: [1, 128], dataType: .int32)      // token IDs, with [MASK] at positions to predict
let attentionMask = try MLMultiArray(shape: [1, 128], dataType: .int32) // 1 = real token, 0 = padding
let tokenTypeIds = try MLMultiArray(shape: [1, 128], dataType: .int32)  // all zeros for a single sentence

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask),
    "token_type_ids": MLFeatureValue(multiArray: tokenTypeIds)
])
let output = try model.prediction(from: input)
let logits = output.featureValue(for: "logits")?.multiArrayValue
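Recovering a prediction from the logits is an argmax over the vocabulary dimension at each masked position. Assuming the output is laid out as [1, 128, vocabSize], one would first copy the scores for the [MASK] position into a plain array (e.g. `(0..<vocabSize).map { logits[[0, maskPos, $0]].floatValue }`) and then take the argmax, as in this sketch:

```swift
// Sketch: pick the highest-scoring vocabulary ID from one position's scores.
// `row` holds the vocabSize logits for a single [MASK] position.
func predictedTokenId(forRow row: [Float]) -> Int {
    var bestId = 0
    var bestScore = -Float.infinity
    for (id, score) in row.enumerated() where score > bestScore {
        bestId = id
        bestScore = score
    }
    return bestId
}
```

The resulting ID maps back to a token string via the same vocab.txt line order used for tokenization.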

License

This model is released under the Apache 2.0 License, same as the original bert-base-multilingual-cased model.

Citation

If you use this model, please cite the original BERT paper:

@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}

Limitations

  • Fixed sequence length of 128 tokens
  • CPU-only inference (Neural Engine requires fixed shapes which BERT's attention doesn't support well)
  • Grammar correction is limited to function words and particles to avoid semantic changes