BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper: arXiv:1810.04805
This is a CoreML conversion of bert-base-multilingual-cased with the Masked Language Model (MLM) head, optimized for iOS deployment.
This model is designed for grammar correction that preserves code-switching (mixed-language text). Unlike sequence-to-sequence models, which may normalize mixed-language text to a single language, BERT's MLM predicts each masked token from its surrounding context, naturally preserving the original language mix.
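As a sketch of how an MLM can drive correction while preserving code-switching (illustrated in Python; the scorer, threshold, and helper names are hypothetical stand-ins for the CoreML model, not part of this release): mask each position, ask the model for its preferred fill, and accept the suggestion only when the model strongly prefers it over the original token. Positions the model already agrees with, in either language, are left untouched.

```python
# Hypothetical sketch of MLM-driven grammar correction.
# `score` stands in for the real model: given tokens with one [MASK],
# it returns a probability distribution over candidates for that slot.
def correct(tokens, score, threshold=2.0):
    out = list(tokens)
    for i in range(len(tokens)):
        masked = out[:i] + ["[MASK]"] + out[i + 1:]
        probs = score(masked, i)                    # {token: probability}
        best = max(probs, key=probs.get)
        orig_p = probs.get(out[i], 1e-9)
        # Replace only when the model strongly prefers another token;
        # otherwise keep the original (preserving code-switched words).
        if best != out[i] and probs[best] / orig_p > threshold:
            out[i] = best
    return out

# Toy scorer with hard-coded distributions; a real MLM would compute
# these from the surrounding context.
def toy_score(masked, i):
    table = {0: {"eat": 0.8}, 1: {"an": 0.9, "a": 0.05}, 2: {"apple": 0.8}}
    return table[i]

print(correct(["eat", "a", "apple"], toy_score))  # only "a" -> "an" changes
```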
Files:
- vocab.txt - WordPiece vocabulary (119,547 tokens)
- bert_mlm.mlmodelc/ - Compiled CoreML model for iOS

Usage (Swift):

import CoreML
// Load the compiled model bundled with the app
let modelURL = Bundle.main.url(forResource: "bert_mlm", withExtension: "mlmodelc")!
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs: Int32 arrays of shape [1, sequenceLength]
let seqLen = 128
let inputIds = try MLMultiArray(shape: [1, NSNumber(value: seqLen)], dataType: .int32)      // WordPiece ids, with [MASK] at positions to predict
let attentionMask = try MLMultiArray(shape: [1, NSNumber(value: seqLen)], dataType: .int32) // 1 for real tokens, 0 for padding
let tokenTypeIds = try MLMultiArray(shape: [1, NSNumber(value: seqLen)], dataType: .int32)  // all zeros for a single sentence
// ... fill inputIds and attentionMask from your tokenizer ...

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask),
    "token_type_ids": MLFeatureValue(multiArray: tokenTypeIds)
])
let output = try model.prediction(from: input)
let logits = output.featureValue(for: "logits")?.multiArrayValue  // shape [1, seqLen, 119547]
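The logits output covers every vocabulary entry at every position; recovering predictions for a masked position is a softmax over the vocabulary dimension followed by a top-k sort. A minimal sketch of that post-processing in Python (toy 4-entry vocabulary and made-up logits; the real vocabulary has 119,547 entries):

```python
import math

def top_k(logits, vocab, k=3):
    """Softmax one position's logits, return the top-k (token, prob) pairs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]   # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(vocab, probs), key=lambda tp: tp[1], reverse=True)
    return ranked[:k]

# Toy vocabulary and logits for a single [MASK] position.
vocab = ["[UNK]", "la", "the", "el"]
logits = [0.1, 2.0, 2.3, 1.5]
print(top_k(logits, vocab, k=2))  # highest-probability fills first
```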
This model is released under the Apache 2.0 License, same as the original bert-base-multilingual-cased model.
If you use this model, please cite the original BERT paper:
@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
Base model: google-bert/bert-base-multilingual-cased