bluolightning/bert-tiny-japanese-char

text-generation


Custom Japanese BERT (4-layer)

This is a tiny Japanese BERT model with 4 Transformer layers, built for fast inference.

Model Background

Architecture: BERT (4 layers, hidden size 256, 4 attention heads, FFN intermediate size 1024)
Distillation: Distilled from a fine-tuned version of tohoku-nlp/bert-base-japanese-char-v2.
Initialization: The student model was randomly initialized rather than copied from the teacher's weights.
Tokenizer: Japanese character-level tokenizer, shared with the teacher.
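
To see where the parameters of such a small model go, the architecture figures above can be turned into a rough parameter count. The sketch below is only an estimate under stated assumptions: the vocabulary size (6144), the maximum sequence length (512), and the two token-type embeddings are not given in this card and are guesses based on typical BERT settings and the teacher's character vocabulary. It also ignores any pooler or task head, so it will not necessarily match the reported model size exactly. The function name `estimate_bert_encoder_params` is hypothetical.

```python
def estimate_bert_encoder_params(
    num_layers=4,        # from the card
    hidden=256,          # from the card
    ffn=1024,            # from the card
    vocab_size=6144,     # ASSUMPTION: character vocabulary shared with the teacher
    max_positions=512,   # ASSUMPTION: typical BERT maximum sequence length
    type_vocab=2,        # ASSUMPTION: standard BERT segment embeddings
):
    """Rough parameter count for a BERT encoder without pooler or task head."""
    layer_norm = 2 * hidden  # scale + bias

    # Word, position, and token-type embedding tables, plus one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Q, K, V, and output projections (weights + biases), plus one LayerNorm.
    # The number of heads does not change the count: heads split the hidden size.
    attention = 4 * (hidden * hidden + hidden) + layer_norm

    # Two dense layers (hidden -> ffn -> hidden) with biases, plus one LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + layer_norm

    return embeddings + num_layers * (attention + feed_forward)


total = estimate_bert_encoder_params()
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
# prints: 4,864,000 parameters (~4.86M)
```

Under these assumptions, roughly a third of the parameters sit in the embedding tables, which is one reason a compact character-level vocabulary suits a model this small.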


Model size: 4.84M params (Safetensors)
Tensor type: F32
