---
license: apache-2.0
datasets:
  - OmniAICreator/Japanese-Novels-23M
  - NilanE/ParallelFiction-Ja_En-100k
  - globis-university/aozorabunko-clean
  - joujiboi/Galgame-VisualNovel-Reupload
  - CC100
  - AnimeText
language:
  - ja
pipeline_tag: fill-mask
library_name: transformers
---

# Custom Japanese BERT (4-layer)

This is a tiny 4-layer Japanese BERT model, optimized for inference speed.

## Model Background

- **Architecture:** BERT (4 layers, 256 hidden size, 4 attention heads, 1024 FFN)
- **Distillation:** Distilled from a fine-tuned version of tohoku-nlp/bert-base-japanese-char-v2.
- **Initialization:** The student model was randomly initialized.
- **Tokenizer:** Japanese character-level tokenizer, shared with the teacher.
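
The student architecture above can be sketched as a `transformers` config. This is a minimal illustration only: the exact vocabulary size is not stated in this card (a placeholder is used below), so it is an assumption, not the released checkpoint.

```python
# Sketch of the 4-layer student described above.
# vocab_size is a PLACEHOLDER assumption; the card does not state the real value.
from transformers import BertConfig, BertForMaskedLM

config = BertConfig(
    vocab_size=6144,          # assumed char-level vocab size, not from the card
    hidden_size=256,          # hidden size from the card
    num_hidden_layers=4,      # 4 transformer layers
    num_attention_heads=4,    # 4 attention heads
    intermediate_size=1024,   # FFN size from the card
)

# Randomly initialized, matching how the student was initialized before distillation.
model = BertForMaskedLM(config)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```

With these dimensions the model stays small enough for fast CPU inference, which is the stated goal of the 4-layer design.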