Japanese Subject Insertion Model
A BERT-based token-classification model built on tohoku-nlp/bert-base-japanese, trained to predict, for a Japanese sentence with no explicit subject, the position where the subject would appear.
Model Uses
This model was trained as part of a larger project to predict implicit subjects in Japanese text. You can find the whole project here: https://github.com/Romi212/Japanese-Subject-Predictor-System
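Since this is a standard token-classification checkpoint, it can be loaded with the Hugging Face transformers library. The snippet below is a minimal inference sketch, assuming a labeling scheme in which the positive class marks the token position where a subject should be inserted; the actual label names and ids are defined in the model config and may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "Romi121/subject-insertion-model"
# The Japanese tokenizer inherited from tohoku-nlp/bert-base-japanese
# requires the fugashi and unidic-lite packages to be installed.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Example sentence with the subject omitted (illustrative input only).
sentence = "昨日、映画を見ました。"

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Pick the most likely label for each token; the positive label is assumed
# to mark "insert the subject before/at this token".
predictions = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label in zip(tokens, predictions):
    print(token, model.config.id2label[label])
```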
Training Details
Training Data
The model was trained on the UD_Japanese-GSDLUW treebank: https://github.com/UniversalDependencies/UD_Japanese-GSDLUW
The dataset was filtered to sentences containing an explicit subject. The subject was then removed from each sentence and its position recorded, so the model could be trained to predict where the subject should be inserted.
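The sketch below illustrates this preprocessing step. It is an assumption-laden simplification, not the project's actual pipeline: it uses the third-party conllu package, identifies subjects via the nsubj dependency relation, and ignores case particles and multi-word subjects.

```python
from conllu import parse_incr

def make_examples(conllu_path):
    """Build (tokens, insertion_index) pairs from a UD .conllu file.

    Keeps only sentences with exactly one nominal subject (nsubj),
    removes that subject token, and records the index where it stood,
    so a token-classification model can learn the insertion point.
    Simplified sketch; the project's real preprocessing may also handle
    case particles and multi-word subjects.
    """
    examples = []
    with open(conllu_path, "r", encoding="utf-8") as f:
        for sentence in parse_incr(f):
            subject_positions = [
                i for i, tok in enumerate(sentence) if tok["deprel"] == "nsubj"
            ]
            if len(subject_positions) != 1:
                continue  # skip sentences without exactly one explicit subject
            idx = subject_positions[0]
            tokens = [tok["form"] for i, tok in enumerate(sentence) if i != idx]
            # idx now points at the token that follows the gap; the model is
            # trained to tag that position as "insert the subject here".
            examples.append((tokens, idx))
    return examples

examples = make_examples("ja_gsdluw-ud-train.conllu")
print(examples[0])
```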