Japanese Subject Insertion Model

A BERT-based token classification model, fine-tuned from tohoku-nlp/bert-base-japanese, trained to predict where the subject should go in a Japanese sentence that has no explicit subject.

Model Uses

This model was trained as part of a larger project to predict implicit subjects in Japanese text. You can find the whole project here: https://github.com/Romi212/Japanese-Subject-Predictor-System
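A minimal inference sketch is shown below. It assumes the model follows the standard Hugging Face token-classification interface; the actual label names in id2label may differ from the hypothetical ones printed here, and the tohoku-nlp tokenizer additionally requires the fugashi and unidic-lite packages.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "Romi121/subject-insertion-model"  # repo name from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# A Japanese sentence with no explicit subject: "(I) went to school yesterday."
text = "昨日学校に行きました。"

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring label for each token; tokens flagged with a
# non-"O" label mark where the missing subject would be inserted.
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred_id in zip(tokens, predictions.tolist()):
    print(token, model.config.id2label[pred_id])
```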

Training Details

Training Data

The model was trained on the UD_Japanese-GSDLUW dataset: https://github.com/UniversalDependencies/UD_Japanese-GSDLUW

The dataset was reduced to only the sentences that contain a subject. The subject was then removed from each sentence and its position saved, so the model could be trained to predict where the subject should be inserted. A preprocessing sketch follows.
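The sketch below is a hypothetical reconstruction of that preprocessing, not the project's actual code. It assumes the .conllu files are read with the conllu library and that subjects are identified by the "nsubj" dependency relation; the real pipeline may also strip the subject's case particle and handle multiple subjects differently.

```python
from conllu import parse_incr

def make_examples(conllu_path):
    examples = []
    with open(conllu_path, encoding="utf-8") as f:
        for sentence in parse_incr(f):
            # Keep only sentences that contain an explicit subject.
            subj_indices = [i for i, tok in enumerate(sentence)
                            if tok["deprel"] == "nsubj"]
            if not subj_indices:
                continue
            subj_idx = subj_indices[0]
            # Remove the subject token and remember where it was, so the
            # model can be trained to predict the insertion point.
            tokens = [tok["form"] for i, tok in enumerate(sentence)
                      if i != subj_idx]
            examples.append({"tokens": tokens, "insert_before": subj_idx})
    return examples
```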

