---
license: mit
datasets:
- oscar-corpus/OSCAR-2301
language:
- tr
metrics:
- exact_match
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-generation
tags:
- lemmatization
- lemma
- turkish-lemma
---

This model was trained on data from turkish-nlp-suite/temiz-OSCAR, which was later chunked into a smaller subset so that every word could be lemmatized accurately. In total, 300k words were pulled from this dataset, with some unfit for lemmatization or morpheme segmentation (such as non-spesifik, baba-oğul,