scoup123/affixIdentifier

Text Classification

feature-extraction

text-embeddings-inference


scoup123 committed on Jan 10, 2024

Commit

3c7ff9e

·

verified ·

1 Parent(s): 3b0785d

Create README.md


README.md +66 -0


	@@ -0,0 +1,66 @@

+---
+datasets:
+- scoup123/AffixIdentifier
+language:
+- tr
+metrics:
+- accuracy
+pipeline_tag: text-classification
+---
+Model Description
+Given two Turkish words, the model predicts whether they share an affix. It is fine-tuned from dbmdz/bert-base-turkish-cased on a task similar to NLI, but at the word level and with two labels. It was created as a final project for one of my classes.
+Developed by: Scoup123
+Model type: BERT
+Language(s) (NLP): Turkish
+Finetuned from model [optional]: dbmdz/bert-base-turkish-cased
+Model Sources [optional]
+Repository: [More Information Needed]
+Paper [optional]: in progress
+Uses
+It can be used in morphological analysis tasks.
+Direct Use
+It can probably be used on Turkish text without additional fine-tuning.
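A minimal sketch of direct use with the `transformers` library follows. Several things here are assumptions rather than details confirmed by the card: that the two words are passed to the tokenizer as a text pair (as in NLI), that the base model's tokenizer (`dbmdz/bert-base-turkish-cased`) matches the one used in training, and that the repository `scoup123/affixIdentifier` loads as a sequence-classification model. The card does not say which label index means "shares an affix", so the function returns raw label indices. `split_pairs` and `predict_label_ids` are illustrative names.

```python
from typing import List, Sequence, Tuple

MODEL_ID = "scoup123/affixIdentifier"           # repository name shown on this page
TOKENIZER_ID = "dbmdz/bert-base-turkish-cased"  # assumption: base tokenizer is unchanged


def split_pairs(pairs: Sequence[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (word_a, word_b) pairs into two parallel lists for the tokenizer."""
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    return first, second


def predict_label_ids(pairs: Sequence[Tuple[str, str]]) -> List[int]:
    """Return one predicted label index per word pair.

    Assumption: the model expects the two words as a text pair.
    The meaning of each label index is not documented in the card.
    """
    # Imported lazily so split_pairs can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    first, second = split_pairs(pairs)
    encoded = tokenizer(first, second, padding=True, truncation=True,
                        return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoded).logits
    return logits.argmax(dim=-1).tolist()


# Example call (downloads the model, so it is left as a comment):
# predict_label_ids([("evler", "kitaplar"), ("evde", "okulda")])
```

Checking the returned indices against a few pairs whose answer is known is a quick way to work out the label mapping before relying on the predictions.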
+Training Details
+Training Data
+scoup123/affixfinder
+The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).
+Evaluation
+Test Accuracy: 0.9874 Precision: 0.9874 Recall: 0.9874 F1 Score: 0.9874
+**Use the model with caution, as these scores are suspiciously high.**
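The four reported scores are identical. The card does not say how they were averaged, but one possible explanation (an assumption, not something the source confirms) is micro-averaging: for single-label classification, micro-averaged precision, recall, and F1 are always equal to accuracy. The sketch below computes them from scratch on made-up labels to show the effect; `micro_scores` is an illustrative name.

```python
from typing import Dict, Sequence


def micro_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus micro-averaged precision, recall, and F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool true positives, false positives, and false negatives over all labels.
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only: 6 of 8 predictions are correct.
scores = micro_scores([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
print(scores)
```

If this is what happened, the precision, recall, and F1 figures carry no information beyond accuracy, and per-class scores would be more informative.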
+Testing Data, Factors & Metrics
+Testing Data
+A test split was created from the training data.
+Summary
+This model aims to create an affix identifier for Turkish.
+Model Examination [optional]
+The model was only recently created, so further testing is needed to check whether it actually works. Verify it on your own data before using it.
+[More Information Needed]
+Environmental Impact
+Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
+Hardware Type: Free Colab T4 GPU
+Hours used: ~2.5 hours
+Cloud Provider: Google
+Compute Region: Europe
+Carbon Emitted: [More Information Needed]
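The missing carbon figure could be approximated as energy used times the carbon intensity of the grid. The sketch below is only an illustration under stated assumptions: it treats the T4 as drawing its 70 W rated board power for the full ~2.5 hours (an upper bound for the GPU alone, ignoring CPU and datacenter overhead), and the grid intensity passed in the example is a placeholder, not a measured value for the region used. `energy_kwh` and `emissions_kg` are illustrative names.

```python
T4_BOARD_POWER_W = 70.0  # rated board power of an NVIDIA T4
HOURS_USED = 2.5         # "~2.5 hours" from the model card


def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kWh, assuming constant power draw for the whole run."""
    return power_watts * hours / 1000.0


def emissions_kg(energy: float, intensity_kg_per_kwh: float) -> float:
    """CO2-equivalent emissions in kg for a given grid carbon intensity."""
    return energy * intensity_kg_per_kwh


energy = energy_kwh(T4_BOARD_POWER_W, HOURS_USED)
# Placeholder intensity for illustration only; replace with the real
# value for the actual compute region before reporting anything.
PLACEHOLDER_INTENSITY = 0.3
print(f"energy: {energy:.3f} kWh")
print(f"illustrative emissions: {emissions_kg(energy, PLACEHOLDER_INTENSITY):.4f} kg CO2eq")
```

The result is a rough estimate and should not be entered in the card as a measurement without a sourced intensity value.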
+Citation [optional]
+APA:
+Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.
+Model Card Authors [optional]
+Kaan Bayar
+Model Card Contact
+kaan.bayar13@gmail.com