---
language: id
license: apache-2.0
tags:
- indobert
- text-classification
- plagiarism-detection
- indonesian
- fine-tuned
pipeline_tag: text-classification
widget:
- text: "Apa pengganti y?"
  example_title: "Plagiarism Example"
- text: "Bagaimana cara belajar Python?"
  example_title: "Paraphrase Example"
---
# IndoBERT Plagiarism Detector
An **IndoBERT-base-p1** model fine-tuned for **plagiarism detection in Indonesian text** (3 classes):
- `LABEL_0` → 🟢 Not Similar
- `LABEL_1` → 🟡 Paraphrase
- `LABEL_2` → 🔴 Plagiarism (literal/copy-paste)

**Model input**: Provide two texts or sentences (`question1` and `question2`); the model predicts how similar they are.
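
The raw `LABEL_n` identifiers can be mapped to the class names listed above before showing results to users. The following is a minimal sketch; `LABEL_NAMES` and `describe` are hypothetical names introduced here for illustration and are not part of the model repository. It assumes a prediction shaped like the usual pipeline output, a dict with `label` and `score` keys.

```python
# Hypothetical helper: maps the model's raw labels to the class names in this card.
LABEL_NAMES = {
    "LABEL_0": "Not Similar",
    "LABEL_1": "Paraphrase",
    "LABEL_2": "Plagiarism",
}


def describe(prediction: dict) -> str:
    """Turn a pipeline-style prediction dict into a readable string."""
    name = LABEL_NAMES.get(prediction["label"], "Unknown")
    return f"{name} ({prediction['score']:.0%})"


print(describe({"label": "LABEL_2", "score": 0.99}))
```

Falling back to `"Unknown"` keeps the helper from raising if the label set ever changes.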
### Performance
- Accuracy: **78.33%** (test set of 300 samples)
- Dataset: 3,000 balanced samples plus synthetic augmentation
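
To put the accuracy figure in context: on a 300-sample test set, each sample is worth about 0.33 percentage points, and 78.33% is what 235 correct predictions out of 300 would round to. The count of 235 is inferred from the rounding, not stated in this card. A small sketch of the arithmetic, with `accuracy` as a hypothetical helper:

```python
def accuracy(correct: int, total: int) -> float:
    """Share of correct predictions, expressed as a percentage."""
    return 100.0 * correct / total


# 235 is an inferred count (assumption), chosen because it reproduces 78.33% on 300 samples.
print(f"{accuracy(235, 300):.2f}%")
```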
### Usage in Python
```python
from transformers import pipeline

detector = pipeline("text-classification", model="putraharifin/tubes_deep_learning")

# Pass the two sentences as a text/text_pair dict so they are encoded together as one pair.
result = detector({"text": "Apa pengganti y?", "text_pair": "Apa pengganti y dong"})
print(result)  # {'label': 'LABEL_2', 'score': 0.99}
```