sinhala-nlp/NSINA-Media-sinbert-large

Text Classification

tharindu committed on Mar 18, 2024

Commit 22b2bc9 · verified · 1 Parent(s): 7fab195

Update README.md

Files changed (1)

README.md +30 -1

README.md CHANGED

@@ -5,4 +5,33 @@ datasets:
 - sinhala-nlp/NSINA-Media
 language:
 - si
----

+---
+# Sinhala News Media Identification
+This is a text classification task created from the [NSINA dataset](https://github.com/Sinhala-NLP/NSINA). The dataset is released under the same license as NSINA.
+## Data
+The data can be loaded into pandas DataFrames with the following code.
+```python
+from datasets import load_dataset
+
+# Load each split from the Hugging Face Hub and convert it to a pandas DataFrame.
+train = load_dataset('sinhala-nlp/NSINA-Media', split='train').to_pandas()
+test = load_dataset('sinhala-nlp/NSINA-Media', split='test').to_pandas()
+```
+## Citation
+If you use the dataset or the models, please cite the following paper.
+~~~
+@inproceedings{Nsina2024,
+author={Hettiarachchi, Hansi and Premasiri, Damith and Uyangodage, Lasitha and Ranasinghe, Tharindu},
+title={{NSINA: A News Corpus for Sinhala}},
+booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+year={2024},
+month={May},
+}
+~~~
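
As a follow-up to the Data section of the README above, a common first step after loading a classification split is to check how the labels are distributed. The sketch below is only an illustration: the helper `label_counts` and the column names `text` and `label` are hypothetical, since the README does not state the dataset's schema, and the rows are stand-in values rather than real NSINA-Media records.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def label_counts(rows: Iterable[Mapping[str, Any]], label_column: str) -> Dict[Any, int]:
    """Count how many rows carry each value of `label_column`."""
    return dict(Counter(row[label_column] for row in rows))


# Stand-in rows; the real column names and label values are assumptions.
sample_rows = [
    {"text": "headline one", "label": "source_a"},
    {"text": "headline two", "label": "source_b"},
    {"text": "headline three", "label": "source_a"},
]

counts = label_counts(sample_rows, label_column="label")
print(counts)
```

With the DataFrames from the README, the same helper could be applied as `label_counts(train.to_dict("records"), "<label column>")`, after checking `train.columns` for the actual name of the label column.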