Taja Kuzman
committed on
Update README.md
README.md
CHANGED
@@ -128,6 +128,8 @@ base_model:
 News topic classification model based on [`xlm-roberta-large`](https://huggingface.co/FacebookAI/xlm-roberta-large)
 and fine-tuned on a [news corpus in 4 languages](http://hdl.handle.net/11356/1991) (Croatian, Slovenian, Catalan and Greek), annotated with the [top-level IPTC
 Media Topic NewsCodes labels](https://www.iptc.org/std/NewsCodes/treeview/mediatopic/mediatopic-en-GB.html).
+The development and evaluation of the model is described in the paper
+[LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification](https://doi.org/10.1109/ACCESS.2025.3544814) (Kuzman and Ljubešić, 2025).
 
 The model can be used for classification into topic labels from the
 [IPTC NewsCodes schema](https://iptc.org/std/NewsCodes/guidelines/#_what_are_the_iptc_newscodes) and can be
@@ -316,15 +318,19 @@ model_args ={
 
 ## Citation
 
-
+If you use the model, please cite [this paper](https://doi.org/10.1109/ACCESS.2025.3544814):
 
 ```
-@
-
-
-
-
-}
+@ARTICLE{10900365,
+  author={Kuzman, Taja and Ljubešić, Nikola},
+  journal={IEEE Access},
+  title={LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification},
+  year={2025},
+  volume={},
+  number={},
+  pages={1-1},
+  keywords={Data models;Annotations;Media;Manuals;Multilingual;Computational modeling;Training;Training data;Transformers;Text categorization;Multilingual text classification;IPTC;large language models;LLMs;news topic;topic classification;training data preparation;data annotation},
+  doi={10.1109/ACCESS.2025.3544814}}
 ```
 
 ## Funding