How to use c-ho/2026-05-29-crf-classweights-clean with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="c-ho/2026-05-29-crf-classweights-clean")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("c-ho/2026-05-29-crf-classweights-clean")
model = AutoModelForTokenClassification.from_pretrained("c-ho/2026-05-29-crf-classweights-clean")
```
2026-05-29-crf-classweights-clean
This model is a fine-tuned version of Davlan/bert-base-multilingual-cased-ner-hrl on the annotated BLL dataset described under Training and evaluation data. It achieves the following results on the evaluation set:
- Loss: 26.0250
- Precision: 0.7676
- Recall: 0.8088
- F1: 0.7877
- Accuracy: 0.9662
- Academicdiscipline F1: 0.6667
- Ambiguouslydefinedconcept F1: 0.8383
- Discoursephenomenon F1: 0.7154
- Graphemicphenomenon F1: 0.0
- Languagerelatedterm F1: 0.8106
- Languageresourceinformation F1: 0.7816
- Lexicalphenomenon F1: 0.6877
- Morphologicalphenomenon F1: 0.8057
- Morphosyntacticphenomenon F1: 0.8595
- New Tag F1: 0.8218
- Otherlinguisticterm F1: 0.7259
- Phonologicalphenomenon F1: 0.8404
- Semanticphenomenon F1: 0.6452
- Syntacticphenomenon F1: 0.7594
- Topnode Dummy F1: 0.6845
- Unclassifiedlinguisticconcept F1: 0.8623
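As a quick consistency check on the figures above, the overall F1 can be recomputed as the harmonic mean of the reported precision and recall, and compared with the unweighted mean of the per-class F1 scores. This is a minimal sketch: the card does not state how the overall score is averaged, so the macro figure here is only a point of comparison, and the variable names are illustrative.

```python
# Overall precision and recall as reported on the evaluation set.
precision = 0.7676
recall = 0.8088

# F1 is the harmonic mean of precision and recall.
overall_f1 = 2 * precision * recall / (precision + recall)

# Per-class F1 scores copied from the list above.
per_class_f1 = {
    "Academicdiscipline": 0.6667,
    "Ambiguouslydefinedconcept": 0.8383,
    "Discoursephenomenon": 0.7154,
    "Graphemicphenomenon": 0.0,
    "Languagerelatedterm": 0.8106,
    "Languageresourceinformation": 0.7816,
    "Lexicalphenomenon": 0.6877,
    "Morphologicalphenomenon": 0.8057,
    "Morphosyntacticphenomenon": 0.8595,
    "New Tag": 0.8218,
    "Otherlinguisticterm": 0.7259,
    "Phonologicalphenomenon": 0.8404,
    "Semanticphenomenon": 0.6452,
    "Syntacticphenomenon": 0.7594,
    "Topnode Dummy": 0.6845,
    "Unclassifiedlinguisticconcept": 0.8623,
}

# Unweighted (macro) mean: every class counts equally, however rare it is.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"overall F1 from precision/recall: {overall_f1:.4f}")
print(f"macro F1 over {len(per_class_f1)} classes: {macro_f1:.4f}")
```

The recomputed overall F1 agrees with the reported 0.7877, while the macro mean is lower, largely because Graphemicphenomenon scores 0.0.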
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
Trained on an annotated dataset of academic publications from the Bibliography of Linguistic Literature (BLL), labelled by a human annotator (c-ho/2026-05-28_ub_opus_bll_ner_annotation).
Label distribution:

| Label | Count | Proportion |
|---|---|---|
| B-AcademicDiscipline | 40 | 0.000103 |
| B-AmbiguouslyDefinedConcept | 253 | 0.000649 |
| B-DiscoursePhenomenon | 199 | 0.000511 |
| B-GraphemicPhenomenon | 13 | 0.000033 |
| B-LanguageRelatedTerm | 2293 | 0.005886 |
| B-LanguageResourceInformation | 1207 | 0.003098 |
| B-LexicalPhenomenon | 581 | 0.001491 |
| B-MorphologicalPhenomenon | 1393 | 0.003576 |
| B-MorphosyntacticPhenomenon | 5665 | 0.014542 |
| B-NEW_TAG | 3249 | 0.008340 |
| B-OtherLinguisticTerm | 2338 | 0.006002 |
| B-PhonologicalPhenomenon | 3424 | 0.008790 |
| B-SemanticPhenomenon | 900 | 0.002310 |
| B-SyntacticPhenomenon | 2765 | 0.007098 |
| B-TOPNODE_DUMMY | 4769 | 0.012242 |
| B-UnclassifiedLinguisticConcept | 274 | 0.000703 |
| I-AcademicDiscipline | 30 | 0.000077 |
| I-AmbiguouslyDefinedConcept | 2 | 0.000005 |
| I-DiscoursePhenomenon | 29 | 0.000074 |
| I-GraphemicPhenomenon | 5 | 0.000013 |
| I-LanguageRelatedTerm | 136 | 0.000349 |
| I-LanguageResourceInformation | 73 | 0.000187 |
| I-LexicalPhenomenon | 59 | 0.000151 |
| I-MorphologicalPhenomenon | 23 | 0.000059 |
| I-MorphosyntacticPhenomenon | 354 | 0.000909 |
| I-NEW_TAG | 124 | 0.000318 |
| I-OtherLinguisticTerm | 112 | 0.000288 |
| I-PhonologicalPhenomenon | 206 | 0.000529 |
| I-SemanticPhenomenon | 86 | 0.000221 |
| I-SyntacticPhenomenon | 252 | 0.000647 |
| I-TOPNODE_DUMMY | 525 | 0.001348 |
| I-UnclassifiedLinguisticConcept | 23 | 0.000059 |
| O | 358151 | 0.919390 |
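The distribution is heavily skewed toward the O label, and the model name suggests that class weights were used during training. The card does not document the weighting formula, so the following is only a sketch of one common heuristic, inverse-frequency ("balanced") weighting, applied to a few of the counts above. The formula, the chosen subset of labels, and all variable names are assumptions for illustration.

```python
# A few label counts copied from the distribution above (subset for brevity).
counts = {
    "B-GraphemicPhenomenon": 13,
    "B-AcademicDiscipline": 40,
    "B-MorphosyntacticPhenomenon": 5665,
    "O": 358151,
}

# Totals over the full distribution: 33 labels whose counts sum to 389553.
TOTAL_TOKENS = 389_553
NUM_LABELS = 33

# Assumed heuristic: weight = total / (num_labels * count).
# A label with exactly the average frequency gets weight 1.0;
# rarer labels get larger weights, frequent labels smaller ones.
weights = {
    label: TOTAL_TOKENS / (NUM_LABELS * count)
    for label, count in counts.items()
}

for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label:32s} {weight:10.4f}")
```

Under this heuristic the O label is strongly down-weighted and the rarest labels receive weights in the hundreds, which in practice is often clipped or smoothed (for example by taking a square root) to keep training stable.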
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 0.1
- num_epochs: 15
- mixed_precision_training: Native AMP
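The arithmetic behind these settings can be sketched as follows. The single-device setup, the 196 optimizer steps per epoch (taken from the Step column of the training results), and the reading of the warmup value 0.1 as a fraction of total steps are assumptions, not facts stated in the list. `linear_lr` is a hypothetical helper that mimics a linear warmup and decay schedule; it is not the Trainer's own implementation.

```python
# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 16
gradient_accumulation_steps = 4
num_epochs = 15

# Assumptions (not stated in the list): one device, 196 optimizer steps per
# epoch, and the warmup value 0.1 read as a fraction of the total steps.
num_devices = 1
steps_per_epoch = 196
warmup_fraction = 0.1

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
total_steps = steps_per_epoch * num_epochs
warmup_steps = round(warmup_fraction * total_steps)


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
print("total optimizer steps:", total_steps)
print("warmup steps:", warmup_steps)
print("lr at end of warmup:", linear_lr(warmup_steps))
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 64, and the total step count matches the final step logged in the training results.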
Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Academicdiscipline F1 | Ambiguouslydefinedconcept F1 | Discoursephenomenon F1 | Graphemicphenomenon F1 | Languagerelatedterm F1 | Languageresourceinformation F1 | Lexicalphenomenon F1 | Morphologicalphenomenon F1 | Morphosyntacticphenomenon F1 | New Tag F1 | Otherlinguisticterm F1 | Phonologicalphenomenon F1 | Semanticphenomenon F1 | Syntacticphenomenon F1 | Topnode Dummy F1 | Unclassifiedlinguisticconcept F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 196 | 88.7338 | 0.0 | 0.0 | 0.0 | 0.9155 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 2.0 | 392 | 27.9599 | 0.6174 | 0.6089 | 0.6131 | 0.9460 | 0.0 | 0.0698 | 0.0 | 0.0 | 0.6775 | 0.5802 | 0.4863 | 0.7394 | 0.7285 | 0.6741 | 0.5367 | 0.7704 | 0.2400 | 0.5667 | 0.3722 | 0.2692 |
| 467.1807 | 3.0 | 588 | 19.9192 | 0.7356 | 0.7048 | 0.7199 | 0.9592 | 1.0 | 0.8156 | 0.0606 | 0.0 | 0.7185 | 0.6855 | 0.6418 | 0.8226 | 0.8036 | 0.7641 | 0.6416 | 0.8152 | 0.5830 | 0.6648 | 0.5576 | 0.8025 |
| 467.1807 | 4.0 | 784 | 18.3366 | 0.7491 | 0.7535 | 0.7513 | 0.9624 | 1.0 | 0.8608 | 0.6491 | 0.0 | 0.7304 | 0.7740 | 0.6737 | 0.8094 | 0.8145 | 0.8034 | 0.7044 | 0.8202 | 0.5980 | 0.6999 | 0.6328 | 0.8221 |
| 467.1807 | 5.0 | 980 | 18.2944 | 0.7686 | 0.7503 | 0.7593 | 0.9629 | 0.6667 | 0.8875 | 0.7458 | 0.0 | 0.7472 | 0.7321 | 0.7125 | 0.8274 | 0.8453 | 0.8007 | 0.6836 | 0.8327 | 0.6541 | 0.7058 | 0.6227 | 0.8205 |
| 53.9369 | 6.0 | 1176 | 18.9329 | 0.7585 | 0.7894 | 0.7736 | 0.9648 | 0.6667 | 0.7907 | 0.6833 | 0.0 | 0.7707 | 0.7665 | 0.7120 | 0.8384 | 0.8476 | 0.8157 | 0.6921 | 0.8371 | 0.6336 | 0.7322 | 0.6695 | 0.8690 |
| 53.9369 | 7.0 | 1372 | 19.5750 | 0.7704 | 0.7822 | 0.7763 | 0.9650 | 0.5 | 0.8287 | 0.7 | 0.0 | 0.7796 | 0.7485 | 0.7356 | 0.8130 | 0.8397 | 0.8107 | 0.7262 | 0.8349 | 0.6524 | 0.7426 | 0.6786 | 0.8161 |
| 20.5433 | 8.0 | 1568 | 21.3628 | 0.7546 | 0.8024 | 0.7778 | 0.9641 | 0.6667 | 0.8588 | 0.7287 | 0.0 | 0.7911 | 0.7765 | 0.6806 | 0.8156 | 0.8496 | 0.8152 | 0.7453 | 0.8285 | 0.6667 | 0.7269 | 0.6627 | 0.8136 |
| 20.5433 | 9.0 | 1764 | 22.3521 | 0.7415 | 0.8176 | 0.7777 | 0.9641 | 0.5 | 0.8304 | 0.6963 | 0.0 | 0.8113 | 0.7553 | 0.6927 | 0.8254 | 0.8470 | 0.7913 | 0.7244 | 0.8449 | 0.6528 | 0.7524 | 0.6684 | 0.8295 |
| 20.5433 | 10.0 | 1960 | 22.8445 | 0.7467 | 0.8057 | 0.7751 | 0.9641 | 0.4 | 0.8488 | 0.736 | 0.0 | 0.8021 | 0.7712 | 0.6825 | 0.8156 | 0.8485 | 0.7956 | 0.7211 | 0.8409 | 0.6508 | 0.7339 | 0.6658 | 0.8353 |
| 9.7332 | 11.0 | 2156 | 23.1898 | 0.7552 | 0.8109 | 0.7821 | 0.9649 | 0.4 | 0.8652 | 0.7438 | 0.0 | 0.8028 | 0.7825 | 0.6862 | 0.8149 | 0.8556 | 0.8177 | 0.7235 | 0.8374 | 0.6721 | 0.7401 | 0.6690 | 0.8795 |
| 9.7332 | 12.0 | 2352 | 24.5698 | 0.7510 | 0.8150 | 0.7817 | 0.9648 | 0.4 | 0.8353 | 0.7317 | 0.0 | 0.8040 | 0.7925 | 0.6918 | 0.8137 | 0.8582 | 0.8115 | 0.7303 | 0.8303 | 0.6510 | 0.7598 | 0.6615 | 0.8727 |
| 5.2110 | 13.0 | 2548 | 24.8242 | 0.7747 | 0.7978 | 0.7861 | 0.9660 | 0.6667 | 0.8284 | 0.7059 | 0.0 | 0.8094 | 0.7827 | 0.7086 | 0.8082 | 0.8520 | 0.8198 | 0.7372 | 0.8418 | 0.6630 | 0.7506 | 0.6799 | 0.8623 |
| 5.2110 | 14.0 | 2744 | 25.8318 | 0.7623 | 0.8054 | 0.7833 | 0.9655 | 0.6667 | 0.8353 | 0.6777 | 0.0 | 0.8038 | 0.7760 | 0.7059 | 0.7989 | 0.8582 | 0.8172 | 0.7279 | 0.8405 | 0.6368 | 0.7573 | 0.6708 | 0.8623 |
| 5.2110 | 15.0 | 2940 | 26.0250 | 0.7676 | 0.8088 | 0.7877 | 0.9662 | 0.6667 | 0.8383 | 0.7154 | 0.0 | 0.8106 | 0.7816 | 0.6877 | 0.8057 | 0.8595 | 0.8218 | 0.7259 | 0.8404 | 0.6452 | 0.7594 | 0.6845 | 0.8623 |
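In the table above, validation loss and overall F1 disagree about which epoch is best. The short sketch below finds both optima from values copied out of the table. The card does not say which criterion, if any, was used for checkpoint selection, although the evaluation figures reported at the top match the epoch 15 row.

```python
# (epoch, validation loss, overall F1) copied from the training results table.
history = [
    (1, 88.7338, 0.0),
    (2, 27.9599, 0.6131),
    (3, 19.9192, 0.7199),
    (4, 18.3366, 0.7513),
    (5, 18.2944, 0.7593),
    (6, 18.9329, 0.7736),
    (7, 19.5750, 0.7763),
    (8, 21.3628, 0.7778),
    (9, 22.3521, 0.7777),
    (10, 22.8445, 0.7751),
    (11, 23.1898, 0.7821),
    (12, 24.5698, 0.7817),
    (13, 24.8242, 0.7861),
    (14, 25.8318, 0.7833),
    (15, 26.0250, 0.7877),
]

# Lowest validation loss versus highest overall F1.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest overall F1:     epoch {best_by_f1[0]} ({best_by_f1[2]})")
```

Validation loss bottoms out at epoch 5 and rises afterwards, while F1 keeps improving slowly through epoch 15, so the two criteria would select different checkpoints.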
Framework versions
- Transformers 5.8.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.22.2