2026-05-29-crf-classweights-clean

This model is a fine-tuned version of Davlan/bert-base-multilingual-cased-ner-hrl on the dataset described under "Training and evaluation data". It achieves the following results on the evaluation set:

  • Loss: 26.0250
  • Precision: 0.7676
  • Recall: 0.8088
  • F1: 0.7877
  • Accuracy: 0.9662
  • Academicdiscipline F1: 0.6667
  • Ambiguouslydefinedconcept F1: 0.8383
  • Discoursephenomenon F1: 0.7154
  • Graphemicphenomenon F1: 0.0
  • Languagerelatedterm F1: 0.8106
  • Languageresourceinformation F1: 0.7816
  • Lexicalphenomenon F1: 0.6877
  • Morphologicalphenomenon F1: 0.8057
  • Morphosyntacticphenomenon F1: 0.8595
  • New Tag F1: 0.8218
  • Otherlinguisticterm F1: 0.7259
  • Phonologicalphenomenon F1: 0.8404
  • Semanticphenomenon F1: 0.6452
  • Syntacticphenomenon F1: 0.7594
  • Topnode Dummy F1: 0.6845
  • Unclassifiedlinguisticconcept F1: 0.8623
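
As a quick sanity check on the figures above, the sketch below recomputes the overall F1 as the harmonic mean of the reported precision and recall, and compares it with an unweighted (macro) average of the per-class F1 scores. It assumes the overall scores are micro-averaged over entities, which the card does not state; the variable names are illustrative only.

```python
# Overall scores as reported in the evaluation results.
precision = 0.7676
recall = 0.8088

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Per-class F1 scores, copied from the list above.
per_class_f1 = {
    "AcademicDiscipline": 0.6667,
    "AmbiguouslyDefinedConcept": 0.8383,
    "DiscoursePhenomenon": 0.7154,
    "GraphemicPhenomenon": 0.0,
    "LanguageRelatedTerm": 0.8106,
    "LanguageResourceInformation": 0.7816,
    "LexicalPhenomenon": 0.6877,
    "MorphologicalPhenomenon": 0.8057,
    "MorphosyntacticPhenomenon": 0.8595,
    "NEW_TAG": 0.8218,
    "OtherLinguisticTerm": 0.7259,
    "PhonologicalPhenomenon": 0.8404,
    "SemanticPhenomenon": 0.6452,
    "SyntacticPhenomenon": 0.7594,
    "TOPNODE_DUMMY": 0.6845,
    "UnclassifiedLinguisticConcept": 0.8623,
}

# Unweighted (macro) average: every class counts equally, however rare.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"F1 from precision and recall: {f1:.4f}")
print(f"Macro-averaged per-class F1:  {macro_f1:.4f}")
```

The macro average comes out lower than the overall F1 because rare classes such as GraphemicPhenomenon (F1 of 0.0) weigh as much as frequent ones.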

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

Trained on an annotated dataset of academic publications listed in the Bibliography of Linguistic Literature (BLL), labelled by a human annotator (c-ho/2026-05-28_ub_opus_bll_ner_annotation).

========== LABEL DISTRIBUTION ==========

Label | Count | Proportion
B-AcademicDiscipline | 40 | 0.000103
B-AmbiguouslyDefinedConcept | 253 | 0.000649
B-DiscoursePhenomenon | 199 | 0.000511
B-GraphemicPhenomenon | 13 | 0.000033
B-LanguageRelatedTerm | 2293 | 0.005886
B-LanguageResourceInformation | 1207 | 0.003098
B-LexicalPhenomenon | 581 | 0.001491
B-MorphologicalPhenomenon | 1393 | 0.003576
B-MorphosyntacticPhenomenon | 5665 | 0.014542
B-NEW_TAG | 3249 | 0.008340
B-OtherLinguisticTerm | 2338 | 0.006002
B-PhonologicalPhenomenon | 3424 | 0.008790
B-SemanticPhenomenon | 900 | 0.002310
B-SyntacticPhenomenon | 2765 | 0.007098
B-TOPNODE_DUMMY | 4769 | 0.012242
B-UnclassifiedLinguisticConcept | 274 | 0.000703
I-AcademicDiscipline | 30 | 0.000077
I-AmbiguouslyDefinedConcept | 2 | 0.000005
I-DiscoursePhenomenon | 29 | 0.000074
I-GraphemicPhenomenon | 5 | 0.000013
I-LanguageRelatedTerm | 136 | 0.000349
I-LanguageResourceInformation | 73 | 0.000187
I-LexicalPhenomenon | 59 | 0.000151
I-MorphologicalPhenomenon | 23 | 0.000059
I-MorphosyntacticPhenomenon | 354 | 0.000909
I-NEW_TAG | 124 | 0.000318
I-OtherLinguisticTerm | 112 | 0.000288
I-PhonologicalPhenomenon | 206 | 0.000529
I-SemanticPhenomenon | 86 | 0.000221
I-SyntacticPhenomenon | 252 | 0.000647
I-TOPNODE_DUMMY | 525 | 0.001348
I-UnclassifiedLinguisticConcept | 23 | 0.000059
O | 358151 | 0.919390

========================================
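
The distribution above is heavily skewed towards the O label, and the model name suggests class weights were used to compensate. The card does not say how those weights were derived, so the following is only a minimal sketch of one common scheme (inverse-frequency "balanced" weights, total / (num_labels * count)), applied to the counts listed above. The formula and all variable names are assumptions, not the author's confirmed method.

```python
# Token counts per label, copied from the label distribution above.
label_counts = {
    "B-AcademicDiscipline": 40, "B-AmbiguouslyDefinedConcept": 253,
    "B-DiscoursePhenomenon": 199, "B-GraphemicPhenomenon": 13,
    "B-LanguageRelatedTerm": 2293, "B-LanguageResourceInformation": 1207,
    "B-LexicalPhenomenon": 581, "B-MorphologicalPhenomenon": 1393,
    "B-MorphosyntacticPhenomenon": 5665, "B-NEW_TAG": 3249,
    "B-OtherLinguisticTerm": 2338, "B-PhonologicalPhenomenon": 3424,
    "B-SemanticPhenomenon": 900, "B-SyntacticPhenomenon": 2765,
    "B-TOPNODE_DUMMY": 4769, "B-UnclassifiedLinguisticConcept": 274,
    "I-AcademicDiscipline": 30, "I-AmbiguouslyDefinedConcept": 2,
    "I-DiscoursePhenomenon": 29, "I-GraphemicPhenomenon": 5,
    "I-LanguageRelatedTerm": 136, "I-LanguageResourceInformation": 73,
    "I-LexicalPhenomenon": 59, "I-MorphologicalPhenomenon": 23,
    "I-MorphosyntacticPhenomenon": 354, "I-NEW_TAG": 124,
    "I-OtherLinguisticTerm": 112, "I-PhonologicalPhenomenon": 206,
    "I-SemanticPhenomenon": 86, "I-SyntacticPhenomenon": 252,
    "I-TOPNODE_DUMMY": 525, "I-UnclassifiedLinguisticConcept": 23,
    "O": 358151,
}

total = sum(label_counts.values())
num_labels = len(label_counts)

# Proportion of all tokens carrying each label (third column of the table).
proportions = {label: count / total for label, count in label_counts.items()}

# ASSUMPTION: "balanced" inverse-frequency weighting; rare labels get
# large weights, the dominant O label gets a weight well below 1.
class_weights = {
    label: total / (num_labels * count) for label, count in label_counts.items()
}

for label in ("O", "B-MorphosyntacticPhenomenon", "I-AmbiguouslyDefinedConcept"):
    print(f"{label}: proportion={proportions[label]:.6f}, weight={class_weights[label]:.3f}")
```

With counts this uneven, raw inverse-frequency weights span several orders of magnitude, so in practice they are often clipped or smoothed (for example with a square root) before use.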

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 15
  • mixed_precision_training: Native AMP
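
To make the relationship between these values concrete, here is a small sketch that derives the effective batch size, the total number of optimizer steps, and a linear warmup-then-decay learning-rate curve. It assumes training on a single device, 196 optimizer steps per epoch (taken from the step column of the training results), and that the warmup value of 0.1 is a fraction of total steps rather than a step count. The helper `lr_at` is hypothetical and only mirrors the usual shape of a linear schedule with warmup.

```python
learning_rate = 5e-5
train_batch_size = 16
gradient_accumulation_steps = 4
num_epochs = 15
steps_per_epoch = 196   # from the training results: step 196 at epoch 1.0
warmup_fraction = 0.1   # ASSUMPTION: 0.1 is a fraction of total steps

# Examples contributing to each optimizer update (single device assumed).
effective_batch_size = train_batch_size * gradient_accumulation_steps

total_steps = steps_per_epoch * num_epochs
warmup_steps = round(total_steps * warmup_fraction)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return learning_rate * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print(f"effective batch size: {effective_batch_size}")
print(f"total steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>4}: lr = {lr_at(step):.2e}")
```

The computed effective batch size agrees with the reported total_train_batch_size of 64, and the total step count agrees with the final step in the training results.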

Training results

Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Academicdiscipline F1 | Ambiguouslydefinedconcept F1 | Discoursephenomenon F1 | Graphemicphenomenon F1 | Languagerelatedterm F1 | Languageresourceinformation F1 | Lexicalphenomenon F1 | Morphologicalphenomenon F1 | Morphosyntacticphenomenon F1 | New Tag F1 | Otherlinguisticterm F1 | Phonologicalphenomenon F1 | Semanticphenomenon F1 | Syntacticphenomenon F1 | Topnode Dummy F1 | Unclassifiedlinguisticconcept F1
No log | 1.0 | 196 | 88.7338 | 0.0 | 0.0 | 0.0 | 0.9155 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
No log | 2.0 | 392 | 27.9599 | 0.6174 | 0.6089 | 0.6131 | 0.9460 | 0.0 | 0.0698 | 0.0 | 0.0 | 0.6775 | 0.5802 | 0.4863 | 0.7394 | 0.7285 | 0.6741 | 0.5367 | 0.7704 | 0.2400 | 0.5667 | 0.3722 | 0.2692
467.1807 | 3.0 | 588 | 19.9192 | 0.7356 | 0.7048 | 0.7199 | 0.9592 | 1.0 | 0.8156 | 0.0606 | 0.0 | 0.7185 | 0.6855 | 0.6418 | 0.8226 | 0.8036 | 0.7641 | 0.6416 | 0.8152 | 0.5830 | 0.6648 | 0.5576 | 0.8025
467.1807 | 4.0 | 784 | 18.3366 | 0.7491 | 0.7535 | 0.7513 | 0.9624 | 1.0 | 0.8608 | 0.6491 | 0.0 | 0.7304 | 0.7740 | 0.6737 | 0.8094 | 0.8145 | 0.8034 | 0.7044 | 0.8202 | 0.5980 | 0.6999 | 0.6328 | 0.8221
467.1807 | 5.0 | 980 | 18.2944 | 0.7686 | 0.7503 | 0.7593 | 0.9629 | 0.6667 | 0.8875 | 0.7458 | 0.0 | 0.7472 | 0.7321 | 0.7125 | 0.8274 | 0.8453 | 0.8007 | 0.6836 | 0.8327 | 0.6541 | 0.7058 | 0.6227 | 0.8205
53.9369 | 6.0 | 1176 | 18.9329 | 0.7585 | 0.7894 | 0.7736 | 0.9648 | 0.6667 | 0.7907 | 0.6833 | 0.0 | 0.7707 | 0.7665 | 0.7120 | 0.8384 | 0.8476 | 0.8157 | 0.6921 | 0.8371 | 0.6336 | 0.7322 | 0.6695 | 0.8690
53.9369 | 7.0 | 1372 | 19.5750 | 0.7704 | 0.7822 | 0.7763 | 0.9650 | 0.5 | 0.8287 | 0.7 | 0.0 | 0.7796 | 0.7485 | 0.7356 | 0.8130 | 0.8397 | 0.8107 | 0.7262 | 0.8349 | 0.6524 | 0.7426 | 0.6786 | 0.8161
20.5433 | 8.0 | 1568 | 21.3628 | 0.7546 | 0.8024 | 0.7778 | 0.9641 | 0.6667 | 0.8588 | 0.7287 | 0.0 | 0.7911 | 0.7765 | 0.6806 | 0.8156 | 0.8496 | 0.8152 | 0.7453 | 0.8285 | 0.6667 | 0.7269 | 0.6627 | 0.8136
20.5433 | 9.0 | 1764 | 22.3521 | 0.7415 | 0.8176 | 0.7777 | 0.9641 | 0.5 | 0.8304 | 0.6963 | 0.0 | 0.8113 | 0.7553 | 0.6927 | 0.8254 | 0.8470 | 0.7913 | 0.7244 | 0.8449 | 0.6528 | 0.7524 | 0.6684 | 0.8295
20.5433 | 10.0 | 1960 | 22.8445 | 0.7467 | 0.8057 | 0.7751 | 0.9641 | 0.4 | 0.8488 | 0.736 | 0.0 | 0.8021 | 0.7712 | 0.6825 | 0.8156 | 0.8485 | 0.7956 | 0.7211 | 0.8409 | 0.6508 | 0.7339 | 0.6658 | 0.8353
9.7332 | 11.0 | 2156 | 23.1898 | 0.7552 | 0.8109 | 0.7821 | 0.9649 | 0.4 | 0.8652 | 0.7438 | 0.0 | 0.8028 | 0.7825 | 0.6862 | 0.8149 | 0.8556 | 0.8177 | 0.7235 | 0.8374 | 0.6721 | 0.7401 | 0.6690 | 0.8795
9.7332 | 12.0 | 2352 | 24.5698 | 0.7510 | 0.8150 | 0.7817 | 0.9648 | 0.4 | 0.8353 | 0.7317 | 0.0 | 0.8040 | 0.7925 | 0.6918 | 0.8137 | 0.8582 | 0.8115 | 0.7303 | 0.8303 | 0.6510 | 0.7598 | 0.6615 | 0.8727
5.2110 | 13.0 | 2548 | 24.8242 | 0.7747 | 0.7978 | 0.7861 | 0.9660 | 0.6667 | 0.8284 | 0.7059 | 0.0 | 0.8094 | 0.7827 | 0.7086 | 0.8082 | 0.8520 | 0.8198 | 0.7372 | 0.8418 | 0.6630 | 0.7506 | 0.6799 | 0.8623
5.2110 | 14.0 | 2744 | 25.8318 | 0.7623 | 0.8054 | 0.7833 | 0.9655 | 0.6667 | 0.8353 | 0.6777 | 0.0 | 0.8038 | 0.7760 | 0.7059 | 0.7989 | 0.8582 | 0.8172 | 0.7279 | 0.8405 | 0.6368 | 0.7573 | 0.6708 | 0.8623
5.2110 | 15.0 | 2940 | 26.0250 | 0.7676 | 0.8088 | 0.7877 | 0.9662 | 0.6667 | 0.8383 | 0.7154 | 0.0 | 0.8106 | 0.7816 | 0.6877 | 0.8057 | 0.8595 | 0.8218 | 0.7259 | 0.8404 | 0.6452 | 0.7594 | 0.6845 | 0.8623
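
The per-epoch figures above show validation loss and F1 moving in different directions during later epochs. The sketch below copies the epoch, validation loss, and overall F1 columns into a list and reports which epoch each criterion would select. It is only an illustration of reading the table; which criterion, if any, was used to pick the released checkpoint is not stated in the card.

```python
# (epoch, validation loss, overall F1), copied from the training results.
history = [
    (1, 88.7338, 0.0),
    (2, 27.9599, 0.6131),
    (3, 19.9192, 0.7199),
    (4, 18.3366, 0.7513),
    (5, 18.2944, 0.7593),
    (6, 18.9329, 0.7736),
    (7, 19.5750, 0.7763),
    (8, 21.3628, 0.7778),
    (9, 22.3521, 0.7777),
    (10, 22.8445, 0.7751),
    (11, 23.1898, 0.7821),
    (12, 24.5698, 0.7817),
    (13, 24.8242, 0.7861),
    (14, 25.8318, 0.7833),
    (15, 26.0250, 0.7877),
]

# Epoch with the lowest validation loss.
best_by_loss = min(history, key=lambda row: row[1])

# Epoch with the highest overall F1.
best_by_f1 = max(history, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} (loss {best_by_loss[1]})")
print(f"highest F1:             epoch {best_by_f1[0]} (F1 {best_by_f1[2]})")
```

Because the two criteria disagree, the choice of metric for checkpoint selection matters here: the final-epoch results reported at the top of this card correspond to the F1 maximum, not the loss minimum.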

Framework versions

  • Transformers 5.8.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.22.2