2026-05-29-crf-classweights-clean

This model is a fine-tuned version of Davlan/bert-base-multilingual-cased-ner-hrl on the c-ho/2026-05-28_ub_opus_bll_ner_annotation dataset. It achieves the following results on the evaluation set:

Loss: 26.0250
Precision: 0.7676
Recall: 0.8088
F1: 0.7877
Accuracy: 0.9662
Academicdiscipline F1: 0.6667
Ambiguouslydefinedconcept F1: 0.8383
Discoursephenomenon F1: 0.7154
Graphemicphenomenon F1: 0.0
Languagerelatedterm F1: 0.8106
Languageresourceinformation F1: 0.7816
Lexicalphenomenon F1: 0.6877
Morphologicalphenomenon F1: 0.8057
Morphosyntacticphenomenon F1: 0.8595
New Tag F1: 0.8218
Otherlinguisticterm F1: 0.7259
Phonologicalphenomenon F1: 0.8404
Semanticphenomenon F1: 0.6452
Syntacticphenomenon F1: 0.7594
Topnode Dummy F1: 0.6845
Unclassifiedlinguisticconcept F1: 0.8623
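
The overall F1 above is the harmonic mean of the reported precision and recall. A minimal check in Python, using the rounded values from this card (the function name `f1_score` is illustrative and not part of the training code):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded precision and recall reported on the evaluation set.
overall_f1 = f1_score(0.7676, 0.8088)
print(f"{overall_f1:.4f}")
```

The result agrees with the reported F1 of 0.7877 up to rounding of the inputs.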

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

Trained on an annotated dataset of academic publications from the Bibliography of Linguistic Literature (BLL), labelled by a human annotator (c-ho/2026-05-28_ub_opus_bll_ner_annotation).

========== LABEL DISTRIBUTION ==========

Label | Count | Proportion
B-AcademicDiscipline | 40 | 0.000103
B-AmbiguouslyDefinedConcept | 253 | 0.000649
B-DiscoursePhenomenon | 199 | 0.000511
B-GraphemicPhenomenon | 13 | 0.000033
B-LanguageRelatedTerm | 2293 | 0.005886
B-LanguageResourceInformation | 1207 | 0.003098
B-LexicalPhenomenon | 581 | 0.001491
B-MorphologicalPhenomenon | 1393 | 0.003576
B-MorphosyntacticPhenomenon | 5665 | 0.014542
B-NEW_TAG | 3249 | 0.008340
B-OtherLinguisticTerm | 2338 | 0.006002
B-PhonologicalPhenomenon | 3424 | 0.008790
B-SemanticPhenomenon | 900 | 0.002310
B-SyntacticPhenomenon | 2765 | 0.007098
B-TOPNODE_DUMMY | 4769 | 0.012242
B-UnclassifiedLinguisticConcept | 274 | 0.000703
I-AcademicDiscipline | 30 | 0.000077
I-AmbiguouslyDefinedConcept | 2 | 0.000005
I-DiscoursePhenomenon | 29 | 0.000074
I-GraphemicPhenomenon | 5 | 0.000013
I-LanguageRelatedTerm | 136 | 0.000349
I-LanguageResourceInformation | 73 | 0.000187
I-LexicalPhenomenon | 59 | 0.000151
I-MorphologicalPhenomenon | 23 | 0.000059
I-MorphosyntacticPhenomenon | 354 | 0.000909
I-NEW_TAG | 124 | 0.000318
I-OtherLinguisticTerm | 112 | 0.000288
I-PhonologicalPhenomenon | 206 | 0.000529
I-SemanticPhenomenon | 86 | 0.000221
I-SyntacticPhenomenon | 252 | 0.000647
I-TOPNODE_DUMMY | 525 | 0.001348
I-UnclassifiedLinguisticConcept | 23 | 0.000059
O | 358151 | 0.919390

========================================
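
The distribution is heavily skewed toward O (about 92% of all labels), which is presumably why the model name mentions class weights. The weighting scheme actually used is not documented in this card. The sketch below assumes the common "balanced" inverse-frequency formula, weight = total / (num_classes * count), purely for illustration. Only a few rows are copied from the table; `TOTAL` is the sum of all 33 counts, and `balanced_weight` is a hypothetical helper.

```python
# Counts copied from a few rows of the label distribution above.
counts = {
    "O": 358151,
    "B-MorphosyntacticPhenomenon": 5665,
    "B-AcademicDiscipline": 40,
    "B-GraphemicPhenomenon": 13,
    "I-AmbiguouslyDefinedConcept": 2,
}
TOTAL = 389553    # sum of all 33 counts in the table
NUM_CLASSES = 33  # 16 B- tags, 16 I- tags, and O

def balanced_weight(count: int, total: int = TOTAL, n_classes: int = NUM_CLASSES) -> float:
    """Inverse-frequency weight (ASSUMED scheme, not confirmed by this card)."""
    return total / (n_classes * count)

weights = {label: balanced_weight(n) for label, n in counts.items()}

# Print labels from the smallest to the largest weight.
for label, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{label:32s} {weight:10.3f}")
```

Under this assumption O receives a weight well below 1, while very rare tags such as I-AmbiguouslyDefinedConcept receive weights in the thousands, so weights of this kind are often clipped or smoothed in practice.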

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 0.1
num_epochs: 15
mixed_precision_training: Native AMP
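
These settings imply an effective batch size of 64 per optimizer step (16 * 4 accumulation steps, assuming a single device) and, with 196 steps per epoch over 15 epochs, 2940 optimizer steps in total. The sketch below reproduces that arithmetic together with a linear schedule with warmup. It assumes that the value 0.1 is a fraction of the total steps rather than a step count, which this card does not confirm. `lr_at` is a hypothetical helper, not a library function.

```python
BASE_LR = 5e-05
PER_DEVICE_BATCH = 16
GRAD_ACCUM = 4
STEPS_PER_EPOCH = 196  # from the training results table
NUM_EPOCHS = 15
WARMUP_FRACTION = 0.1  # ASSUMPTION: 0.1 is read as a ratio of total steps

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM  # single device assumed
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(WARMUP_FRACTION * total_steps)

def lr_at(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(effective_batch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```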

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy	Academicdiscipline F1	Ambiguouslydefinedconcept F1	Discoursephenomenon F1	Languagerelatedterm F1	Languageresourceinformation F1	Lexicalphenomenon F1	Morphologicalphenomenon F1	Morphosyntacticphenomenon F1	New Tag F1	Otherlinguisticterm F1	Phonologicalphenomenon F1	Semanticphenomenon F1	Syntacticphenomenon F1	Topnode Dummy F1	Unclassifiedlinguisticconcept F1
No log	1.0	196	88.7338	0.0	0.0	0.0	0.9155	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
No log	2.0	392	27.9599	0.6174	0.6089	0.6131	0.9460	0.0	0.0698	0.0	0.6775	0.5802	0.4863	0.7394	0.7285	0.6741	0.5367	0.7704	0.2400	0.5667	0.3722	0.2692
467.1807	3.0	588	19.9192	0.7356	0.7048	0.7199	0.9592	1.0	0.8156	0.0606	0.7185	0.6855	0.6418	0.8226	0.8036	0.7641	0.6416	0.8152	0.5830	0.6648	0.5576	0.8025
467.1807	4.0	784	18.3366	0.7491	0.7535	0.7513	0.9624	1.0	0.8608	0.6491	0.7304	0.7740	0.6737	0.8094	0.8145	0.8034	0.7044	0.8202	0.5980	0.6999	0.6328	0.8221
467.1807	5.0	980	18.2944	0.7686	0.7503	0.7593	0.9629	0.6667	0.8875	0.7458	0.7472	0.7321	0.7125	0.8274	0.8453	0.8007	0.6836	0.8327	0.6541	0.7058	0.6227	0.8205
53.9369	6.0	1176	18.9329	0.7585	0.7894	0.7736	0.9648	0.6667	0.7907	0.6833	0.7707	0.7665	0.7120	0.8384	0.8476	0.8157	0.6921	0.8371	0.6336	0.7322	0.6695	0.8690
53.9369	7.0	1372	19.5750	0.7704	0.7822	0.7763	0.9650	0.5	0.8287	0.7	0.7796	0.7485	0.7356	0.8130	0.8397	0.8107	0.7262	0.8349	0.6524	0.7426	0.6786	0.8161
20.5433	8.0	1568	21.3628	0.7546	0.8024	0.7778	0.9641	0.6667	0.8588	0.7287	0.7911	0.7765	0.6806	0.8156	0.8496	0.8152	0.7453	0.8285	0.6667	0.7269	0.6627	0.8136
20.5433	9.0	1764	22.3521	0.7415	0.8176	0.7777	0.9641	0.5	0.8304	0.6963	0.8113	0.7553	0.6927	0.8254	0.8470	0.7913	0.7244	0.8449	0.6528	0.7524	0.6684	0.8295
20.5433	10.0	1960	22.8445	0.7467	0.8057	0.7751	0.9641	0.4	0.8488	0.736	0.8021	0.7712	0.6825	0.8156	0.8485	0.7956	0.7211	0.8409	0.6508	0.7339	0.6658	0.8353
9.7332	11.0	2156	23.1898	0.7552	0.8109	0.7821	0.9649	0.4	0.8652	0.7438	0.8028	0.7825	0.6862	0.8149	0.8556	0.8177	0.7235	0.8374	0.6721	0.7401	0.6690	0.8795
9.7332	12.0	2352	24.5698	0.7510	0.8150	0.7817	0.9648	0.4	0.8353	0.7317	0.8040	0.7925	0.6918	0.8137	0.8582	0.8115	0.7303	0.8303	0.6510	0.7598	0.6615	0.8727
5.2110	13.0	2548	24.8242	0.7747	0.7978	0.7861	0.9660	0.6667	0.8284	0.7059	0.8094	0.7827	0.7086	0.8082	0.8520	0.8198	0.7372	0.8418	0.6630	0.7506	0.6799	0.8623
5.2110	14.0	2744	25.8318	0.7623	0.8054	0.7833	0.9655	0.6667	0.8353	0.6777	0.8038	0.7760	0.7059	0.7989	0.8582	0.8172	0.7279	0.8405	0.6368	0.7573	0.6708	0.8623
5.2110	15.0	2940	26.0250	0.7676	0.8088	0.7877	0.9662	0.6667	0.8383	0.7154	0.8106	0.7816	0.6877	0.8057	0.8595	0.8218	0.7259	0.8404	0.6452	0.7594	0.6845	0.8623
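
Validation loss and F1 peak at different epochs in the table above. A small sketch that picks the best epoch under each criterion, using the Epoch, Validation Loss, and F1 columns copied from the table (variable names are illustrative):

```python
# (epoch, validation_loss, overall_f1) copied from the table above.
history = [
    (1, 88.7338, 0.0), (2, 27.9599, 0.6131), (3, 19.9192, 0.7199),
    (4, 18.3366, 0.7513), (5, 18.2944, 0.7593), (6, 18.9329, 0.7736),
    (7, 19.5750, 0.7763), (8, 21.3628, 0.7778), (9, 22.3521, 0.7777),
    (10, 22.8445, 0.7751), (11, 23.1898, 0.7821), (12, 24.5698, 0.7817),
    (13, 24.8242, 0.7861), (14, 25.8318, 0.7833), (15, 26.0250, 0.7877),
]

best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest F1:", best_by_f1)
```

Validation loss is lowest at epoch 5 (18.2944) and rises afterwards, while F1 keeps improving slowly and is highest at epoch 15 (0.7877), the checkpoint whose metrics are reported at the top of this card.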

Framework versions

Transformers 5.8.0
Pytorch 2.5.1+cu124
Datasets 3.2.0
Tokenizers 0.22.2

Safetensors

Model size: 0.2B params
Tensor type: F32

Model tree

Base model: Davlan/bert-base-multilingual-cased-ner-hrl
This model: c-ho/2026-05-29-crf-classweights-clean