Fremtind/all-nli-norwegian
How to use thivy/norbert4-base-nli-norwegian with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("thivy/norbert4-base-nli-norwegian", trust_remote_code=True)
sentences = [
    "Inne i igloen gjør den unge mannen seg klar for sitt overnattingsopphold.",
    "Folk danser i gaten.",
    "Den unge mannen gjør seg klar for sitt overnattingsopphold.",
    "Den unge mannen gjør seg klar til å dra.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from ltg/norbert4-base on the all-nli-norwegian dataset. It maps sentences and paragraphs to a 640-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'GptBertModel'})
  (1): Pooling({'word_embedding_dimension': 640, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
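The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the 640-dimensional token vectors, with padding positions left out. A minimal plain-Python sketch of that averaging follows. The function name `mean_pool`, the toy 2-dimensional vectors, and the attention mask are illustrative assumptions, not the library's implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in summed]


# Toy example (assumed values): three real tokens and one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [3.0, 4.0]
```

The padding vector `[9.0, 9.0]` does not affect the result because its mask entry is 0.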
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("thivy/norbert4-base-nli-norwegian", trust_remote_code=True)
# Run inference
sentences = [
    'En mann lager et sandmaleri på gulvet.',
    'En mann lager kunst.',
    'En kvinne ødelegger et sandmaleri.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 640]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6251, 0.2931],
# [0.6251, 1.0000, 0.1305],
# [0.2931, 0.1305, 1.0000]])
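The matrix above is symmetric with ones on the diagonal, which is consistent with cosine similarity computed between a set of embeddings and itself. As a rough sketch of how such scores can drive semantic search, the helper below ranks candidate vectors by cosine similarity to a query vector. The names `cosine` and `rank_by_similarity` and the toy 3-dimensional vectors are hypothetical; in practice the vectors would come from `model.encode`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for sentence embeddings (assumed values).
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_by_similarity(query, candidates)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Candidate 1 ranks first even though it is longer than the query, because cosine similarity depends only on direction.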
Dataset: eval. Evaluated with TripletEvaluator.

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9547 |
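The cosine_accuracy metric can be read as the share of (anchor, positive, negative) triplets in which the anchor is closer, by cosine similarity, to the positive than to the negative. The sketch below computes that share on toy vectors. `triplet_cosine_accuracy` is a hypothetical helper written for illustration and is not the TripletEvaluator code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)


# Toy (anchor, positive, negative) vectors; all values are assumed.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer
    ([1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]),  # positive is closer
]
accuracy = triplet_cosine_accuracy(triplets)
print(accuracy)  # 0.75
```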
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

Samples:

| anchor | positive | negative |
|---|---|---|
| En person på en hest hopper over et havarert fly. | En person er utendørs, på en hest. | En person er på en diner og bestiller en omelett. |
| Barn smiler og vinker til kameraet | Det er barn til stede | Barna rynker pannen |
| En gutt hopper på skateboard midt på en rød bro. | Gutten gjør et skateboardtriks. | Gutten skater nedover fortauet. |
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
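With a scale of 20.0 and cos_sim as the similarity function, MultipleNegativesRankingLoss can be understood as cross-entropy over scaled cosine similarities: each anchor should score its own positive higher than every other candidate in the batch. The sketch below is a simplified plain-Python illustration that uses only the other positives in the batch as negatives and ignores the explicit negative column. The function names and toy vectors are assumptions, and this is not the library's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where the correct 'class' for anchor i is positive i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumed values): positives aligned with anchors, then swapped.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]
swapped = [aligned[1], aligned[0]]
print(mnr_loss(anchors, aligned))  # close to zero: each anchor matches its positive
print(mnr_loss(anchors, swapped))  # large: each anchor prefers the wrong candidate
```

The scale acts like an inverse temperature: larger values sharpen the softmax, so small differences in cosine similarity produce larger differences in loss.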
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

Samples:

| anchor | positive | negative |
|---|---|---|
| To kvinner klemmer mens de holder take-away pakker. | To kvinner holder pakker. | Mennene slåss utenfor en deli. |
| To små barn i blå drakter, en med nummer 9 og en med nummer 2, står på trinn i et bad og vasker hendene i en vask. | To barn i nummererte drakter vasker hendene. | To barn i jakker går til skolen. |
| En mann selger donuts til en kunde under et verdensutstillingsarrangement holdt i byen Angeles | En mann selger donuts til en kunde. | En kvinne drikker kaffen sin på en liten kafé. |
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- weight_decay: 0.01
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- load_best_model_at_end: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | Validation Loss | eval_cosine_accuracy |
|---|---|---|---|---|
| 0.0058 | 100 | 4.0493 | - | - |
| 0.0115 | 200 | 3.0097 | - | - |
| 0.0173 | 300 | 1.4324 | - | - |
| 0.0230 | 400 | 1.0791 | - | - |
| 0.0288 | 500 | 0.8985 | 0.7151 | 0.8682 |
| 0.0345 | 600 | 0.7899 | - | - |
| 0.0403 | 700 | 0.7379 | - | - |
| 0.0460 | 800 | 0.7333 | - | - |
| 0.0518 | 900 | 0.6676 | - | - |
| 0.0575 | 1000 | 0.6593 | 0.4987 | 0.9137 |
| 0.0633 | 1100 | 0.6162 | - | - |
| 0.0690 | 1200 | 0.6153 | - | - |
| 0.0748 | 1300 | 0.5763 | - | - |
| 0.0805 | 1400 | 0.6055 | - | - |
| 0.0863 | 1500 | 0.5504 | 0.4496 | 0.9207 |
| 0.0920 | 1600 | 0.5622 | - | - |
| 0.0978 | 1700 | 0.5484 | - | - |
| 0.1035 | 1800 | 0.5263 | - | - |
| 0.1093 | 1900 | 0.5789 | - | - |
| 0.1150 | 2000 | 0.5462 | 0.4225 | 0.9273 |
| 0.1208 | 2100 | 0.5521 | - | - |
| 0.1265 | 2200 | 0.5368 | - | - |
| 0.1323 | 2300 | 0.5079 | - | - |
| 0.1380 | 2400 | 0.5437 | - | - |
| 0.1438 | 2500 | 0.5123 | 0.4020 | 0.9346 |
| 0.1495 | 2600 | 0.4835 | - | - |
| 0.1553 | 2700 | 0.473 | - | - |
| 0.1610 | 2800 | 0.4957 | - | - |
| 0.1668 | 2900 | 0.4935 | - | - |
| 0.1725 | 3000 | 0.4894 | 0.3775 | 0.9383 |
| 0.1783 | 3100 | 0.4894 | - | - |
| 0.1840 | 3200 | 0.5203 | - | - |
| 0.1898 | 3300 | 0.4907 | - | - |
| 0.1955 | 3400 | 0.464 | - | - |
| 0.2013 | 3500 | 0.461 | 0.3808 | 0.9387 |
| 0.2071 | 3600 | 0.4486 | - | - |
| 0.2128 | 3700 | 0.4753 | - | - |
| 0.2186 | 3800 | 0.4591 | - | - |
| 0.2243 | 3900 | 0.4496 | - | - |
| 0.2301 | 4000 | 0.428 | 0.3680 | 0.9383 |
| 0.2358 | 4100 | 0.433 | - | - |
| 0.2416 | 4200 | 0.4525 | - | - |
| 0.2473 | 4300 | 0.4119 | - | - |
| 0.2531 | 4400 | 0.4335 | - | - |
| 0.2588 | 4500 | 0.4378 | 0.3586 | 0.9407 |
| 0.2646 | 4600 | 0.4073 | - | - |
| 0.2703 | 4700 | 0.3997 | - | - |
| 0.2761 | 4800 | 0.381 | - | - |
| 0.2818 | 4900 | 0.4064 | - | - |
| 0.2876 | 5000 | 0.4211 | 0.3577 | 0.9438 |
| 0.2933 | 5100 | 0.4338 | - | - |
| 0.2991 | 5200 | 0.3951 | - | - |
| 0.3048 | 5300 | 0.3813 | - | - |
| 0.3106 | 5400 | 0.4165 | - | - |
| 0.3163 | 5500 | 0.405 | 0.3464 | 0.9428 |
| 0.3221 | 5600 | 0.395 | - | - |
| 0.3278 | 5700 | 0.3869 | - | - |
| 0.3336 | 5800 | 0.3758 | - | - |
| 0.3393 | 5900 | 0.4021 | - | - |
| 0.3451 | 6000 | 0.374 | 0.3511 | 0.9460 |
| 0.3508 | 6100 | 0.3696 | - | - |
| 0.3566 | 6200 | 0.377 | - | - |
| 0.3623 | 6300 | 0.37 | - | - |
| 0.3681 | 6400 | 0.3584 | - | - |
| 0.3738 | 6500 | 0.3485 | 0.3399 | 0.9470 |
| 0.3796 | 6600 | 0.3841 | - | - |
| 0.3853 | 6700 | 0.3674 | - | - |
| 0.3911 | 6800 | 0.3843 | - | - |
| 0.3968 | 6900 | 0.3753 | - | - |
| 0.4026 | 7000 | 0.3533 | 0.3435 | 0.9448 |
| 0.4084 | 7100 | 0.3577 | - | - |
| 0.4141 | 7200 | 0.3442 | - | - |
| 0.4199 | 7300 | 0.3539 | - | - |
| 0.4256 | 7400 | 0.3723 | - | - |
| 0.4314 | 7500 | 0.3666 | 0.3383 | 0.9456 |
| 0.4371 | 7600 | 0.3644 | - | - |
| 0.4429 | 7700 | 0.3644 | - | - |
| 0.4486 | 7800 | 0.3474 | - | - |
| 0.4544 | 7900 | 0.3538 | - | - |
| 0.4601 | 8000 | 0.3733 | 0.3316 | 0.9508 |
| 0.4659 | 8100 | 0.3587 | - | - |
| 0.4716 | 8200 | 0.347 | - | - |
| 0.4774 | 8300 | 0.3809 | - | - |
| 0.4831 | 8400 | 0.3222 | - | - |
| 0.4889 | 8500 | 0.3408 | 0.3281 | 0.9492 |
| 0.4946 | 8600 | 0.3345 | - | - |
| 0.5004 | 8700 | 0.3492 | - | - |
| 0.5061 | 8800 | 0.3311 | - | - |
| 0.5119 | 8900 | 0.3576 | - | - |
| 0.5176 | 9000 | 0.3377 | 0.3215 | 0.9488 |
| 0.5234 | 9100 | 0.3405 | - | - |
| 0.5291 | 9200 | 0.3243 | - | - |
| 0.5349 | 9300 | 0.351 | - | - |
| 0.5406 | 9400 | 0.3547 | - | - |
| 0.5464 | 9500 | 0.3438 | 0.3241 | 0.9500 |
| 0.5521 | 9600 | 0.3384 | - | - |
| 0.5579 | 9700 | 0.3306 | - | - |
| 0.5636 | 9800 | 0.353 | - | - |
| 0.5694 | 9900 | 0.299 | - | - |
| 0.5751 | 10000 | 0.3064 | 0.3173 | 0.9509 |
| 0.5809 | 10100 | 0.3292 | - | - |
| 0.5866 | 10200 | 0.292 | - | - |
| 0.5924 | 10300 | 0.3599 | - | - |
| 0.5981 | 10400 | 0.3271 | - | - |
| 0.6039 | 10500 | 0.3002 | 0.3225 | 0.9492 |
| 0.6097 | 10600 | 0.3455 | - | - |
| 0.6154 | 10700 | 0.2981 | - | - |
| 0.6212 | 10800 | 0.3255 | - | - |
| 0.6269 | 10900 | 0.3 | - | - |
| 0.6327 | 11000 | 0.304 | 0.3170 | 0.9512 |
| 0.6384 | 11100 | 0.3136 | - | - |
| 0.6442 | 11200 | 0.3348 | - | - |
| 0.6499 | 11300 | 0.3255 | - | - |
| 0.6557 | 11400 | 0.3101 | - | - |
| 0.6614 | 11500 | 0.314 | 0.3149 | 0.9500 |
| 0.6672 | 11600 | 0.3157 | - | - |
| 0.6729 | 11700 | 0.3149 | - | - |
| 0.6787 | 11800 | 0.2966 | - | - |
| 0.6844 | 11900 | 0.3145 | - | - |
| 0.6902 | 12000 | 0.2928 | 0.3075 | 0.9532 |
| 0.6959 | 12100 | 0.3035 | - | - |
| 0.7017 | 12200 | 0.3142 | - | - |
| 0.7074 | 12300 | 0.3289 | - | - |
| 0.7132 | 12400 | 0.3046 | - | - |
| 0.7189 | 12500 | 0.311 | 0.3103 | 0.9529 |
| 0.7247 | 12600 | 0.2942 | - | - |
| 0.7304 | 12700 | 0.295 | - | - |
| 0.7362 | 12800 | 0.2802 | - | - |
| 0.7419 | 12900 | 0.3258 | - | - |
| 0.7477 | 13000 | 0.28 | 0.3027 | 0.9518 |
| 0.7534 | 13100 | 0.2887 | - | - |
| 0.7592 | 13200 | 0.2729 | - | - |
| 0.7649 | 13300 | 0.2936 | - | - |
| 0.7707 | 13400 | 0.2883 | - | - |
| 0.7764 | 13500 | 0.2972 | 0.3048 | 0.9549 |
| 0.7822 | 13600 | 0.2806 | - | - |
| 0.7879 | 13700 | 0.2851 | - | - |
| 0.7937 | 13800 | 0.3097 | - | - |
| 0.7994 | 13900 | 0.2663 | - | - |
| 0.8052 | 14000 | 0.2743 | 0.3004 | 0.9529 |
| 0.8110 | 14100 | 0.2911 | - | - |
| 0.8167 | 14200 | 0.2955 | - | - |
| 0.8225 | 14300 | 0.2892 | - | - |
| 0.8282 | 14400 | 0.2796 | - | - |
| 0.8340 | 14500 | 0.2674 | 0.3000 | 0.9528 |
| 0.8397 | 14600 | 0.2604 | - | - |
| 0.8455 | 14700 | 0.2816 | - | - |
| 0.8512 | 14800 | 0.2711 | - | - |
| 0.8570 | 14900 | 0.2897 | - | - |
| 0.8627 | 15000 | 0.2495 | 0.3008 | 0.9544 |
| 0.8685 | 15100 | 0.3126 | - | - |
| 0.8742 | 15200 | 0.3151 | - | - |
| 0.8800 | 15300 | 0.2664 | - | - |
| 0.8857 | 15400 | 0.2884 | - | - |
| 0.8915 | 15500 | 0.263 | 0.2984 | 0.9552 |
| 0.8972 | 15600 | 0.2733 | - | - |
| 0.9030 | 15700 | 0.2755 | - | - |
| 0.9087 | 15800 | 0.2818 | - | - |
| 0.9145 | 15900 | 0.2853 | - | - |
| 0.9202 | 16000 | 0.2742 | 0.2980 | 0.9544 |
| 0.9260 | 16100 | 0.269 | - | - |
| 0.9317 | 16200 | 0.257 | - | - |
| 0.9375 | 16300 | 0.2637 | - | - |
| 0.9432 | 16400 | 0.2752 | - | - |
| 0.9490 | 16500 | 0.2719 | 0.2971 | 0.9546 |
| 0.9547 | 16600 | 0.282 | - | - |
| 0.9605 | 16700 | 0.2461 | - | - |
| 0.9662 | 16800 | 0.2673 | - | - |
| 0.9720 | 16900 | 0.2646 | - | - |
| 0.9777 | 17000 | 0.2665 | 0.2960 | 0.9547 |
| 0.9835 | 17100 | 0.258 | - | - |
| 0.9892 | 17200 | 0.2562 | - | - |
| 0.9950 | 17300 | 0.2511 | - | - |
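
The hyperparameters listed earlier combine learning_rate: 2e-05, warmup_ratio: 0.1, and lr_scheduler_type: linear. Under the usual reading of those settings, the learning rate rises linearly from zero over the first 10% of steps and then decays linearly back to zero. The sketch below illustrates that shape. The function name `linear_schedule_lr` and the total of 17,000 steps are assumptions chosen for illustration, since the card does not state the exact total step count.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 17000  # assumed round number, not taken from the card
for step in (0, 850, 1700, 9350, 17000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Read against the log above, this would place the peak learning rate around epoch 0.1, after which the step size shrinks steadily for the rest of the epoch.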
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
ltg/norbert4-base