metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:268
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: Birdwatching for Beginners with Barbara Hannah Grufferman
sentences:
- >-
Bird that breeds in the Arctic and sub-Arctic and migrates to the
Antarctic
- >-
[birds chirping] [Discover Bird-Watching
with Barbara Hannah Grufferman] [♪ music and birds chirping ♪] We are in
the park and I'm meeting up
with Birder Bob, who's an expert on birding. >> It's so nice to meet
you.
>> Yes. So, listen. I came prepared.
I have my backpack. >> I even have a little notepad in there.
>> All right. >> But what am I missing?
>> Ah! Binoculars. >> May I place these over your head?
>> Please do! Thank you. [laughing]
Let's go! Vamos! So birding is becoming
the fastest-growing outdoor activity— ["Birding Bob" DeCandido,
Ornithologist]
>> Yes.
>> —in the country. >> Why do you think that is? Why?
>> Well, you can watch birds from inside looking outside
at a bird feeder in your backyard, but you can also go to a local park.
[♪ music ♪]
Here we are in this giant woodland in the middle of the city. >> And
it's beautiful.
>> Yeah. And we're getting the clean air. We're looking up.
We can see birds up there. >> We can hear the cardinals singing.
>> Here's one. They're migrating north along a flyway here. >> What's a
flyway?
>> Oh my goodness! A flyway is like an aerial path for birds, and
oftentimes it's tied to a coastline
or a mountain chain. So there are some very common birds around
that are easy to recognize. Here he is.
Here's your red-bellied woodpecker right here. I'm going to use my
binoculars.
[laughing] Ah! It seems to me that with birding you could just—depending
upon weather— put on a sweater, a jacket, whatever and get out there and
walk and look
and you'll be birding. >> Yes.
>> Is it more complicated than that? Do I need more equipment? If you
want to take it to the next level,
a pair of inexpensive binoculars and a book so you have a reference to
go with. It's like a guidebook to birds. Yes, because this is your
classroom, you know? >> Right.
>> And if you can teach yourself, all the best way in the world to
learn. [♪ music ♪] I'm going to do some special sounds. This is called
pishing, which is
[demonstrating pishing] There comes somebody on the left. Now, it seems
counterintuitive
that you make sounds and birds come to the sound. >> Yes.
>> But birds come in because they operate as a team. [demonstrating
pishing]
What a wonderful thing! Yes, yeah.
[pishing] >> Look, here comes something.
>> You never know what you're going to find
as you turn a corner. And all you need
is your eyes and ears and curiosity. Should I go closer? [birds
chirping] Oh!
[bird chirping] Hello, little cutie! [birds chirping] They like my chia
energy bars.
[laughing] [♪ music and birds chirping ♪] [♪ music and birds chirping ♪]
This is such a great way to get outside,
move your body, and be with nature. Bird-watching is a great way
to see the local area and then take it national. I loved my birding
experience today. It’s—a new world
has been opened up for me really. So I think as of today
I can call myself an official birder. [AARP, Real Possibilities]
- >-
Teal is a dark cyan color. Its name comes from that of a bird, the
Eurasian teal which has a similarly colored stripe on its head. The word
is often used colloquially to refer to shades of cyan in general.
- source_sentence: Corn bunting
sentences:
- >-
The corn bunting is a passerine bird in the bunting family Emberizidae,
a group now separated by most modern authors from the finches,
Fringillidae. This is a large bunting with heavily streaked buff-brown
plumage. The sexes are similar but the male is slightly larger than the
female. Its range extends from Western Europe and North Africa across to
northwestern China.
- >-
The alpine swift is a species of swift found in Africa, southern Europe,
and Asia. They breed in mountains from southern Europe to the Himalayas.
Like common swifts, they are migratory; the southern European population
winters further south in southern Africa. They have very short legs
which are used for clinging to vertical surfaces. Like most swifts, they
never settle voluntarily on the ground, spending most of their lives in
the air living on the insects they catch in their beaks.
- >-
The little tern is a seabird of the family Laridae. It was first
described by the German naturalist Peter Simon Pallas in 1764 and given
the binomial name Sterna albifrons. It was moved to the genus Sternula
when the genus Sterna was restricted to the larger typical terns. The
genus name Sternula is a diminutive of Sterna, 'tern', while the
specific name albifrons is from Latin albus, 'white', and frons,
'forehead'.
- source_sentence: Lesser spotted woodpecker
sentences:
- >-
The Mediterranean gull is a small gull. The scientific name is from
Ancient Greek. The genus Ichthyaetus is from ikhthus, "fish", and aetos,
"eagle", and the specific melanocephalus is from melas, "black", and
-kephalos "-headed".
- >-
The spotted flycatcher is a small passerine bird in the Old World
flycatcher family. It breeds in most of Europe and in the Palearctic to
Siberia, and is migratory, wintering in Africa and south western Asia.
It is declining in parts of its range.
- >-
The lesser spotted woodpecker is a member of the woodpecker family
Picidae. It was formerly assigned to the genus Dendrocopos. Some
taxonomic authorities continue to list the species there.
- source_sentence: Barnacle goose
sentences:
- >-
The short-toed treecreeper is a small passerine bird found in woodlands
through much of the warmer regions of Europe and into north Africa. It
has a generally more southerly distribution than the other European
treecreeper species, the common treecreeper, with which it is easily
confused where they both occur. The short-toed treecreeper tends to
prefer deciduous trees and lower altitudes than its relative in these
overlap areas. Although mainly sedentary, vagrants have occurred outside
the breeding range.
- >-
The barnacle goose is a species of goose that belongs to the genus
Branta of black geese, which contains species with extensive black in
the plumage, distinguishing them from the grey Anser species. Despite
its superficial similarity to the brant goose, genetic analysis has
shown its closest relative is the cackling goose.
- >-
The grey plover or black-bellied plover is a large plover breeding in
Arctic regions. It is a long-distance migrant, with a nearly worldwide
coastal distribution when not breeding.
- source_sentence: White stork
sentences:
- >-
The long-tailed duck is a medium-sized sea duck that breeds in the
tundra and taiga regions of the arctic and winters along the northern
coastlines of the Atlantic and Pacific Oceans. It is the only member of
the genus Clangula.
- "The white stork is a large bird in the stork family, Ciconiidae. Its plumage is mainly white, with black on the bird's wings. Adults have long red legs and long pointed red beaks, and measure on average 100–115\_cm (39–45\_in) from beak tip to end of tail, with a 155–215\_cm (61–85\_in) wingspan. The two subspecies, which differ slightly in size, breed in Europe north to Finland, northwestern Africa, Palearctic east to southern Kazakhstan and southern Africa. The white stork is a long-distance migrant, wintering in Africa from tropical Sub-Saharan Africa to as far south as South Africa, or on the Indian subcontinent. When migrating between Europe and Africa, it avoids crossing the Mediterranean Sea and detours via the Levant in the east or the Strait of Gibraltar in the west, because the air thermals on which it depends for soaring do not form over water."
- >-
The shovelers are four species of dabbling ducks in the genus Spatula
with long, broad spatula-shaped beaks:Red shoveler Spatula platalea
Cape shoveler Spatula smithii
Australasian shoveler Spatula rhynchotis
Northern shoveler Spatula clypeata
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
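The Pooling and Normalize stages listed in the architecture can be illustrated with a minimal sketch. It is written in plain Python on toy two-dimensional vectors rather than with the library's own modules, and the function name `mean_pool_and_normalize` is hypothetical. It assumes at least one non-padding token and a non-zero pooled vector.

```python
import math

def mean_pool_and_normalize(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens, then scale to unit length."""
    # Keep only positions where the attention mask is 1 (real tokens).
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    # Mean pooling: element-wise average over the kept token vectors.
    pooled = [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
    # L2 normalization, so a dot product between outputs equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled]

# Toy input: two real tokens and one padding token (masked out).
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool_and_normalize(tokens, mask)
print(sentence_vector)
```

Because the output has unit length, cosine similarity between two such vectors reduces to a plain dot product.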
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Nicolas-Spettel/bird-qa-model")

# Run inference
sentences = [
    'White stork',
    "The white stork is a large bird in the stork family, Ciconiidae. Its plumage is mainly white, with black on the bird's wings. Adults have long red legs and long pointed red beaks, and measure on average 100–115\xa0cm (39–45\xa0in) from beak tip to end of tail, with a 155–215\xa0cm (61–85\xa0in) wingspan. The two subspecies, which differ slightly in size, breed in Europe north to Finland, northwestern Africa, Palearctic east to southern Kazakhstan and southern Africa. The white stork is a long-distance migrant, wintering in Africa from tropical Sub-Saharan Africa to as far south as South Africa, or on the Indian subcontinent. When migrating between Europe and Africa, it avoids crossing the Mediterranean Sea and detours via the Levant in the east or the Strait of Gibraltar in the west, because the air thermals on which it depends for soaring do not form over water.",
    'The long-tailed duck is a medium-sized sea duck that breeds in the tundra and taiga regions of the arctic and winters along the northern coastlines of the Atlantic and Pacific Oceans. It is the only member of the genus Clangula.',
]
embeddings = model.encode(sentences)  # returns a NumPy array by default
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7092, 0.0837],
#         [0.7092, 1.0000, 0.1957],
#         [0.0837, 0.1957, 1.0000]])
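To turn a similarity matrix into a retrieval result, one option is to rank the candidates for a given query row. The sketch below hard-codes the scores printed in the example output, so it runs without downloading the model. The helper `rank_candidates` is a hypothetical name, not part of the sentence-transformers API.

```python
# Similarity scores copied from the example output in this section.
similarities = [
    [1.0000, 0.7092, 0.0837],
    [0.7092, 1.0000, 0.1957],
    [0.0837, 0.1957, 1.0000],
]

def rank_candidates(similarity_row, query_index):
    """Return candidate indices by descending similarity, skipping the query itself."""
    candidates = [i for i in range(len(similarity_row)) if i != query_index]
    return sorted(candidates, key=lambda i: similarity_row[i], reverse=True)

# Index 0 is the query 'White stork'; index 1 is its description, index 2 the long-tailed duck.
ranking = rank_candidates(similarities[0], query_index=0)
print(ranking)  # [1, 2]
```

Here the white stork description ranks ahead of the long-tailed duck description for the query 'White stork'.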
Training Details
Training Dataset
Unnamed Dataset
- Size: 268 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 268 samples:

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 3 tokens, mean: 5.63 tokens, max: 16 tokens | min: 15 tokens, mean: 94.23 tokens, max: 256 tokens |

- Samples:

| sentence_0 | sentence_1 |
|---|---|
| Corn bunting | The corn bunting is a passerine bird in the bunting family Emberizidae, a group now separated by most modern authors from the finches, Fringillidae. This is a large bunting with heavily streaked buff-brown plumage. The sexes are similar but the male is slightly larger than the female. Its range extends from Western Europe and North Africa across to northwestern China. |
| Water pipit | The water pipit is a small passerine bird which breeds in the mountains of Southern Europe and the Palearctic eastwards to China. It is a short-distance migrant; many birds move to lower altitudes or wet open lowlands in winter. |
| Marsh tit | The marsh tit is a Eurasian passerine bird in the tit family Paridae and genus Poecile, closely related to the willow tit, Père David's and Songar tits. It is a small bird, around 12 cm (4.7 in) long and weighing 12 g (0.42 oz), with a black crown and nape, pale cheeks, brown back and greyish-brown wings and tail. Between 8 and 11 subspecies are recognised. Its close resemblance to the willow tit can cause identification problems, especially in the United Kingdom where the local subspecies of the two are very similar: they were not recognised as separate species until 1897. |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
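The idea behind this loss can be sketched in a few lines of plain Python. For each anchor (sentence_0), the paired text (sentence_1) is the positive and every other sentence_1 in the batch acts as a negative. Cosine similarities are multiplied by the scale (20.0 here) and passed through a cross-entropy. This is an illustrative re-implementation on toy vectors, not the library's code, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch serves as an in-batch negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
        # Negative log-probability assigned to the true pair.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other positive

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned < loss_swapped)  # True
```

The loss is near zero when each anchor is closest to its own positive and large when it is closer to another item in the batch, which is what pushes bird names towards their own descriptions during training.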
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 2
- multi_dataset_batch_sampler: round_robin
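As a rough worked calculation, these settings imply a very short training run. The sketch below assumes a single device (a labeled assumption, suggested but not confirmed by the CPU build of PyTorch listed under Framework Versions). It takes gradient_accumulation_steps: 1 and dataloader_drop_last: False from the full hyperparameter list.

```python
import math

dataset_size = 268                  # from "Size: 268 training samples"
per_device_train_batch_size = 32
num_train_epochs = 2
gradient_accumulation_steps = 1     # from the full hyperparameter list
num_devices = 1                     # assumption: single-device training

# dataloader_drop_last is False, so the final partial batch is kept.
batches_per_epoch = math.ceil(dataset_size / (per_device_train_batch_size * num_devices))
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs
last_batch_size = dataset_size - (batches_per_epoch - 1) * per_device_train_batch_size * num_devices

print(batches_per_epoch, total_steps, last_batch_size)  # 9 18 12
```

Under these assumptions the model sees about 18 optimizer steps in total. With in-batch negatives, a full batch of 32 gives each anchor 31 negatives, while the final batch of 12 gives only 11.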
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
Framework Versions
- Python: 3.13.7
- Sentence Transformers: 5.1.0
- Transformers: 4.56.1
- PyTorch: 2.8.0+cpu
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}