How to use AlekseyCalvin/LyricalEmbeddingGemma_v1 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("AlekseyCalvin/LyricalEmbeddingGemma_v1")
sentences = [
"(Название:) ДО НОВОЙ ЗАРИ \n(Поэт:) Владимир Силлов \n\nДни \nПо жуткой нехоженой лестнице \nЗашагают быстрей. \nДеревья скоро разлистятся, \nНо станет ясней, \n\nВеснам, \nОбглоданным поэтами, \nПришел конец. \nИ солнцу с черными отметами \nКонец — венец. \n\nИ мы солнце и весны \nПотащим на рынок. \nПотащим чрез гвалты и давку \nИ бросим за тусклый полтинник \nК антиквару в лавку. \n\nВ душах оплеванных, \nНаглых и сильных, \nЕсть алтари. \nИ на них мы затеплим лампады к вечерне \nДо новой зари.",
"(Title:) UNTIL IT DAWNS ANEW \n(Poet:) Vladimir SIllov \n\nDays \nWould walk an untrod morbid staircase \nAt accelerant pace. \nSoon trees splinter in leaflessness, \nAll the clearer it makes, \n\nSpringtimes \nThe poets still nibble on \nGet abruptly pulled down. \nWith the sun, a blotched face nothing beams upon, \nThey come down – are crowned. \n\nAnd this sun with springs \nTo market we’ll bring, \nHoist them over tussle and din, \nAnd for five faded roubles toss them \nTo some antiquarian. \n\nSouls spat on, slandered, \nInsolent, headstrong, \nAltars strew. \nOn them we'll light lamps for vesper nights, \nUntil it dawns anew.",
"(Title:) BEFORE A FRESH DAWN \n(Poet:) Vladimir SIlloff \n\nDaytimes \nOn the eerie, untrodden stairs \nWill stomp faster. \nThe trees will soon blossom, \nBut it will become clearer, \n\nThat spring, \nGnawed away by poets, \nHas come to an end. \nAnd the sun with black marks \nIs the end — the crown. \n\nAnd we will lug the sun and spring \nTo the store. \nWe will lug them through the noise and crush \nAnd throw them for a dull fifty kopecks \nTo the antique dealer's shop. \n\nIn souls spat upon, \nArrogant and strong, \nThere rooms with altars. \nAnd on them we will light lamps for vespers \nUntil a new dawn.",
"(Title:) THE WAYWARD STREETCAR (Stanzas 9-12) \n(Poet:) Nikolay Gumilev \n\nAnd in an alley with wooden-planked fencing, \nI see that house with three windows, gray lawns… \nStreetcar conductor, oh, why don't you stop here; \nHey, stop it now! Why, I have to get off! \n\nMasha, my darling, you lived and you sang here, \nWeaving a rug for the groom, for me; \nWhere is your voice now and where is your body? \nIt cannot be that you died, cannot be?! \n\nBut you were coughing, collapsed by the entrance, \nWhile, with a braid over-powdered and plumed, \nI went to get introduced to the Empress, \nSo that I never again met with you. \n\nI understand now: that all of our freedoms, \nOnly from there, are a light beating through; \nPeople and shades are amassed by the entrance \nInto a garden of planets, a zoo."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(4): Normalize()
)
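The module list above can be read as a pipeline: mean-pool the token vectors, pass the result through two bias-free linear layers with identity activation, then L2-normalize. Below is a toy sketch of that flow in plain Python. The 2- and 3-dimensional vectors and weights are made up for illustration and stand in for the real 768- and 3072-dimensional ones; the function names are hypothetical, and real mean pooling would also skip padding tokens.

```python
import math

def mean_pool(token_vectors):
    """Average token vectors element-wise (the 'mean tokens' pooling mode)."""
    n = len(token_vectors)
    return [sum(col) / n for col in zip(*token_vectors)]

def linear_no_bias(weights, x):
    """Bias-free linear layer with identity activation: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def l2_normalize(x):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

# Hypothetical toy inputs: 3 tokens with hidden size 2 (not real model values).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 2 -> 3, stands in for 768 -> 3072
w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 3 -> 2, stands in for 3072 -> 768

pooled = mean_pool(tokens)                  # [3.0, 4.0]
hidden = linear_no_bias(w_up, pooled)       # [3.0, 4.0, 7.0]
projected = linear_no_bias(w_down, hidden)  # [3.0, 4.0]
embedding = l2_normalize(projected)         # unit-length output vector
```

Because the last module normalizes the output, every embedding has length 1, so a dot product between two embeddings equals their cosine similarity.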
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("AlekseyCalvin/LyricalEmbeddingGemma_v1")
# Run inference
queries = [
"(Название:) РЕЙГАН-ПРОВОКАТОР \n(Поэт:) Андрей “Свин” Панов \n\nОпять нелётная погода, \nИ от судьбы не убежать. \nМы объявляем мораторий — \nАмериканцам наплевать… \nВ Женеве все переговоры \nДавно уже зашли в тупик. \nРейкьявик дал понять народу — \nКак Горбачёв у нас велик. \n\nА… \nРейган-провокатор! Рейган-провокатор! \nРейган-провокатор! Рейган-провокатор! \n\nВ Неваде так же, как и раньше, \nВзрывают много килотонн. \nРэй-ган устроил из планеты \nОгромный мощный полигон. \nНароды в голоде и в войнах \nТеряют тысячи жизне́й; \nА в Вашингтоне, Белом доме \nРешают как убить детей. \n\nТот… \nРейган-провокатор! Рейган-провокатор! \nРейган-провокатор! Рейган-провокатор!",
]
documents = [
"(Title:) REAGAN PROVOCATEUR \n(Poet:) Andrey “Swine” Panov \n\nAgain a weather barring travel, \nBut fate not easily escaped, \nOur moratorium is blaring — \nBut USA don't give a shit… \nNegotiations in Geneva, \nAchieve an impasse again, \nReykjavik really showed the people — \nOur Gorbachev the greater man. \n\nAnd… \nReagan — provocateur! Reagan — provocateur! \nReagan — provocateur! Reagan — provocateur! \n\nWhile in Nevada, same as ever, \nThey blow up many kilotons, \nRay-gun is splitting up our planet \nIn mighty weapons testing zones. \nAmid the war and hunger, peoples \nAre losing millions of lives; \nWhile back in Washington, the White House\nTries to ensure no child survives. \n\nExcept… \nReagan — provocateur! Reagan — provocateur! \nReagan — provocateur! Reagan — provocateur!",
"THAT PROVOCATIVE REAGAN \n(Poet:) Andrew “The Pig” Panoff \n\nOnce again, the weather is unfavorable for flying, \nAnd there is no escaping fate. \nWe declare a moratorium— \nThe Americans don't care… \nIn Geneva, all negotiations \nHave long since reached an impasse. \nReykjavik made it clear to the people — \nHow great Gorbachev is to us. \n\nAh... \nReagan the provocateur! Provocateur Reagan! \nReagan the provocateur! Provocateur Reagan! \n\nIn Nevada, just like before, \nThey detonate thousands of kilotons. \nReagan has turned the planet \nInto a huge, powerful testing ground. \nPeople are starving and fighting wars, \nLosing thousands of lives; \nAnd in Washington, in the White House, \nThey decide how to kill children. \n\nThat... \nProvocateur Reagan! Reagan the provocateur! \nHe... Provocateur Reagan! Reagan the provocateur!",
'(Title:) I GROW BITCHY \n(Songwriter: Yanka Dyagileva) \n\nI irrevocably grow bitchy \nEvery night, with every chuckle, \nEvery emptied out glass cup \nI go on boarding up the doors \nAnd letting mean and hungry dogs \nFrom all the chains run freely \nWhat else could we do – \nWe who inherit only kneecaps blistered over \nI irrevocably grow bitchy every time I \n\nI’m educated \nTo be iron barrel’s latched continuation \nOf a rifle the arm shaft \nSit if you wanna \nHave a smoke beside me on a little bench – into the ground staring \n\nWhere else could we go – we who inherit only dirtiest of pathways \nI irrevocably grow bitchy by the hour \n\nI’m irrevocably made bitchy \nEvery sighting of a cop hat, or a fancy mink fur hat \nOut where the wartime never ends, \nWhere springtime never really sets, where childhood never continues, \nWhere else could we turn – we who are left with only dreams and conversations, \nI irrevocably grow bitchy by the hour \nI irrevocably grow bitchy every step I \nI irrevocably grow bitchy every time I',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.1512, -0.5952, 0.8725]])
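The similarity scores above are typically used to rank documents against a query. The sketch below shows that ranking step in plain Python, assuming cosine similarity is the scoring function. The vectors are hypothetical 2-dimensional stand-ins, not real model output, and `rank_documents` is an illustrative helper rather than part of the library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)

# Hypothetical toy vectors chosen so the expected order is easy to check by hand.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]

ranking = rank_documents(query, docs)
for index, score in ranking:
    print(f"document {index}: {score:+.4f}")
```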
Columns: anchor, positive, and negative
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

| anchor | positive | negative |
|---|---|---|
| (Название:) ЗВЕЗДА ПО ИМЕНИ СОЛНЦЕ | (Title:) STAR KNOWN AS THE SUN | (Title:) THAT SUN NAMED STAR |
| (Название:) “Вечером желтым как зрелый колос…“
(Поэт:) Константин Вагинов
Вечером желтым как зрелый колос,
Средь случайных дорожных берез,
Цыганенок плакал голый,
Вспоминал он имя свое;
Но не мог никак он вспомнить –
Кто, откуда, зачем он здесь;
Слышал матери шепот любовный;
Но не видел ее нигде.
На дороге воробьи чирикают –
Чирик, чирик и по дороге скок;
И девушки уносят землянику;
Но завтра солнце озарит восток. | (Title:) “Come one evening, as yellow as ripe grain…“
(Poet:) Konstantin Vaginov
Come one evening, as yellow as ripe grain,
Between birches there strewn by the road,
Wept a small gypsy boy, bare-naked,
For his name he no longer recalled;
How he tried to, but couldn’t remember –
Wherefrom, or wherefore, he’d fared;
Heard his mother whispering tender;
He would look, but she wasn’t there.
On the road, baby sparrows chirp past him –
Chirpy-chirpy, up the road they would frisk;
Girls are passing with strawberry baskets;
But tomorrow, the sun would flood the east. | At Night…
(Poetry:) Vaginoff
At night, as yellowy as ripened ears of corn,
Amidst random roadside birches,
A naked gypsy-wanderer child cried.
He remembered his name,
But he couldn't remember
Who he was, where he came from, why he was here.
He heard his mommy’s loving speaking under her breath,
But he couldn't see her around him.
Sparrows squeak on the highway
Squeaking, squeaking, and bounce along the highway –
And women carry away strawberries,
But the very next day the sun will shine in the eastern direction. |
| (Название:) НА ЧËРНЫЙ ДЕНЬ
(Поэт:) Янка Дягилева
На чёрный день усталый танец пьяных глаз, дырявых рук
Второй упал, четвёртый сел, восьмого вывели на круг
На провода из-под колёс да на три буквы из-под асфальта
В тихий омут буйной головой
В холодный пот — расходятся круги
Железный конь, защитный цвет, резные гусеницы в ряд
Аттракцион для новичков — по кругу лошади летят
А заводной калейдоскоп гремит кривыми зеркалами
Колесо вращается быстрей
Под звуки марша головы долой
Поела моль цветную шаль, на картах тройка и семёрка
Бык хвостом сгоняя мух с тяжёлым сердцем лезет в горку
Лбов бильярдные шары от столкновения раскатились
Пополам по обе стороны
Да по углам просторов и широт
А за осколками витрин обрывки праздничных нарядов
Под полозьями саней живая плоть чужих раскладов
За прилавком попугай из шапки достаёт билеты на трамвай
До ближнего моста
На вертолёт без окон и дверей —
В тихий омут буйной головой —
Колесо вращается быстрей... | (Title:) COME RAINY DAY
(Poet:) Yanka Dyaghileva
Come rainy day a weary dance of drunken eyes, of clumsy hands
Odd person fell, each fourth was penned, each eighth was rounded for a spin
Across the wiring under wheels and triple letters under asphalt
Quiet washers whirlpool rowdy heads
Unto cold sweat — their spinners gather steam
The iron horse, its armored shade, etched rows of caterpillar tracks
A ride for novices designed — such horses flying endless rounds
While the kaleidoscope on cogs flips grating crooked fun house mirrors
And the wheel is spinning faster yet
Off with their heads, off to the marching band
A moth consumed the rainbow shawl, a three then seven in the cards
While swatting flies under its tail a bull grave hearted climbs a mount
Of forehead lobes, like billiard balls from a collision scattered
Over either side in equal portioned parts
And into nooks of latitudes and views
While over shattered store displays hang scraps of celebration wear
And... | (Title:) ON A DARK DAY
(Poet:) Yanka Dyagileva
On a dark day, the exhausted jig of intoxicated looks, hole-covered hands
The second collapsed, the fourth sat down, every eighth was led into the circle
On wires from under the tracks and on three letters from under the concrete
Into a quiet pool with a wild head
In chilled sweat frothing — circles spread
An metallic horsey, protective color, carved caterpillars in a row
An attraction for beginners — horses fly in a circle
And a wind-up kaleidoscope rattles with crooked mirrors
The wheel spins faster
To the sounds of the march, heads down
The moth ate the colorful fabric, on the cards appear a three and then a seven
The bull brushes flies with its tail and climbs the hill with a heavy spirit
Pool balls roll away from the bounce
In half on both sides
And in the corners of spaces and meridians
And behind the shards of shop windows, scraps of festive costumes
Under the runners of the sleigh, the living flesh of stranger... |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
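To make the parameters above concrete, here is a minimal sketch of the quantity this kind of ranking loss computes for a single anchor: cosine similarities to each candidate are multiplied by `scale`, and the loss is the cross-entropy of picking the positive among the candidates. This is an illustration under stated assumptions, not the library's implementation. The function names and toy vectors are hypothetical, and it assumes that with a batch size of 1 the candidates are just the anchor's own positive and negative.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def ranking_loss(anchor, candidates, positive_index=0, scale=20.0):
    """Cross-entropy of selecting the positive among all candidates."""
    logits = [scale * cos_sim(anchor, c) for c in candidates]
    # Numerically stable log-sum-exp over the scaled similarities.
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum_exp - logits[positive_index]

# Hypothetical toy embeddings (not model output).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # identical direction to the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

easy_loss = ranking_loss(anchor, [positive, negative])
# When both candidates are equally similar, the loss is log(2).
tied_loss = ranking_loss(anchor, [positive, positive])
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity translate into larger differences in loss.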
Non-default hyperparameters:
per_device_train_batch_size: 1
learning_rate: 2e-05
num_train_epochs: 4
warmup_ratio: 0.1
prompts: task: classification | query:

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 1
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: task: classification | query:
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 1.0 | 43 | 0.231 |
| 2.0 | 86 | 0.0008 |
| 3.0 | 129 | 0.0002 |
| 4.0 | 172 | 0.0 |
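The step counts in the log can be tied back to the hyperparameters with a little arithmetic: 43 steps per epoch over 4 epochs gives 172 steps, and a warmup ratio of 0.1 covers roughly the first tenth of them. The sketch below works this out and models a linear warmup followed by linear decay. It assumes warmup steps are rounded up and that the schedule decays to zero at the final step; `linear_schedule_lr` is a hypothetical helper written for illustration.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

steps_per_epoch = 43   # from the training log
num_epochs = 4         # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-05        # learning_rate

total_steps = steps_per_epoch * num_epochs            # 172
warmup_steps = math.ceil(total_steps * warmup_ratio)  # 18, assuming round-up

for step in (0, warmup_steps, total_steps // 2, total_steps):
    lr = linear_schedule_lr(step, total_steps, warmup_steps, peak_lr)
    print(f"step {step:3d}: lr = {lr:.2e}")
```

With a per-device batch size of 1 and no gradient accumulation, 43 steps per epoch would correspond to about 43 training triplets on a single device, though the log itself does not state the dataset size.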
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
google/embeddinggemma-300m