How to use Culture-and-Morality-Lab/psyembedding-bert-large-uncased with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Culture-and-Morality-Lab/psyembedding-bert-large-uncased")
sentences = [
"I really shy away from early voting narratives, which I got into in another question thats posted here (I think!) The polls could absolutely be missing a meaningful chunk of Trumps support, as was the case in 16 and 20. Though its worth noting that he is polling much better this time around than the last two times. That could be an indication that the issue has been resolved, through some combination of new polling methods and more willingness of Trump supporters to answer polls and state their support for him. (I talked with one pollster who believes he missed in 20 because theyd get Trump voters on the phone but, for whatever reason, they wouldnt say they were voting for him. Now when voters hesitate, this pollster nudges them a little to pick a candidate. He thinks its fixed the problem. Well see.)\n\n\n\nI did write about some of the polling issues here:\n\n[",
"Those are ways to cope individually but to actually abolish the institution of capitalist employment relationships there has to be collective action. You can call it however you want, socialism/communism happens to be a banner that people who held these ideas have been fighting under for many many years.",
"everyones scared shitless of trump winning. every single one of us is voting again and then add in new voters and R voters going D and its a landslide win.",
"I know, that's why I said one of the reasons. The main reason is his uselessness when your team is bad"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
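As a rough illustration of the semantic search use case, the sketch below ranks corpus vectors by cosine similarity to a query vector. The helper name `top_k` and the toy 3-dimensional vectors are assumptions made for illustration; in practice the arrays would come from `model.encode(...)` and have 1024 columns.

```python
import numpy as np

def top_k(query_emb, corpus_emb, k=2):
    """Return (indices, scores) of the k corpus rows most cosine-similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity of each corpus row to the query
    order = np.argsort(-scores)[:k]     # highest scores first
    return order, scores[order]

# Toy vectors standing in for model.encode(...) output (hypothetical data).
corpus = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([1.0, 0.1, 0.0])
indices, scores = top_k(query, corpus, k=2)
print(indices, scores)
```

For large corpora a dedicated vector index would likely be a better fit than this brute-force scan.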
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
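The Pooling module above has only `pooling_mode_mean_tokens` enabled, which suggests each sentence vector is the average of its token vectors. A minimal NumPy sketch of masked mean pooling follows; the function name `mean_pool` and the toy 2-dimensional inputs are assumptions, not the library's own code.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against division by zero
    return summed / counts

# One sequence of three tokens; the last token is padding and must be ignored.
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(tokens, mask)
print(pooled)  # [[2. 3.]]
```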
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Culture-and-Morality-Lab/psyembedding-bert-large-uncased")
# Run inference
sentences = [
'Every American poet feels that the whole responsibility for contemporary poetry has fallen upon his shoulders, that he is a literary aristocracy of one.',
"I'm not sure why Spike Lee made this train wreck of a movie and conned poor Stevie Wonder into eternally pairing his beautiful music with this theatrical mess. I also resent the way he uses profanity as a part of the normal prose of professional Blacks. The abuse of his hold on ethnic movie goers is a shame. Scenes which seem to be contrived out the blue and have nothing to do with the theme or sub themes, play as if some college kid wrote this. I especially detest the ludicrous scene where the two leads are playfully sparring for no reason at all and the cops come and rough up Snipes. The overacting of the leads makes one feel as if Spike has no respect for his viewers or he has no clue what a movie is all about. The final scene appears to be thrown in to justify the use of a sledge hammer to tack a point in. This movie also supports the myth that all people of culture use the F-word in casual conversation. I am hoping he will realize that the rest of his movies are in the same pool as this one where he is not growing as a film maker. I think his union with Scorcesee in Clockers was a wise move. He should stick to making documentaries like the Four Little Colored Girls. Shock movies do not an Oscar make.",
'Many thousands of youth have been deprived of the benefit of education thereby, their morals ruined, and talents irretrievably lost to society, for want of cultivation: while two parties have been idly contending who should bestow it.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6358, 0.5940],
# [0.6358, 1.0000, 0.4347],
# [0.5940, 0.4347, 1.0000]])
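The symmetric matrix with a diagonal of 1.0 is consistent with pairwise cosine similarity, which is assumed here to be what `model.similarity` computes for this model. The sketch below reproduces that computation in NumPy on toy vectors; `cosine_similarity_matrix` is an illustrative name, not a library function.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Toy 2-dimensional vectors standing in for the [3, 1024] embeddings (hypothetical data).
emb = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
sims = cosine_similarity_matrix(emb, emb)
print(sims.shape)  # (3, 3)
print(sims)
```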
Dataset: similarity. Evaluated with EmbeddingSimilarityEvaluator.

| Metric | Value |
|---|---|
| pearson_cosine | 0.3694 |
| spearman_cosine | 0.3901 |
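The two metrics correlate the model's cosine similarities with the gold labels: Pearson on the raw values, Spearman on their ranks. The sketch below shows both under that reading; the helper names and the four gold and predicted scores are made up for illustration and are not taken from the evaluation set.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

gold = [0.0, 0.41, 0.58, 1.0]   # hypothetical gold labels
pred = [0.1, 0.3, 0.9, 0.8]     # hypothetical cosine similarities
print(pearson(gold, pred), spearman(gold, pred))
```

Spearman only depends on ordering, so it can differ from Pearson when the relationship between predictions and labels is monotonic but not linear.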
Columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |

| sentence_0 | sentence_1 | label |
|---|---|---|
| I AM ANGRY AT YOU BILLJ! YOU GOT PEOPLE BLOCKED FOR AS LONG AS YOU LIVE! I ASKED YOU TO STOP DELETING MY EDITS OR I WILL BLOCK YOU FOR ALl EONS YOU ASSHOLE! WIKIPEDIA IS NOT CENSORED SO STOP REMOVING MY FUCKING MESSAGES OR I WILL BEAT YOU UP SILLY! | The thing is i don't see any shyness from people supporting far right anymore. The life of avg Joe became signicantly shittier after covid and global conflicts. They are very vocal about their distaste. And they blame lefties and immigrants for their problems. So they are vocal and very organized. | 0.4082482904638631 |
| I understand that you may be confused, but you still shouldn't judge someone's sexual identity. Just because they haven't acted on all of their sexual inclinations, doesn't mean that they don't still have those feelings. Accept others as they present themselves. | The head of the Mormon church has married same sex couples in the temple because they were close family. It's all about $$$. | 0.5773502691896258 |
| Ugh there is so many bad decisions by conservative judges that need to be undone. | you say waste a draft pick on Manziel when we have Mallet. that's why I'm telling you to delete your account. You're retarded | 0.5773502691896258 |
CosineSimilarityLoss with these parameters:
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
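With `MSELoss` as `loss_fct`, the objective can be read as the mean squared error between the cosine similarity of each sentence pair and its float label. The NumPy sketch below illustrates that reading; `cosine_similarity_mse` and the toy pair embeddings are assumptions, not the library's implementation.

```python
import numpy as np

def cosine_similarity_mse(emb_a, emb_b, labels):
    """Mean squared error between row-wise cosine similarity and target labels."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)           # one similarity per sentence pair
    return float(np.mean((cos - labels) ** 2))

# Two hypothetical pairs: identical vectors (cos = 1) and orthogonal vectors (cos = 0).
emb_a = np.array([[1.0, 0.0], [1.0, 1.0]])
emb_b = np.array([[1.0, 0.0], [1.0, -1.0]])
labels = np.array([1.0, 0.5])
loss = cosine_similarity_mse(emb_a, emb_b, labels)
print(loss)  # mean of (0.0, 0.25)
```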
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | similarity_spearman_cosine |
|---|---|---|---|
| 0.0286 | 10 | - | -0.0664 |
| 0.0571 | 20 | - | -0.0621 |
| 0.0857 | 30 | - | -0.0581 |
| 0.1143 | 40 | - | -0.0516 |
| 0.1429 | 50 | - | -0.0444 |
| 0.1714 | 60 | - | -0.0334 |
| 0.2 | 70 | - | -0.0194 |
| 0.2286 | 80 | - | -0.0061 |
| 0.2571 | 90 | - | 0.0177 |
| 0.2857 | 100 | - | 0.0317 |
| 0.3143 | 110 | - | 0.0510 |
| 0.3429 | 120 | - | 0.0667 |
| 0.3714 | 130 | - | 0.0892 |
| 0.4 | 140 | - | 0.1206 |
| 0.4286 | 150 | - | 0.1584 |
| 0.4571 | 160 | - | 0.1821 |
| 0.4857 | 170 | - | 0.1716 |
| 0.5143 | 180 | - | 0.1749 |
| 0.5429 | 190 | - | 0.2192 |
| 0.5714 | 200 | - | 0.2473 |
| 0.6 | 210 | - | 0.2399 |
| 0.6286 | 220 | - | 0.2419 |
| 0.6571 | 230 | - | 0.2637 |
| 0.6857 | 240 | - | 0.2672 |
| 0.7143 | 250 | - | 0.2754 |
| 0.7429 | 260 | - | 0.2942 |
| 0.7714 | 270 | - | 0.3079 |
| 0.8 | 280 | - | 0.3079 |
| 0.8286 | 290 | - | 0.3077 |
| 0.8571 | 300 | - | 0.3012 |
| 0.8857 | 310 | - | 0.3148 |
| 0.9143 | 320 | - | 0.3199 |
| 0.9429 | 330 | - | 0.3306 |
| 0.9714 | 340 | - | 0.3363 |
| 1.0 | 350 | - | 0.3419 |
| 1.0286 | 360 | - | 0.3402 |
| 1.0571 | 370 | - | 0.3366 |
| 1.0857 | 380 | - | 0.3402 |
| 1.1143 | 390 | - | 0.3360 |
| 1.1429 | 400 | - | 0.3371 |
| 1.1714 | 410 | - | 0.3536 |
| 1.2 | 420 | - | 0.3268 |
| 1.2286 | 430 | - | 0.3443 |
| 1.2571 | 440 | - | 0.3011 |
| 1.2857 | 450 | - | 0.3549 |
| 1.3143 | 460 | - | 0.3321 |
| 1.3429 | 470 | - | 0.3505 |
| 1.3714 | 480 | - | 0.3412 |
| 1.4 | 490 | - | 0.3337 |
| 1.4286 | 500 | 0.1211 | 0.3488 |
| 1.4571 | 510 | - | 0.3486 |
| 1.4857 | 520 | - | 0.3508 |
| 1.5143 | 530 | - | 0.3561 |
| 1.5429 | 540 | - | 0.3592 |
| 1.5714 | 550 | - | 0.2950 |
| 1.6 | 560 | - | 0.3287 |
| 1.6286 | 570 | - | 0.3369 |
| 1.6571 | 580 | - | 0.3407 |
| 1.6857 | 590 | - | 0.3283 |
| 1.7143 | 600 | - | 0.3547 |
| 1.7429 | 610 | - | 0.3665 |
| 1.7714 | 620 | - | 0.3459 |
| 1.8 | 630 | - | 0.3614 |
| 1.8286 | 640 | - | 0.3514 |
| 1.8571 | 650 | - | 0.3714 |
| 1.8857 | 660 | - | 0.3647 |
| 1.9143 | 670 | - | 0.3601 |
| 1.9429 | 680 | - | 0.3292 |
| 1.9714 | 690 | - | 0.3321 |
| 2.0 | 700 | - | 0.3624 |
| 2.0286 | 710 | - | 0.3605 |
| 2.0571 | 720 | - | 0.3702 |
| 2.0857 | 730 | - | 0.3783 |
| 2.1143 | 740 | - | 0.3788 |
| 2.1429 | 750 | - | 0.3813 |
| 2.1714 | 760 | - | 0.3736 |
| 2.2 | 770 | - | 0.3762 |
| 2.2286 | 780 | - | 0.3804 |
| 2.2571 | 790 | - | 0.3805 |
| 2.2857 | 800 | - | 0.3755 |
| 2.3143 | 810 | - | 0.3647 |
| 2.3429 | 820 | - | 0.3654 |
| 2.3714 | 830 | - | 0.3767 |
| 2.4 | 840 | - | 0.3727 |
| 2.4286 | 850 | - | 0.3824 |
| 2.4571 | 860 | - | 0.3660 |
| 2.4857 | 870 | - | 0.3791 |
| 2.5143 | 880 | - | 0.3723 |
| 2.5429 | 890 | - | 0.3818 |
| 2.5714 | 900 | - | 0.3861 |
| 2.6 | 910 | - | 0.3861 |
| 2.6286 | 920 | - | 0.3857 |
| 2.6571 | 930 | - | 0.3825 |
| 2.6857 | 940 | - | 0.3680 |
| 2.7143 | 950 | - | 0.3750 |
| 2.7429 | 960 | - | 0.3815 |
| 2.7714 | 970 | - | 0.3851 |
| 2.8 | 980 | - | 0.3879 |
| 2.8286 | 990 | - | 0.3863 |
| 2.8571 | 1000 | 0.1033 | 0.3818 |
| 2.8857 | 1010 | - | 0.3882 |
| 2.9143 | 1020 | - | 0.3896 |
| 2.9429 | 1030 | - | 0.3899 |
| 2.9714 | 1040 | - | 0.3901 |
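The log reaches epoch 1.0 at step 350, so three epochs would come to roughly 1050 optimizer steps. Combined with `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate presumably decays linearly from 5e-05 to zero over that span. The sketch below illustrates such a schedule; the total of 1050 steps is inferred from the table rather than stated in the card, and `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=1050, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

steps_per_epoch = 350                 # inferred: epoch 1.0 is logged at step 350
total_steps = 3 * steps_per_epoch     # num_train_epochs: 3

for step in (0, 350, 700, 1050):
    print(step, linear_lr(step, total_steps=total_steps))
```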
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}