B0ketto committed · Commit f0a0665 · verified · 1 Parent(s): 3ac152b

garima77622/simaese2.0

README.md CHANGED
@@ -4,80 +4,80 @@ tags:
4
  - sentence-similarity
5
  - feature-extraction
6
  - generated_from_trainer
7
- - dataset_size:98546
8
  - loss:ContrastiveLoss
9
- base_model: sentence-transformers/all-mpnet-base-v2
10
  widget:
11
- - source_sentence: La soberanía nacional está reconocida en el conjunto de españoles,
12
- y significa que atañe a todos decidir al respecto, ya que todos forman el conjunto
13
- de la comunidad política. De hecho, el sistema autónomico funciona como una cascada
14
- competencial desde esa base.
15
  sentences:
16
- - Donde la mayoría de los votantes del no, no fueron por no ser vinculante.
17
- - El subsidio de Cataluña a ciertas regiones con menor nivel de bienestar resulta
18
- perverso para estas regiones, haciendo su economía menos competitiva y generadora
19
- de riqueza.
20
- - No creo que aporte al debate, Cataluña quiere separarse precisamente para alejarse
21
- de un sistema monolítico , anticuado que favorece la aparición de corrupción.
22
- - source_sentence: La tendencia mundial desde 1945 ha sido la de aumentar el número
23
- de estados soberanos, no la de disminuirlo. A pesar de ello la globalización \(economía,
24
- comunicaciones, transportes, migraciones, cultura, etc.\) nos ha traído más desarrollo.
25
  sentences:
26
- - No es realista confiar en que un país de apenas 7.7 millones de habitantes pueda
27
- sentarse a negociar con países diez o cien veces más grandes en una relación de
28
- igualdad o simetría.
29
- - 'A pesar de que ha aumentado el número de estados, las naciones más grandes, o
30
- federaciones, son las que han tenido más desarrollo: China, Rusia o EEUU, por
31
- ejemplo. Solo en bloque la Unión Europea ha podido competir de a igual por los
32
- mercados de terceras naciones.'
33
- - La independencia es una oportunidad para que España se saque de encima el régimen
34
- del 78, y construya una tercera república social, participativa y justa.
35
- - source_sentence: El Gobierno de Cataluña no está por encima de la ley catalana.
36
- En caso de independencia habría un tribunal superior en Cataluña independiente
37
- que puede anular leyes del parlamento. Esto pasa en cualquier estado democrático
38
- moderno.
 
39
  sentences:
40
- - Pero el tribunal sería un tribunal catalán, no un tribunal español.
41
- - Cataluña quiere entrar en la Unión Europea. El futuro de la UE es posible que
42
- pase por una unión fiscal en la que la voluntad de la mayoría se impondrá por
43
- encima de países pequeños como una posible Cataluña.
44
- - Si el nacionalismo es malo, y España es una nación, es malo pertenecer a España,
45
- por lo que el nacionalismo catalán es tan bueno o malo como el nacionalismo español.
46
- - source_sentence: Si la independencia se decidiera en referendum, todos los españoles
47
- deberían votar esa decisión ya que afecta a todos, y Cataluña ha sido España desde
48
- 1492.
 
 
 
 
 
49
  sentences:
50
- - Los partidarios del 155 no han tenido mayoría de votos
51
- - Cataluña forma parte de España como estado soberano desde 1715. Antes "España"
52
- era, desde 1492, un concepto para dar nombre a una unión dinástica, el equivalente
53
- de lo que hoy en día sería una federación o unión entre estados como la UE.
54
- - 'Hay muchos ejemplos de estados pequeños exitosos: Singapur, Taiwan, Costa Rica,
55
- Nueva Zelanda, paises escandinavos, Austria, Bélgica...'
56
- - source_sentence: Si el gobierno empezara a centralizar tendría enfrente a muchos
57
- más regiones que a Cataluña. Si el problema fuese únicamente catalán, la independencia
58
- podría ser una solución, pero hay otros muchos territorios incómodos que van a
59
- llegar más pronto que tarde a la misma conclusión. La confederación tiene mucho
60
- más futuro, ya que al final los territorios se van a necesitar.
61
  sentences:
62
- - Para contrarrestar el riesgo de descentralización, la futura España sin Cataluña
63
- debería federalizarse, a pesar de que ello implica un enorme cuestionamiento del
64
- sistema de poder español actual.
65
- - Cataluña tienen grandes deportistas, sin embargo la mayoría han ganado representando
66
- a la selección Española.
67
- - La Constitución española se fundamenta en la indisoluble unidad de la Nación española.
68
  pipeline_tag: sentence-similarity
69
  library_name: sentence-transformers
70
  ---
71
 
72
- # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
73
 
74
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
75
 
76
  ## Model Details
77
 
78
  ### Model Description
79
  - **Model Type:** Sentence Transformer
80
- - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 9a3225965996d404b775526de6dbfe85d3368642 -->
81
  - **Maximum Sequence Length:** 384 tokens
82
  - **Output Dimensionality:** 768 dimensions
83
  - **Similarity Function:** Cosine Similarity
@@ -119,9 +119,9 @@ from sentence_transformers import SentenceTransformer
119
  model = SentenceTransformer("sentence_transformers_model_id")
120
  # Run inference
121
  sentences = [
122
- 'Si el gobierno empezara a centralizar tendría enfrente a muchos más regiones que a Cataluña. Si el problema fuese únicamente catalán, la independencia podría ser una solución, pero hay otros muchos territorios incómodos que van a llegar más pronto que tarde a la misma conclusión. La confederación tiene mucho más futuro, ya que al final los territorios se van a necesitar.',
123
- 'Para contrarrestar el riesgo de descentralización, la futura España sin Cataluña debería federalizarse, a pesar de que ello implica un enorme cuestionamiento del sistema de poder español actual.',
124
- 'La Constitución española se fundamenta en la indisoluble unidad de la Nación española.',
125
  ]
126
  embeddings = model.encode(sentences)
127
  print(embeddings.shape)
@@ -175,19 +175,19 @@ You can finetune this model on your own dataset.
175
 
176
  #### Unnamed Dataset
177
 
178
- * Size: 98,546 training samples
179
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
180
  * Approximate statistics based on the first 1000 samples:
181
- | | sentence1 | sentence2 | label |
182
- |:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------|
183
- | type | string | string | int |
184
- | details | <ul><li>min: 12 tokens</li><li>mean: 55.71 tokens</li><li>max: 173 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 70.67 tokens</li><li>max: 180 tokens</li></ul> | <ul><li>0: ~60.00%</li><li>1: ~40.00%</li></ul> |
185
  * Samples:
186
- | sentence1 | sentence2 | label |
187
- |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
188
- | <code>La soberanía y la decisión sobre la unidad de España residen en el conjunto de España.</code> | <code>Apostar por un proceso de secesión es ir en contra de la globalización, la corriente histórica que vivimos.</code> | <code>1</code> |
189
- | <code>Apostar por un proceso de secesión es ir en contra de la globalización, la corriente histórica que vivimos.</code> | <code>La independencia de Cataluña choca contra el ideal consistente en que la humanidad como especie debe evolucionar a estar más unida, favoreciendo el intercambio científico y tecnológico.</code> | <code>1</code> |
190
- | <code>La independencia de Cataluña choca contra el ideal consistente en que la humanidad como especie debe evolucionar a estar más unida, favoreciendo el intercambio científico y tecnológico.</code> | <code>Los pueblos deben estar unidos y favorecer el diálogo para solucionar problemas que importan y mejorar así la convivencia.</code> | <code>1</code> |
191
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
192
  ```json
193
  {
@@ -323,79 +323,55 @@ You can finetune this model on your own dataset.
323
  ### Training Logs
324
  | Epoch | Step | Training Loss |
325
  |:------:|:-----:|:-------------:|
326
- | 0.0406 | 500 | 0.0312 |
327
- | 0.0812 | 1000 | 0.028 |
328
- | 0.1218 | 1500 | 0.027 |
329
- | 0.1624 | 2000 | 0.0308 |
330
- | 0.2029 | 2500 | 0.0281 |
331
- | 0.2435 | 3000 | 0.0267 |
332
- | 0.2841 | 3500 | 0.0268 |
333
- | 0.3247 | 4000 | 0.0266 |
334
- | 0.3653 | 4500 | 0.0274 |
335
- | 0.4059 | 5000 | 0.027 |
336
- | 0.4465 | 5500 | 0.0273 |
337
- | 0.4871 | 6000 | 0.0264 |
338
- | 0.5276 | 6500 | 0.0266 |
339
- | 0.5682 | 7000 | 0.0259 |
340
- | 0.6088 | 7500 | 0.0263 |
341
- | 0.6494 | 8000 | 0.0272 |
342
- | 0.6900 | 8500 | 0.0272 |
343
- | 0.7306 | 9000 | 0.0268 |
344
- | 0.7712 | 9500 | 0.0276 |
345
- | 0.8118 | 10000 | 0.0263 |
346
- | 0.8523 | 10500 | 0.0258 |
347
- | 0.8929 | 11000 | 0.025 |
348
- | 0.9335 | 11500 | 0.0262 |
349
- | 0.9741 | 12000 | 0.0287 |
350
- | 1.0147 | 12500 | 0.025 |
351
- | 1.0553 | 13000 | 0.0236 |
352
- | 1.0959 | 13500 | 0.0231 |
353
- | 1.1365 | 14000 | 0.0241 |
354
- | 1.1770 | 14500 | 0.0231 |
355
- | 1.2176 | 15000 | 0.0236 |
356
- | 1.2582 | 15500 | 0.0242 |
357
- | 1.2988 | 16000 | 0.0232 |
358
- | 1.3394 | 16500 | 0.0234 |
359
- | 1.3800 | 17000 | 0.0231 |
360
- | 1.4206 | 17500 | 0.0241 |
361
- | 1.4612 | 18000 | 0.0261 |
362
- | 1.5017 | 18500 | 0.0243 |
363
- | 1.5423 | 19000 | 0.0231 |
364
- | 1.5829 | 19500 | 0.0222 |
365
- | 1.6235 | 20000 | 0.0224 |
366
- | 1.6641 | 20500 | 0.0232 |
367
- | 1.7047 | 21000 | 0.0225 |
368
- | 1.7453 | 21500 | 0.022 |
369
- | 1.7859 | 22000 | 0.0216 |
370
- | 1.8264 | 22500 | 0.0215 |
371
- | 1.8670 | 23000 | 0.0219 |
372
- | 1.9076 | 23500 | 0.0217 |
373
- | 1.9482 | 24000 | 0.0218 |
374
- | 1.9888 | 24500 | 0.0217 |
375
- | 2.0294 | 25000 | 0.019 |
376
- | 2.0700 | 25500 | 0.0196 |
377
- | 2.1106 | 26000 | 0.0187 |
378
- | 2.1511 | 26500 | 0.0191 |
379
- | 2.1917 | 27000 | 0.0186 |
380
- | 2.2323 | 27500 | 0.0183 |
381
- | 2.2729 | 28000 | 0.0184 |
382
- | 2.3135 | 28500 | 0.0181 |
383
- | 2.3541 | 29000 | 0.0191 |
384
- | 2.3947 | 29500 | 0.0177 |
385
- | 2.4353 | 30000 | 0.0181 |
386
- | 2.4759 | 30500 | 0.0181 |
387
- | 2.5164 | 31000 | 0.0173 |
388
- | 2.5570 | 31500 | 0.0181 |
389
- | 2.5976 | 32000 | 0.0177 |
390
- | 2.6382 | 32500 | 0.0179 |
391
- | 2.6788 | 33000 | 0.0172 |
392
- | 2.7194 | 33500 | 0.0182 |
393
- | 2.7600 | 34000 | 0.0176 |
394
- | 2.8006 | 34500 | 0.0167 |
395
- | 2.8411 | 35000 | 0.0173 |
396
- | 2.8817 | 35500 | 0.0175 |
397
- | 2.9223 | 36000 | 0.0171 |
398
- | 2.9629 | 36500 | 0.017 |
399
 
400
 
401
  ### Framework Versions
 
4
  - sentence-similarity
5
  - feature-extraction
6
  - generated_from_trainer
7
+ - dataset_size:65698
8
  - loss:ContrastiveLoss
9
+ base_model: B0ketto/tmp_trainer
10
  widget:
11
+ - source_sentence: Enforcement of minor traffic offenses leads to the discovery of
12
+ more serious crimes.
 
 
13
  sentences:
14
+ - Western culture has created independent women who are strong on their own and
15
+ do not need the protection or support of their husband. This reduces the subjugation
16
+ of women.
17
+ - Philando Castile, stopped for a broken tailight, was shot seven times and killed
18
+ trying to comply with the officer's request for identification.
19
+ - The children will have several older / more mature stepmothers.
20
+ - source_sentence: Women and men can always file for divorce.
 
 
21
  sentences:
22
+ - A partner having multiple partners is taken care of enough. There is probably
23
+ less need to find even more partners. This is also a matter of free time, when
24
+ having multiple partners free time is probably rare.
25
+ - The power relations in polygamous marriages should be even more favorable to female
26
+ sponsored divorce as it is more likely that women can keep their children while
27
+ at the same time the man becomes less dependent on one woman emotionally.
28
+ - People close to the individual who commits suicide may feel that they could and
29
+ should have done more to prevent it, thus leaving them with intense feelings of
30
+ guilt.
31
+ - source_sentence: 'It''s okay that specific groups of people are not allowed to vote.
32
+ For example: children aren''t usually allowed to vote, because they are considered
33
+ too young - too inexperienced. The same kind of logic could be used to "filter
34
+ out" people who have very little knowledge of the world or terrible analytical
35
+ capabilities.'
36
  sentences:
37
+ - Those who have a medically diagnosed incapacity for voting should not be allowed
38
+ to vote, because they may be far more easily swayed to vote one way or another.
39
+ However, this must be regulated to medically diagnosed conditions on a mental
40
+ level.
41
+ - Representation is foundational to the American DNA. "No taxation without representation"
42
+ is one of our oldest grievance slogans. Removing the ability of any group to vote
43
+ reinstates this 400-year old injustice.
44
+ - Retailers would supposedly be able to sell the discarded bottles on, thereby making
45
+ a profit after the initial investment into the necessary infrastructure.
46
+ - source_sentence: 'It''s okay that specific groups of people are not allowed to vote.
47
+ For example: children aren''t usually allowed to vote, because they are considered
48
+ too young - too inexperienced. The same kind of logic could be used to "filter
49
+ out" people who have very little knowledge of the world or terrible analytical
50
+ capabilities.'
51
  sentences:
52
+ - Planned Parenthood is not only offering abortions but a host of other services,
53
+ such as clinical breast examination.
54
+ - Some budgetary problems for local law enforcement would be alleviated by removing
55
+ proactive policing duties from the officer's mission.
56
+ - The benefit is to keep those who you do not wish to vote, unable to pass the test.
57
+ This can lead to education suppression, as an example. There are vast amounts
58
+ of education imbalance which can be furthered to suppress votes from those who
59
+ wish to change the system-- ergo, suppressing those who would wrest power from
60
+ those who wish to maintain it through unfair means.
61
+ - source_sentence: For children, it is bad to grow up in a polygamous family.
 
62
  sentences:
63
+ - Polygamous families tend to have more children.
64
+ - The right of adults to marry should not be precluded by a person's distaste for
65
+ their marital structure. The same argument is used against same-sex marriage,
66
+ and it is invariably irrelevant.
67
+ - This threatens the idea of true democracy.
 
68
  pipeline_tag: sentence-similarity
69
  library_name: sentence-transformers
70
  ---
71
 
72
+ # SentenceTransformer based on B0ketto/tmp_trainer
73
 
74
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [B0ketto/tmp_trainer](https://huggingface.co/B0ketto/tmp_trainer). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
75
 
76
  ## Model Details
77
 
78
  ### Model Description
79
  - **Model Type:** Sentence Transformer
80
+ - **Base model:** [B0ketto/tmp_trainer](https://huggingface.co/B0ketto/tmp_trainer) <!-- at revision 3ac152b5b7c2227049ce77084d6de8c3b57acc4a -->
81
  - **Maximum Sequence Length:** 384 tokens
82
  - **Output Dimensionality:** 768 dimensions
83
  - **Similarity Function:** Cosine Similarity
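
The model description above names cosine similarity as the similarity function over 768-dimensional embeddings. As a minimal sketch of what that computes, the snippet below applies NumPy to random stand-in vectors rather than real model output. The helper name `cosine_similarity_matrix` and the random inputs are assumptions made for illustration; they are not part of the model card.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between the rows of `embeddings`."""
    # Scale every row to unit length; the dot product of unit vectors is their cosine.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T


# Stand-in for `model.encode(sentences)`: three random 768-dimensional vectors (assumption).
rng = np.random.default_rng(0)
stand_in = rng.normal(size=(3, 768))

similarities = cosine_similarity_matrix(stand_in)
print(similarities.shape)
```

Each entry `similarities[i, j]` lies in [-1, 1], and the diagonal is 1 because every vector is compared with itself.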
 
119
  model = SentenceTransformer("sentence_transformers_model_id")
120
  # Run inference
121
  sentences = [
122
+ 'For children, it is bad to grow up in a polygamous family.',
123
+ 'Polygamous families tend to have more children.',
124
+ 'This threatens the idea of true democracy.',
125
  ]
126
  embeddings = model.encode(sentences)
127
  print(embeddings.shape)
 
175
 
176
  #### Unnamed Dataset
177
 
178
+ * Size: 65,698 training samples
179
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
180
  * Approximate statistics based on the first 1000 samples:
181
+ | | sentence1 | sentence2 | label |
182
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
183
+ | type | string | string | int |
184
+ | details | <ul><li>min: 7 tokens</li><li>mean: 25.0 tokens</li><li>max: 130 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 31.05 tokens</li><li>max: 130 tokens</li></ul> | <ul><li>0: ~55.50%</li><li>1: ~44.50%</li></ul> |
185
  * Samples:
186
+ | sentence1 | sentence2 | label |
187
+ |:----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
188
+ | <code>Public opinion favors euthanasia which suggests some support for a right to die.</code> | <code>Europeans generally support euthanasia. For example, more than 70% of citizens of Spain, Germany, France and Britain are in favor.</code> | <code>1</code> |
189
+ | <code>Public opinion favors euthanasia which suggests some support for a right to die.</code> | <code>In the US, support for assisted suicide has risen to 69% acceptance rate in the last few decades.</code> | <code>1</code> |
190
+ | <code>Public opinion favors euthanasia which suggests some support for a right to die.</code> | <code>The young and healthy that are asked in polls cannot imagine a situation of disability. This, so the criticism goes, blurs their image of euthanasia.</code> | <code>0</code> |
191
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
192
  ```json
193
  {
 
323
  ### Training Logs
324
  | Epoch | Step | Training Loss |
325
  |:------:|:-----:|:-------------:|
326
+ | 0.0609 | 500 | 0.0256 |
327
+ | 0.1218 | 1000 | 0.0257 |
328
+ | 0.1826 | 1500 | 0.0263 |
329
+ | 0.2435 | 2000 | 0.0291 |
330
+ | 0.3044 | 2500 | 0.0276 |
331
+ | 0.3653 | 3000 | 0.0304 |
332
+ | 0.4262 | 3500 | 0.0297 |
333
+ | 0.4870 | 4000 | 0.0332 |
334
+ | 0.5479 | 4500 | 0.033 |
335
+ | 0.6088 | 5000 | 0.0328 |
336
+ | 0.6697 | 5500 | 0.0328 |
337
+ | 0.7305 | 6000 | 0.0331 |
338
+ | 0.7914 | 6500 | 0.0321 |
339
+ | 0.8523 | 7000 | 0.0326 |
340
+ | 0.9132 | 7500 | 0.0329 |
341
+ | 0.9741 | 8000 | 0.0318 |
342
+ | 1.0349 | 8500 | 0.0323 |
343
+ | 1.0958 | 9000 | 0.0321 |
344
+ | 1.1567 | 9500 | 0.0321 |
345
+ | 1.2176 | 10000 | 0.0322 |
346
+ | 1.2785 | 10500 | 0.0321 |
347
+ | 1.3393 | 11000 | 0.0317 |
348
+ | 1.4002 | 11500 | 0.0317 |
349
+ | 1.4611 | 12000 | 0.0315 |
350
+ | 1.5220 | 12500 | 0.0318 |
351
+ | 1.5829 | 13000 | 0.0319 |
352
+ | 1.6437 | 13500 | 0.0315 |
353
+ | 1.7046 | 14000 | 0.0313 |
354
+ | 1.7655 | 14500 | 0.0294 |
355
+ | 1.8264 | 15000 | 0.0292 |
356
+ | 1.8873 | 15500 | 0.0278 |
357
+ | 1.9481 | 16000 | 0.0286 |
358
+ | 2.0090 | 16500 | 0.0274 |
359
+ | 2.0699 | 17000 | 0.0273 |
360
+ | 2.1308 | 17500 | 0.027 |
361
+ | 2.1916 | 18000 | 0.0271 |
362
+ | 2.2525 | 18500 | 0.0265 |
363
+ | 2.3134 | 19000 | 0.0262 |
364
+ | 2.3743 | 19500 | 0.0254 |
365
+ | 2.4352 | 20000 | 0.0255 |
366
+ | 2.4960 | 20500 | 0.0256 |
367
+ | 2.5569 | 21000 | 0.0252 |
368
+ | 2.6178 | 21500 | 0.0246 |
369
+ | 2.6787 | 22000 | 0.0251 |
370
+ | 2.7396 | 22500 | 0.0238 |
371
+ | 2.8004 | 23000 | 0.025 |
372
+ | 2.8613 | 23500 | 0.0247 |
373
+ | 2.9222 | 24000 | 0.0252 |
374
+ | 2.9831 | 24500 | 0.0237 |
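
The Epoch and Step columns above, together with the stated size of 65,698 training samples, imply a number of optimizer steps per epoch. The sketch below works through that arithmetic for three rows of the new table. The batch size is not shown in the visible hunks, so the value it derives is an inference that assumes every step consumes one fixed-size batch, with no gradient accumulation and no multi-device splitting.

```python
import math

# Taken from the README diff: "Size: 65,698 training samples".
train_samples = 65_698

# (epoch, step) pairs copied from three rows of the new Training Logs table.
logged = [(0.0609, 500), (1.5220, 12_500), (2.9831, 24_500)]

# Epoch is logged as step / steps_per_epoch, so each row yields an estimate of steps_per_epoch.
steps_per_epoch_estimates = [step / epoch for epoch, step in logged]

# Assumption: one fixed-size batch per step, so batch size is samples / steps_per_epoch.
implied_batch_sizes = [train_samples / s for s in steps_per_epoch_estimates]
batch_size = round(sum(implied_batch_sizes) / len(implied_batch_sizes))

# Cross-check: recompute the epoch fractions from the inferred batch size.
steps_per_epoch = math.ceil(train_samples / batch_size)
recomputed = [round(step / steps_per_epoch, 4) for _, step in logged]

print(batch_size, steps_per_epoch, recomputed)
```

If the recomputed fractions match the logged ones to four decimals, the inferred batch size is consistent with the table. This does not confirm the training configuration itself.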
 
375
 
376
 
377
  ### Framework Versions
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+  "_name_or_path": "B0ketto/tmp_trainer",
   "architectures": [
     "MPNetModel"
   ],
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f2f135943cdb35b8ec7a07682bb1846bdc5d5e961e0bf1fb49c6017551b4b171
+oid sha256:a5f790f2dff8725951c8abc5046b6e8aa2483bd154d0572d75dcce97dc72028d
 size 437967672
runs/Feb20_07-46-02_0d37366320af/events.out.tfevents.1740037564.0d37366320af.2267.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15aff979b7352056db423f686145fdb71ef4b77aaa8b8a9b4f131e86b3d65928
+size 15142
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4051f632cedbd8c281ef799ae9edbff5498f73a2d202250def95a47ecfd77331
+oid sha256:5edd33958de292b0848e6d5bd07f04696391e3f89b4a367d175021c4bf137afd
 size 5560