romain125 committed on
Commit
3bcea80
·
verified ·
1 Parent(s): e240d27

End of training

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +1 -0
  2. 1_Pooling/config.json +10 -0
  3. README.md +430 -0
  4. config.json +26 -0
  5. config_sentence_transformers.json +10 -0
  6. eval/binary_classification_evaluation_BinaryClassifEval_results.csv +15 -0
  7. model.safetensors +3 -0
  8. modules.json +20 -0
  9. my_test/chunks/england.c-0.json +1 -0
  10. my_test/chunks/england.c-1.json +1 -0
  11. my_test/chunks/england.c-10.json +1 -0
  12. my_test/chunks/england.c-11.json +1 -0
  13. my_test/chunks/england.c-12.json +1 -0
  14. my_test/chunks/england.c-13.json +1 -0
  15. my_test/chunks/england.c-14.json +1 -0
  16. my_test/chunks/england.c-15.json +1 -0
  17. my_test/chunks/england.c-16.json +1 -0
  18. my_test/chunks/england.c-17.json +1 -0
  19. my_test/chunks/england.c-18.json +1 -0
  20. my_test/chunks/england.c-19.json +1 -0
  21. my_test/chunks/england.c-2.json +1 -0
  22. my_test/chunks/england.c-20.json +1 -0
  23. my_test/chunks/england.c-21.json +1 -0
  24. my_test/chunks/england.c-22.json +1 -0
  25. my_test/chunks/england.c-23.json +1 -0
  26. my_test/chunks/england.c-24.json +1 -0
  27. my_test/chunks/england.c-25.json +1 -0
  28. my_test/chunks/england.c-26.json +1 -0
  29. my_test/chunks/england.c-27.json +1 -0
  30. my_test/chunks/england.c-28.json +1 -0
  31. my_test/chunks/england.c-29.json +1 -0
  32. my_test/chunks/england.c-3.json +1 -0
  33. my_test/chunks/england.c-30.json +1 -0
  34. my_test/chunks/england.c-31.json +1 -0
  35. my_test/chunks/england.c-32.json +1 -0
  36. my_test/chunks/england.c-33.json +1 -0
  37. my_test/chunks/england.c-34.json +1 -0
  38. my_test/chunks/england.c-35.json +1 -0
  39. my_test/chunks/england.c-36.json +1 -0
  40. my_test/chunks/england.c-37.json +1 -0
  41. my_test/chunks/england.c-38.json +1 -0
  42. my_test/chunks/england.c-39.json +1 -0
  43. my_test/chunks/england.c-4.json +1 -0
  44. my_test/chunks/england.c-40.json +1 -0
  45. my_test/chunks/england.c-41.json +1 -0
  46. my_test/chunks/england.c-42.json +1 -0
  47. my_test/chunks/england.c-43.json +1 -0
  48. my_test/chunks/england.c-44.json +1 -0
  49. my_test/chunks/england.c-5.json +1 -0
  50. my_test/chunks/england.c-6.json +1 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
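The pooling config above enables only `pooling_mode_mean_tokens`, and `modules.json` in this commit lists a `Normalize` module after it. As a rough illustration of what that combination computes, here is a minimal plain-Python sketch using toy 2-dimensional vectors instead of the real 384 dimensions. The function names `mean_pool` and `l2_normalize` are hypothetical helpers, not the library's API, and the actual implementation operates on batched tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over the positions where attention_mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in summed]


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


# Toy input: three token vectors, the last one is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
unit = l2_normalize(pooled)
print(pooled, unit)
```

Because the output is unit length, a dot product between two such embeddings equals their cosine similarity, which matches the `"similarity_fn_name": "cosine"` setting recorded elsewhere in this commit.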
README.md ADDED
@@ -0,0 +1,430 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:2467
+ - loss:MultipleNegativesRankingLoss
+ base_model: intfloat/multilingual-e5-small
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ - cosine_accuracy_threshold
+ - cosine_f1
+ - cosine_f1_threshold
+ - cosine_precision
+ - cosine_recall
+ - cosine_ap
+ - cosine_mcc
+ model-index:
+ - name: SentenceTransformer based on intfloat/multilingual-e5-small
+   results:
+   - task:
+       type: binary-classification
+       name: Binary Classification
+     dataset:
+       name: BinaryClassifEval
+       type: BinaryClassifEval
+     metrics:
+     - type: cosine_accuracy
+       value: 0.8
+       name: Cosine Accuracy
+     - type: cosine_accuracy_threshold
+       value: 0.8466682434082031
+       name: Cosine Accuracy Threshold
+     - type: cosine_f1
+       value: 0.888888888888889
+       name: Cosine F1
+     - type: cosine_f1_threshold
+       value: 0.8466682434082031
+       name: Cosine F1 Threshold
+     - type: cosine_precision
+       value: 1.0
+       name: Cosine Precision
+     - type: cosine_recall
+       value: 0.8
+       name: Cosine Recall
+     - type: cosine_ap
+       value: 1.0
+       name: Cosine Ap
+     - type: cosine_mcc
+       value: 0.0
+       name: Cosine Mcc
+ ---
+
+ # SentenceTransformer based on intfloat/multilingual-e5-small
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) on the json dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) <!-- at revision c007d7ef6fd86656326059b28395a7a03a7c5846 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - json
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'The weather is lovely today.',
+     "It's so sunny outside!",
+     'He drove to the stadium.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Binary Classification
+
+ * Dataset: `BinaryClassifEval`
+ * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+
+ | Metric                    | Value   |
+ |:--------------------------|:--------|
+ | cosine_accuracy           | 0.8     |
+ | cosine_accuracy_threshold | 0.8467  |
+ | cosine_f1                 | 0.8889  |
+ | cosine_f1_threshold       | 0.8467  |
+ | cosine_precision          | 1.0     |
+ | cosine_recall             | 0.8     |
+ | **cosine_ap**             | **1.0** |
+ | cosine_mcc                | 0.0     |
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### json
+
+ * Dataset: json
+ * Size: 2,467 training samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence1 | sentence2 | label |
+   |:--------|:----------|:----------|:------|
+   | type    | string | string | int |
+   | details | <ul><li>min: 142 tokens</li><li>mean: 260.2 tokens</li><li>max: 340 tokens</li></ul> | <ul><li>min: 29 tokens</li><li>mean: 34.4 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | label |
+   |:----------|:----------|:------|
+ | <code>Type de project: L’excès de précipitations tout au long de l’année a conduit à une chute spectaculaire des rendements des céréales d’été et des protéagineux (blé, orge, pois, féverole, etc.) que produisent 90% des agriculteurs d’Île-de-France, historique grenier à blé du pays. Tributaires naturels du fleurissement des cultures, les apiculteurs professionnels de la région ont également souffert de ces dérèglements climatiques.La Région accompagne les exploitations concernées en leur apportant une aide exceptionnelle.</code> | <code>'excès de précipitations':phénomène|DIMINUE|'rendements des protéagineux':concept</code> | <code>1</code> |
+ | <code>Type de project: Dans le cadre de sa stratégie « Impact 2028 », la Région s’engage dans la défense de la souveraineté industrielle en renforçant son soutien à une industrie circulaire et décarbonée, porteuse d’innovations et créatrice d’emplois. PM'up Jeunes pousses industrielles soutient les projets d’implantation d’une première usine tournée vers la décarbonation, l’efficacité énergétique et la circularité des processus de production. Ces projets peuvent prendre l'une de ces formes : Une première unité de production industrielle, après une phase de prototypage,Une ligne pilote de production industrielle, en interne ou chez un tiers situé en Île-de-France, à condition que sa production soit destinée à de premières commercialisations,La transformation d’une unité de production pilote à une unité de production industrielle</code> | <code>'Région Île-de-France':organisation|soutient|'industrie décarbonée':concept</code> | <code>1</code> |
+ | <code>Procédures et démarches: Le dépôt des demandes de subvention se fait en ligne sur la plateforme régionale mesdemarches.iledefrance.fr : Session de dépôt unique pour les nouvelles demandes : du 30 septembre au 4 novembre 2024 (11 heures) pour des festivals qui se déroulent entre le 1er mars 2025 et le 28 février 2026 (vote à la CP de mars 2025). Pour les demandes de renouvellement, un mail est envoyé aux structures concernées par le service du Spectacle vivant en amont de chaque session de dépôt.<br>Bénéficiaires: Professionnel - Culture, Association - Fondation, Association - Régie par la loi de 1901, Association - ONG, Collectivité ou institution - Communes de 10 000 à 20 000 hab, Collectivité ou institution - Autre (GIP, copropriété, EPA...), Collectivité ou institution - Communes de 2000 à 10 000 hab, Collectivité ou institution - Communes de < 2000 hab, Collectivité ou institution - Communes de > 20 000 hab, Collectivité ou institution - Département, Collectivité ou institution - EPC...</code> | <code>'Collectivité ou institution - EPCI':bénéficiaire|PEUT_BÉNÉFICIER|'demandes de subvention':procédure</code> | <code>1</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
+ ### Evaluation Dataset
+
+ #### json
+
+ * Dataset: json
+ * Size: 616 evaluation samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 616 samples:
+   |         | sentence1 | sentence2 | label |
+   |:--------|:----------|:----------|:------|
+   | type    | string | string | int |
+   | details | <ul><li>min: 31 tokens</li><li>mean: 86.2 tokens</li><li>max: 160 tokens</li></ul> | <ul><li>min: 23 tokens</li><li>mean: 25.8 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | label |
+   |:----------|:----------|:------|
+ | <code>Type de project: Le programme propose des rencontres le samedi après-midi dans une université ou une grande école réputée, entre les professionnels bénévoles et les lycéens et collégiens sous la forme d'atelier thématiques. Ces moments de rencontre touchent à une grande multitude de domaines d’activités. L'objectif est de donner l’opportunité aux jeunes les plus enclavés d’échanger avec des intervenants professionnels aux parcours atypiques et inspirants. Les intervenants suscitent les ambitions et élargissent les perspectives des élèves.</code> | <code>'rencontres':événement|impliquent|'professionnels bénévoles':groupe</code> | <code>1</code> |
+ | <code>Précision sure les bénéficiaires: Communes,Établissements publics de coopération intercommunale (avec ou sans fiscalité propre),Établissements publics territoriaux franciliens,Départements,Aménageurs publics et privés (lorsque ces derniers interviennent à la demande ou pour le compte d'une collectivité précitée).</code> | <code>'Aménageurs privés':entité|INTERVIENT_POUR|'Départements':entité</code> | <code>1</code> |
+ | <code>Date de début: non précisée<br>Date de fin (clôture): non précisée<br>Date de début de la future campagne: non précisée</code> | <code>'Date de fin':concept|EST|'non précisée':__inferred__</code> | <code>1</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `gradient_accumulation_steps`: 4
+ - `learning_rate`: 0.00037437328820906734
+ - `num_train_epochs`: 2
+ - `lr_scheduler_type`: cosine
+ - `warmup_steps`: 32
+ - `bf16`: True
+ - `half_precision_backend`: cpu_amp
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 4
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 0.00037437328820906734
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 2
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 32
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: cpu_amp
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch   | Step  | Validation Loss | BinaryClassifEval_cosine_ap |
+ |:-------:|:-----:|:---------------:|:---------------------------:|
+ | 1.0     | 1     | 0.5580          | 1.0                         |
+ | **2.0** | **2** | **0.529**       | **1.0**                     |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.48.3
+ - PyTorch: 2.6.0+cpu
+ - Accelerate: 1.4.0
+ - Datasets: 3.3.2
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
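The README above names `MultipleNegativesRankingLoss` with `"scale": 20.0` and `"similarity_fct": "cos_sim"`. As a rough sketch of the idea (not the library's implementation), the loss can be read as a cross-entropy over in-batch candidates: each `sentence1` should score highest against its own `sentence2`, and every other `sentence2` in the batch acts as a negative. The function names and the toy 2-dimensional embeddings below are hypothetical.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.1], [0.1, 1.0]])  # each anchor near its own positive
swapped = mnr_loss(anchors, [[0.1, 1.0], [1.0, 0.1]])  # positives paired with the wrong anchor
print(aligned, swapped)
```

Under this reading, a duplicate `sentence2` inside one batch would be counted as a negative for its own anchor, which is presumably why the run sets `batch_sampler` to `no_duplicates`.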
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "intfloat/multilingual-e5-small",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "tokenizer_class": "XLMRobertaTokenizer",
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.3",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 250037
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.48.3",
+     "pytorch": "2.6.0+cpu"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
eval/binary_classification_evaluation_BinaryClassifEval_results.csv ADDED
@@ -0,0 +1,15 @@
+ epoch,steps,cosine_accuracy,cosine_accuracy_threshold,cosine_f1,cosine_precision,cosine_recall,cosine_f1_threshold,cosine_ap,cosine_mcc
+ 1.0,5,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 1.0,5,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 2.0,10,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 2.0,10,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 1.0,1,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 0,0,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 0,0,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 0,0,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 1.0,1,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 2.0,2,0.8,0.84666824,0.888888888888889,1.0,0.8,0.84666824,1.0,0.0
+ 2.0,2,0.8,0.84666824,0.888888888888889,1.0,0.8,0.84666824,1.0,0.0
+ 1.0,1,0.8,0.8514068,0.888888888888889,1.0,0.8,0.8514068,1.0,0.0
+ 2.0,2,0.8,0.8477378,0.888888888888889,1.0,0.8,0.8477378,1.0,0.0
+ 2.0,2,0.8,0.8477378,0.888888888888889,1.0,0.8,0.8477378,1.0,0.0
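Every row in this CSV reports the same accuracy, precision, recall, F1, and MCC. Those values are what one would get from an evaluation set that contains only positive pairs, with one pair in five falling below the cosine threshold. The sketch below reproduces the figures from hypothetical confusion-matrix counts (4 true positives, 1 false negative). The counts are an assumption chosen to match the ratios; the real evaluation set size is not shown in this file. Returning 0 for MCC when its denominator is zero is a common convention and is assumed here.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # With no negative examples the denominator is zero; report 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts: five positive pairs, four of them above the threshold.
metrics = binary_metrics(tp=4, fp=0, fn=1, tn=0)
print(metrics)
```

If this reading is right, the MCC of 0.0 says little about model quality: with a single class present, the metric is undefined rather than poor.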
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:12b35ac02b183ee18c2678448f57ee845d28bd2e1ed082c746ba90e172a31d8c
+ size 470637416
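The LFS pointer reports a size of 470,637,416 bytes, which can be cross-checked against the dimensions in `config.json`. The sketch below counts parameters for a standard BERT encoder layout (embeddings, 12 transformer layers, and a pooler head) and multiplies by 4 bytes for `float32`. The layout, including the presence of the pooler weights in the file, is an assumption, and `bert_parameter_count` is a hypothetical helper.

```python
def bert_parameter_count(vocab_size, hidden, intermediate, layers, max_positions, type_vocab):
    """Parameter count for a standard BERT encoder with a pooler head."""
    # Word, position, and token-type embeddings, plus one LayerNorm (weight and bias).
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + 2 * hidden
    # Query, key, value, and attention-output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers: hidden -> intermediate -> hidden, each with a bias.
    feed_forward = hidden * intermediate + intermediate + intermediate * hidden + hidden
    # Two LayerNorms per layer, each with weight and bias.
    layer_norms = 2 * 2 * hidden
    per_layer = attention + feed_forward + layer_norms
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler


# Values copied from config.json in this commit.
params = bert_parameter_count(
    vocab_size=250037, hidden=384, intermediate=1536,
    layers=12, max_positions=512, type_vocab=2,
)
weight_bytes = params * 4      # float32 weights
reported_size = 470637416      # size from the LFS pointer above
print(params, weight_bytes, reported_size - weight_bytes)
```

Under these assumptions the weights account for 470,615,040 bytes, leaving about 22 KB unexplained. That remainder would be consistent with the safetensors header and metadata, though this is not verified here. Most of the parameters sit in the 250,037-entry vocabulary embedding rather than in the transformer layers.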
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
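`modules.json` describes the inference pipeline as an ordered list of modules. A small sketch of reading it with the standard library is shown below; it only parses the file and lists the stages in `idx` order. Treating an empty `path` as the repository root is an assumption, and `pipeline_stages` is a hypothetical helper rather than part of any library.

```python
import json

# Contents of modules.json from this commit, condensed to one entry per line.
MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]
"""


def pipeline_stages(modules_text):
    """Return (class name, folder) pairs in execution order."""
    modules = sorted(json.loads(modules_text), key=lambda module: module["idx"])
    # Assumption: an empty path means the module's files live in the repository root.
    return [(module["type"].rsplit(".", 1)[-1], module["path"] or ".") for module in modules]


stages = pipeline_stages(MODULES_JSON)
for class_name, folder in stages:
    print(f"{class_name:<12} <- {folder}")
```

The resulting order (Transformer, Pooling, Normalize) matches the "Full Model Architecture" listing in the README.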
my_test/chunks/england.c-0.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-0","content":"England is a country that is part of the United Kingdom.[7] It is located on the island of Great Britain, of which it covers about 62%, and more than 100 smaller adjacent islands. It has land borders with Scotland to the north and Wales to the west, and is otherwise surrounded by the North Sea to the east, the English Channel to the south, the Celtic Sea to the south-west, and the Irish Sea to the west. Continental Europe lies to the south-east, and Ireland to the west. At the 2021 census, the population was 56,490,048.[1] London is both the largest city and the capital.","order_int":0,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":135,"section_id":"england","section":{}}}
my_test/chunks/england.c-1.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-1","content":"The area now called England was first inhabited by modern humans during the Upper Paleolithic. It takes its name from the Angles, a Germanic tribe who settled during the 5th and 6th centuries. England became a unified state in the 10th century and has had extensive cultural and legal impact on the wider world since the Age of Discovery, which began during the 15th century.[8] The Kingdom of England, which included Wales after 1535, ceased to be a separate sovereign state on 1 May 1707, when the Acts of Union brought into effect a political union with the Kingdom of Scotland that created the Kingdom of Great Britain.[9]","order_int":1,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":126,"section_id":"england","section":{}}}
my_test/chunks/england.c-10.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-10","content":"There is debate about when Christianity was first introduced; it was no later than the 4th century, probably much earlier. According to Bede, missionaries were sent from Rome by Eleutherius at the request of the chieftain Lucius of Britain in 180 AD, to settle differences as to Eastern and Western ceremonials, which were disturbing the church. There are traditions linked to Glastonbury claiming an introduction through Joseph of Arimathea, while others claim through Lucius of Britain.[35] By 410, during the decline of the Roman Empire, Britain was left exposed by the end of Roman rule in Britain and the withdrawal of Roman army units, to defend the frontiers in continental Europe and partake in civil wars.[36] Celtic Christian monastic and missionary movements flourished. This period of Christianity was influenced by ancient Celtic culture in its sensibilities, polity, practices and theology. Local \"congregations\" were centred in the monastic community and monastic leaders were more like chieftains, as peers, rather than in the more hierarchical system of the Roman-dominated church.[37]","order_int":10,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":216,"section_id":"england","section":{}}}
my_test/chunks/england.c-11.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-11","content":"Middle Ages\nMain article: England in the Middle Ages\nStudded and decorated metallic mask of human face.\nReplica of the 7th-century ceremonial Sutton Hoo helmet from the Kingdom of East Anglia","order_int":11,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":37,"section_id":"england","section":{}}}
my_test/chunks/england.c-12.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-12","content":"Roman military withdrawals left Britain open to invasion by pagan, seafaring warriors from north-western continental Europe, chiefly the Saxons, Angles, Jutes and Frisians who had long raided the coasts of the Roman province. These groups then began to settle in increasing numbers over the course of the fifth and sixth centuries, initially in the eastern part of the country.[36] Their advance was contained for some decades after the Britons' victory at the Battle of Mount Badon, but subsequently resumed, overrunning the fertile lowlands of Britain and reducing the area under Brittonic control to a series of separate enclaves in the more rugged country to the west by the end of the 6th century. Contemporary texts describing this period are extremely scarce, giving rise to its description as a Dark Age","order_int":12,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":158,"section_id":"england","section":{}}}
my_test/chunks/england.c-13.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-13","content":". Details of the Anglo-Saxon settlement of Britain are consequently subject to considerable disagreement; the emerging consensus is that it occurred on a large scale in the south and east but was less substantial to the north and west, where Celtic languages continued to be spoken even in areas under Anglo-Saxon control.[38][39] Roman-dominated Christianity had, in general, been replaced in the conquered territories by Anglo-Saxon paganism, but was reintroduced by missionaries from Rome led by Augustine from 597.[40] Disputes between the Roman- and Celtic-dominated forms of Christianity ended in victory for the Roman tradition at the Council of Whitby (664), which was ostensibly about tonsures (clerical haircuts) and the date of Easter, but more significantly, about the differences in Roman and Celtic forms of authority, theology, and practice.[37]","order_int":13,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":176,"section_id":"england","section":{}}}
my_test/chunks/england.c-14.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-14","content":"During the settlement period the lands ruled by the incomers seem to have been fragmented into numerous tribal territories, but by the 7th century, when substantial evidence of the situation again becomes available, these had coalesced into roughly a dozen kingdoms including Northumbria, Mercia, Wessex, East Anglia, Essex, Kent and Sussex. Over the following centuries, this process of political consolidation continued.[41] The 7th century saw a struggle for hegemony between Northumbria and Mercia, which in the 8th century gave way to Mercian preeminence.[42] In the early 9th century Mercia was displaced as the foremost kingdom by Wessex. Later in that century escalating attacks by the Danes culminated in the conquest of the north and east of England, overthrowing the kingdoms of Northumbria, Mercia and East Anglia. Wessex under Alfred the Great was left as the only surviving English kingdom, and under his successors, it steadily expanded at the expense of the kingdoms of the Danelaw. This brought about the political unification of England, first accomplished under Æthelstan in 927 and definitively established after further conflicts by Eadred in 953","order_int":14,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":237,"section_id":"england","section":{}}}
my_test/chunks/england.c-15.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-15","content":". A fresh wave of Scandinavian attacks from the late 10th century ended with the conquest of this united kingdom by Sweyn Forkbeard in 1013 and again by his son Cnut in 1016, turning it into the centre of a short-lived North Sea Empire that also included Denmark and Norway. However, the native royal dynasty was restored with the accession of Edward the Confessor in 1042.","order_int":15,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":80,"section_id":"england","section":{}}}
my_test/chunks/england.c-16.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-16","content":"King Henry V at the Battle of Agincourt, 1415.\nKing Henry V at the Battle of Agincourt, fought on Saint Crispin's Day and concluded with an English victory against a larger French army in the Hundred Years' War\nA dispute over the succession to Edward led to an unsuccessful Norwegian Invasion in September 1066 close to York in the North, and the successful Norman Conquest in October 1066, accomplished by an army led by Duke William of Normandy invading at Hastings late September 1066.[43] The Normans themselves originated from Scandinavia and had settled in Normandy in the late 9th and early 10th centuries.[44] This conquest led to the almost total dispossession of the English elite and its replacement by a new French-speaking aristocracy, whose speech had a profound and permanent effect on the English language.[45]","order_int":16,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":172,"section_id":"england","section":{}}}
my_test/chunks/england.c-17.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-17","content":"Subsequently, the House of Plantagenet from Anjou inherited the English throne under Henry II, adding England to the budding Angevin Empire of fiefs the family had inherited in France including Aquitaine.[46] They reigned for three centuries, some noted monarchs being Richard I, Edward I, Edward III and Henry V.[46] The period saw changes in trade and legislation, including the signing of Magna Carta, an English legal charter used to limit the sovereign's powers by law and protect the privileges of freemen. Catholic monasticism flourished, providing philosophers, and the universities of Oxford and Cambridge were founded with royal patronage. The Principality of Wales became a Plantagenet fief during the 13th century[47] and the Lordship of Ireland was given to the English monarchy by the Pope. During the 14th century, the Plantagenets and the House of Valois claimed to be legitimate claimants to the House of Capet and of France; the two powers clashed in the Hundred Years' War.[48] The Black Death epidemic hit England; starting in 1348, it eventually killed up to half of England's inhabitants.[49]","order_int":17,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":235,"section_id":"england","section":{}}}
my_test/chunks/england.c-18.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-18","content":"Between 1453 and 1487, a civil war known as the War of the Roses waged between the two branches of the royal family, the Yorkists and Lancastrians.[50] Eventually it led to the Yorkists losing the throne entirely to a Welsh noble family the Tudors, a branch of the Lancastrians headed by Henry Tudor who invaded with Welsh and Breton mercenaries, gaining victory at the Battle of Bosworth Field where the Yorkist king Richard III was killed.[51]\n\nEarly modern period\n\nKing Henry VIII (1491–1547)","order_int":18,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":113,"section_id":"england","section":{}}}
my_test/chunks/england.c-19.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-19","content":"Queen Elizabeth I (1558–1603)\nDuring the Tudor period, England began to develop naval skills, and exploration intensified in the Age of Discovery.[52] Henry VIII broke from communion with the Catholic Church, over issues relating to his divorce, under the Acts of Supremacy in 1534 which proclaimed the monarch head of the Church of England. In contrast with much of European Protestantism, the roots of the split were more political than theological.[d] He also legally incorporated his ancestral land Wales into the Kingdom of England with the 1535–1542 acts. There were internal religious conflicts during the reigns of Henry's daughters, Mary I and Elizabeth I. The former took the country back to Catholicism while the latter broke from it again, forcefully asserting the supremacy of Anglicanism. The Elizabethan era is the epoch in the Tudor age of the reign of Queen Elizabeth I (\"the Virgin Queen\"). Historians often depict it as the golden age in English history that represented the apogee of the English Renaissance and saw the flowering of great art, drama, poetry, music and literature.[54] England during this period had a centralised, well-organised, and effective government.[55]","order_int":19,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":244,"section_id":"england","section":{}}}
my_test/chunks/england.c-2.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-2","content":"England is the origin of the English language, the English legal system (which served as the basis for the common law systems of many other countries), association football, and the Anglican branch of Christianity; its parliamentary system of government has been widely adopted by other nations.[10] The Industrial Revolution began in 18th-century England, transforming its society into the world's first industrialised nation.[11] England is home to the two oldest universities in the English-speaking world: the University of Oxford, founded in 1096, and the University of Cambridge, founded in 1209. Both universities are ranked among the most prestigious in the world.[12][13]","order_int":2,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":136,"section_id":"england","section":{}}}
my_test/chunks/england.c-20.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-20","content":"Competing with Spain, the first English colony in the Americas was founded in 1585 by explorer Walter Raleigh in Virginia and named Roanoke. The Roanoke colony failed and is known as the lost colony after it was found abandoned on the return of the late-arriving supply ship.[56] With the East India Company, England also competed with the Dutch and French in the East. During the Elizabethan period, England was at war with Spain. An armada sailed from Spain in 1588 as part of a wider plan to invade England and re-establish a Catholic monarchy. The plan was thwarted by bad coordination, stormy weather and successful harrying attacks by an English fleet under Lord Howard of Effingham. This failure did not end the threat: Spain launched two further armadas, in 1596 and 1597, but both were driven back by storms.\n\nUnion with Scotland\nFurther information: Union of the Crowns and Treaty of Union","order_int":20,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":184,"section_id":"england","section":{}}}
my_test/chunks/england.c-21.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-21","content":"King of Scotland, James VI, became King of England as James I in 1603, forming the Union of the Crowns\nThe political structure of the island changed in 1603, when the King of Scots, James VI, a kingdom which had been a long-time rival to English interests, inherited the throne of England as James I, thereby creating a personal union.[57] He styled himself King of Great Britain, although this had no basis in English law.[58] Under the auspices of James VI and I the Authorised King James Version of the Holy Bible was published in 1611. It was the standard version of the Bible read by most Protestant Christians for four hundred years until modern revisions were produced in the 20th century.","order_int":21,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":143,"section_id":"england","section":{}}}
my_test/chunks/england.c-22.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-22","content":"Based on conflicting political, religious and social positions, the English Civil War was fought between the supporters of Parliament and those of King Charles I, known colloquially as Roundheads and Cavaliers respectively. This was an interwoven part of the wider multifaceted Wars of the Three Kingdoms, involving Scotland and Ireland. The Parliamentarians were victorious, Charles I was executed and the kingdom replaced by the Commonwealth. Leader of the Parliament forces, Oliver Cromwell declared himself Lord Protector in 1653; a period of personal rule followed.[59] After Cromwell's death and the resignation of his son Richard as Lord Protector, Charles II was invited to return as monarch in 1660, in a move called the Restoration. With the reopening of theatres, fine arts, literature and performing arts flourished throughout the Restoration of the \"Merry Monarch\" Charles II.[60] After the Glorious Revolution of 1688, it was constitutionally established that King and Parliament should rule together, though Parliament would have the real power. This was established with the Bill of Rights in 1689","order_int":22,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":207,"section_id":"england","section":{}}}
my_test/chunks/england.c-23.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-23","content":". Among the statutes set down were that the law could only be made by Parliament and could not be suspended by the King, also that the King could not impose taxes or raise an army without the prior approval of Parliament.[61] Also since that time, no British monarch has entered the House of Commons when it is sitting, which is annually commemorated at the State Opening of Parliament by the British monarch when the doors of the House of Commons are slammed in the face of the monarch's messenger, symbolising the rights of Parliament and its independence from the monarch.[62] With the founding of the Royal Society in 1660, science was greatly encouraged.","order_int":23,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":131,"section_id":"england","section":{}}}
my_test/chunks/england.c-24.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-24","content":"Painting of seated male figure, with long black hair wearing a white cape and breeches.\nThe English Restoration restored the monarchy under King Charles II and peace after the English Civil War.\nIn 1666 the Great Fire of London gutted the city of London, but it was rebuilt shortly afterward with many significant buildings designed by Sir Christopher Wren.[63] By the mid-to-late 17th century, two political factions had emerged – the Tories and Whigs. Though the Tories initially supported Catholic king James II, some of them, along with the Whigs, during the Revolution of 1688 invited the Dutch Prince William of Orange to defeat James and become the king. Some English people, especially in the north, were Jacobites and continued to support James and his sons. Under the Stuart dynasty England expanded in trade, finance and prosperity. The Royal Navy developed Europe's largest merchant fleet.[64] After the parliaments of England and Scotland agreed,[65] the two countries joined in political union, to create the Kingdom of Great Britain in 1707.[57] To accommodate the union, institutions such as the law and national churches of each remained separate.[66]\n\nLate modern and contemporary periods","order_int":24,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":241,"section_id":"england","section":{}}}
my_test/chunks/england.c-25.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-25","content":"The River Thames during the Georgian period from the Terrace of Somerset House looking towards St. Paul's, c. 1750\nUnder the newly formed Kingdom of Great Britain, output from the Royal Society and other English initiatives combined with the Scottish Enlightenment to create innovations in science and engineering, while the enormous growth in British overseas trade protected by the Royal Navy paved the way for the establishment of the British Empire. Domestically it drove the Industrial Revolution, a period of profound change in the socioeconomic and cultural conditions of England, resulting in industrialised agriculture, manufacture, engineering and mining, as well as new and pioneering road, rail and water networks to facilitate their expansion and development.[67] The opening of Northwest England's Bridgewater Canal in 1761 ushered in the canal age in Britain.[68] In 1825 the world's first permanent steam locomotive-hauled passenger railway – the Stockton and Darlington Railway – opened to the public.[68]","order_int":25,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":187,"section_id":"england","section":{}}}
my_test/chunks/england.c-26.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-26","content":"During the Industrial Revolution, many workers moved from England's countryside to new and expanding urban industrial areas to work in factories, for instance at Birmingham and Manchester,[69] with the latter the world's first industrial city.[70] England maintained relative stability throughout the French Revolution, under George III and William Pitt the Younger. The regency of George IV is noted for its elegance and achievements in the fine arts and architecture.[71] During the Napoleonic Wars, Napoleon planned to invade from the south-east; however, this failed to manifest and the Napoleonic forces were defeated by the British: at sea by Horatio Nelson, and on land by Arthur Wellesley. The major victory at the Battle of Trafalgar confirmed the naval supremacy Britain had established during the course of the eighteenth century.[72] The Napoleonic Wars fostered a concept of Britishness and a united national British people, shared with the English, Scots and Welsh.[73]","order_int":26,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":191,"section_id":"england","section":{}}}
my_test/chunks/england.c-27.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-27","content":"multi-storey square industrial buildings beyond a river\nThe Battle of Trafalgar was a naval engagement between the Royal Navy and the combined fleets of France and Spain during the Napoleonic Wars.[74]\nLondon became the largest and most populous metropolitan area in the world during the Victorian era, and trade within the British Empire – as well as the standing of the British military and navy – was prestigious.[75] Technologically, this era saw many innovations that proved key to the United Kingdom's power and prosperity.[76] Political agitation at home from radicals such as the Chartists and the suffragettes enabled legislative reform and universal suffrage.[77]","order_int":27,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":131,"section_id":"england","section":{}}}
my_test/chunks/england.c-28.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-28","content":"Power shifts in east-central Europe led to World War I; hundreds of thousands of English soldiers died fighting for the United Kingdom as part of the Allies.[e] Two decades later, in World War II, the United Kingdom was again one of the Allies. Developments in warfare technology saw many cities damaged by air-raids during the Blitz. Following the war, the British Empire experienced rapid decolonisation, and there was a speeding-up of technological innovations; automobiles became the primary means of transport and Frank Whittle's development of the jet engine led to wider air travel.[79] Residential patterns were altered in England by private motoring, and by the creation of the National Health Service in 1948, providing publicly funded health care to all permanent residents free at the point of need. Combined, these prompted the reform of local government in England in the mid-20th century.[80]","order_int":28,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":181,"section_id":"england","section":{}}}
my_test/chunks/england.c-29.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-29","content":"The Victorian era is often cited as a Golden Age. Painting done by William Powell Frith to show cultural divisions.\nSince the 20th century, there has been significant population movement to England, mostly from other parts of the British Isles, but also from the Commonwealth, particularly the Indian subcontinent.[81] Since the 1970s there has been a large move away from manufacturing and an increasing emphasis on the service industry.[82] As part of the United Kingdom, the area joined a common market initiative called the European Economic Community which became the European Union.\n\nSince the late 20th century the administration of the United Kingdom has moved towards devolved governance in Scotland, Wales and Northern Ireland.[83] England and Wales continues to exist as a jurisdiction within the United Kingdom.[84] Devolution has stimulated a greater emphasis on a more English-specific identity and patriotism.[85] There is no devolved English government, but an attempt to create a similar system on a sub-regional basis was rejected by referendum.[86]","order_int":29,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":206,"section_id":"england","section":{}}}
my_test/chunks/england.c-3.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-3","content":"England's terrain chiefly consists of low hills and plains, especially in the centre and south. Upland and mountainous terrain is mostly found in the north and west, including Dartmoor, the Lake District, the Pennines, and the Shropshire Hills. The country's capital is London, the metropolitan area of which has a population of 14.2 million as of 2021, representing the United Kingdom's largest metropolitan area. England's population of 56.3 million comprises 84% of the population of the United Kingdom,[14] largely concentrated around London, the South East, and conurbations in the Midlands, the North West, the North East, and Yorkshire, which each developed as major industrial regions during the 19th century.[15]","order_int":3,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":151,"section_id":"england","section":{}}}
my_test/chunks/england.c-30.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-30","content":"Governance\nPolitics\nMain article: Politics of England\nPhotograph of rectangular floodlight building, reflected in water. The building has multiple towers including one at each end. The tower on the right includes an illuminated clock face.\nThe Palace of Westminster, the seat of the Parliament of the United Kingdom\nEngland is part of the United Kingdom, a constitutional monarchy with a parliamentary system.[87] There has not been a government of England since 1707, when the Acts of Union 1707,[88] putting into effect the terms of the Treaty of Union, joined England and Scotland to form the Kingdom of Great Britain.[65] Before the union England was ruled by its monarch and the Parliament of England.","order_int":30,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":136,"section_id":"england","section":{}}}
my_test/chunks/england.c-31.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-31","content":"Today England is governed directly by the Parliament of the United Kingdom, although other countries of the United Kingdom have devolved governments.[89] There has been debate about how to counterbalance this in England. Originally it was planned that various regions of England would be devolved, but following the proposal's rejection by the North East in a 2004 referendum, this has not been carried out.[86] In 2024, an England-only intergovernmental body, known as the Mayoral Council for England, was established to bring together ministers from the UK Government, the Mayor of London and the leaders of combined authorities.[90]\n\nIn the House of Commons which is the lower house of the British Parliament based at the Palace of Westminster, there are 543 members of parliament (MPs) for constituencies in England, out of the 650 total.[91] England is represented by 347 MPs from the Labour Party, 116 from the Conservative Party, 65 from the Liberal Democrats, five for Reform UK and four for the Green Party of England and Wales.\n\nLaw\nMain article: English law","order_int":31,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":218,"section_id":"england","section":{}}}
my_test/chunks/england.c-32.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-32","content":"The Royal Courts of Justice\nThe English law legal system, developed over the centuries, is the basis of common law[92] legal systems used in most Commonwealth countries[93] and the United States (except Louisiana). Despite now being part of the United Kingdom, the legal system of the Courts of England and Wales continued, under the Treaty of Union, as a separate legal system from the one used in Scotland. The general essence of English law is that it is made by judges sitting in courts, applying their common sense and knowledge of legal precedent – stare decisis – to the facts before them.[94]\n\nThe court system is headed by the Senior Courts of England and Wales, consisting of the Court of Appeal, the High Court of Justice for civil cases, and the Crown Court for criminal cases.[95] The Supreme Court of the United Kingdom is the highest court for criminal and civil cases in England and Wales. It was created in 2009 after constitutional changes, taking over the judicial functions of the House of Lords.[96] A decision of the Supreme Court is binding on every other court in the hierarchy, which must follow its directions.[97]","order_int":32,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":234,"section_id":"england","section":{}}}
my_test/chunks/england.c-33.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-33","content":"The Secretary of State for Justice is the minister responsible to Parliament for the judiciary, the court system and prisons and probation in England.[98] Crime increased between 1981 and 1995 but fell by 42% in the period 1995–2006.[99] The prison population doubled over the same period, giving it one of the highest incarceration rates in Western Europe at 147 per 100,000.[100] His Majesty's Prison Service, reporting to the Ministry of Justice, manages most prisons, housing 81,309 prisoners in England and Wales as of September 2022.[101]","order_int":33,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":118,"section_id":"england","section":{}}}
my_test/chunks/england.c-34.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-34","content":"Subdivisions\nMain article: Subdivisions of England\nSee also: Regions of England, Combined authority, Counties of England, and Districts of England\nimage attributionNorthumberlandDurhamLancashireCheshireDerbs.Notts.LincolnshireLeics.Staffs.ShropshireWorks.Northants.NorfolkSuffolkEssexHerts.Beds.Bucks.Oxon.Glos.SomersetWiltshireBerkshireKentSurreyHampshireDorsetDevonCornwallHeref.Worcs.BristolEast Riding\nof YorkshireRutlandCambs.Greater\nLondonNot shown: City of LondonTyne &\nWearCumbriaNorth YorkshireSouth\nYorks.West\nYorkshireGreater\nManc.MerseysideEast\nSussexWest\nSussexIsle of\nWightWest\nMidlands\nCeremonial counties of England\nThe subdivisions of England consist of up to four levels of subnational division, controlled through a variety of types of administrative entities created for the purposes of local government.","order_int":34,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":196,"section_id":"england","section":{}}}
my_test/chunks/england.c-35.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-35","content":"Outside the London region, England's highest tier is the 48 ceremonial counties.[102] These are used primarily as a geographical frame of reference. Of these, 38 developed gradually since the Middle Ages; these were reformed to 51 in 1974 and to their current number in 1996.[103] Each has a Lord Lieutenant and High Sheriff; these posts are used to represent the British monarch locally.[102] Some counties, such as Herefordshire, are only divided further into civil parishes. The royal county of Berkshire and the metropolitan counties have different types of status to other ceremonial counties.[104]\n\nThe second tier is made up of combined authorities and the 27 county-tier shire counties. In 1974, all ceremonial counties were two-tier; and with the metropolitan county tier phased out, the 1996 reform separated the ceremonial county and the administrative county tier.\n\nEngland is also divided into local government districts.[105] The district can align to a ceremonial county, or be a district tier within a shire county, be a royal or metropolitan borough, have borough or city status, or be a unitary authority.","order_int":35,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":221,"section_id":"england","section":{}}}
my_test/chunks/england.c-36.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-36","content":"At the community level, much of England is divided into civil parishes with their own councils; in Greater London only one such parish, Queen's Park, exists as of 2014 after they were abolished in 1965 until legislation allowed their recreation in 2007.\n\nLondon\nFrom 1994 until the early 2010s England was divided for a few purposes into regions; a 1998 referendum for the London Region created the London Assembly two years later.[106] A failed 2004 North East England devolution referendum cancelled further regional assembly devolution[86] with the regional structure outside London abolished.\n\nCeremonially and administratively, the region is divided between the City of London and Greater London; these are further divided into the 32 London Boroughs and the 25 Wards of the City of London.[107]\n\nGeography\nMain article: Geography of England\nLandscape and rivers","order_int":36,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":162,"section_id":"england","section":{}}}
my_test/chunks/england.c-37.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-37","content":"The Malvern Hills located in the English counties of Worcestershire and Herefordshire. The hills have been designated by the Countryside Agency as an Area of Outstanding Natural Beauty.\nGeographically, England includes the central and southern two-thirds of the island of Great Britain, plus such offshore islands as the Isle of Wight and the Isles of Scilly. It is bordered by two other countries of the United Kingdom: to the north by Scotland and to the west by Wales.\n\nEngland is closer than any other part of mainland Britain to the European continent. It is separated from France (Hauts-de-France) by a 21-mile (34 km)[108] sea gap, though the two countries are connected by the Channel Tunnel near Folkestone.[109] England also has shores on the Irish Sea, North Sea and Atlantic Ocean.","order_int":37,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":165,"section_id":"england","section":{}}}
my_test/chunks/england.c-38.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-38","content":"The ports of London, Liverpool, and Newcastle lie on the tidal rivers Thames, Mersey and Tyne respectively. At 220 miles (350 km), the Severn is the longest river flowing through England.[110] It empties into the Bristol Channel and is notable for its Severn Bore (a tidal bore), which can reach 2 metres (6.6 ft) in height.[111] However, the longest river entirely in England is the Thames, which is 215 miles (346 km) in length.[112] There are many lakes in England; the largest is Windermere, within the aptly named Lake District.[113]","order_int":38,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":133,"section_id":"england","section":{}}}
my_test/chunks/england.c-39.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-39","content":"Most of England's landscape consists of low hills and plains, with upland and mountainous terrain in the north and west of the country. The northern uplands include the Pennines, a chain of uplands dividing east and west, the Lake District mountains in Cumbria, and the Cheviot Hills, straddling the border between England and Scotland. The highest point in England, at 978 metres (3,209 ft), is Scafell Pike in the Lake District.[113] The Shropshire Hills are near Wales while Dartmoor and Exmoor are two upland areas in the south-west of the country. The approximate dividing line between terrain types is often indicated by the Tees–Exe line.[114]","order_int":39,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":146,"section_id":"england","section":{}}}
my_test/chunks/england.c-4.json ADDED
@@ -0,0 +1 @@
+ {"id":"england.c-4","content":"Toponymy\nSee also: Toponymy of England\nThe name \"England\" is derived from the Old English name Englaland, which means \"land of the Angles\".[16] The Angles were one of the Germanic tribes that settled in Great Britain during the Early Middle Ages. They came from the Angeln region of what is now the German state of Schleswig-Holstein.[17] The earliest recorded use of the term, as \"Engla londe\", is in the late-ninth-century translation into Old English of Bede's Ecclesiastical History of the English People. The term was then used to mean \"the land inhabited by the English\", and it included English people in what is now south-east Scotland but was then part of the English kingdom of Northumbria. The Anglo-Saxon Chronicle recorded that the Domesday Book of 1086 covered the whole of England, meaning the English kingdom, but a few years later the Chronicle stated that King Malcolm III went \"out of Scotlande into Lothian in Englaland\", thus using it in the more ancient sense.[18]","order_int":4,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":224,"section_id":"england","section":{}}}
my_test/chunks/england.c-40.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-40","content":"The village of Glenridding and Ullswater in Cumbria.\nThe Pennines, known as the \"backbone of England\", are the oldest range of mountains in the country, originating from the end of the Paleozoic Era around 300 million years ago.[115] Their geological composition includes, among others, sandstone and limestone, and also coal. There are karst landscapes in calcite areas such as parts of Yorkshire and Derbyshire. The Pennine landscape is high moorland in upland areas, indented by fertile valleys of the region's rivers. They contain two national parks, the Yorkshire Dales and the Peak District. In the West Country, Dartmoor and Exmoor of the Southwest Peninsula include upland moorland supported by granite.[116]\n\nThe English Lowlands are in the central and southern regions of the country, consisting of green rolling hills, including the Cotswold Hills, Chiltern Hills, North and South Downs; where they meet the sea they form white rock exposures such as the cliffs of Dover. This also includes relatively flat plains such as the Salisbury Plain, Somerset Levels, South Coast Plain and The Fens.","order_int":40,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":230,"section_id":"england","section":{}}}
my_test/chunks/england.c-41.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-41","content":"Climate\nMain article: Climate of England\nEngland has a temperate maritime climate: it is mild with temperatures not much lower than 0 °C (32 °F) in winter and not much higher than 32 °C (90 °F) in summer.[117] The weather is damp relatively frequently and is changeable. The coldest months are January and February, the latter particularly on the English coast, while July is normally the warmest month. Months with mild to warm weather are May, June, September and October.[117] Rainfall is spread fairly evenly throughout the year.\n\nImportant influences on the climate of England are its proximity to the Atlantic Ocean, its northern latitude and the warming of the sea by the Gulf Stream.[117] Rainfall is higher in the west, and parts of the Lake District receive more rain than anywhere else in the country.[117] Since weather records began, the highest temperature recorded was 40.3 °C (104.5 °F) on 19 July 2022 at Coningsby, Lincolnshire,[118] while the lowest was −26.1 °C (−15.0 °F) on 10 January 1982 in Edgmond, Shropshire.[119]\n\nNature and wildlife\nMain article: Fauna of England","order_int":41,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":244,"section_id":"england","section":{}}}
my_test/chunks/england.c-42.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-42","content":"The Eurasian wren, the most numerous bird species in England[120]","order_int":42,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":14,"section_id":"england","section":{}}}
my_test/chunks/england.c-43.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-43","content":"The fauna of England is similar to that of other areas in the British Isles with a wide range of vertebrate and invertebrate life in a diverse range of habitats.[121] National nature reserves in England are designated by Natural England as key places for wildlife and natural features in England. They were established to protect the most significant areas of habitat and of geological formations. NNRs are managed on behalf of the nation, many by Natural England themselves, but also by non-governmental organisations, including the members of The Wildlife Trusts partnership, the National Trust, and the Royal Society for the Protection of Birds. There are 221 NNRs in England covering 110,000 hectares (1,100 square kilometres). Often they contain rare species or nationally important populations of plants and animals.[122] . The Environment Agency is a non-departmental public body, established in 1995 and sponsored by the Department for Environment, Food and Rural Affairs with responsibilities relating to the protection and enhancement of the environment in England.[123] The Secretary of State for Environment, Food and Rural Affairs is the minister responsible for environmental protection, agriculture, fisheries and rural communities in England.[124]","order_int":43,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":237,"section_id":"england","section":{}}}
my_test/chunks/england.c-44.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-44","content":"Red deer in Richmond Park. The park was created by Charles I in the 17th century as a deer park.[125]\nEngland has a temperate oceanic climate in most areas, lacking extremes of cold or heat, but does have a few small areas of subarctic and warmer areas in the South West. Towards the North of England the climate becomes colder and most of England's mountains and high hills are located here and have a major impact on the climate and thus the local fauna of the areas. Deciduous woodlands are common across all of England and provide a great habitat for much of England's wildlife, but these give way in northern and upland areas of England to coniferous forests (mainly plantations) which also benefit certain forms of wildlife. Some species have adapted to the expanded urban environment, particularly the red fox, which is the most successful urban mammal after the brown rat, and other animals such as common wood pigeon, both of which thrive in urban and suburban areas.[126]","order_int":44,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":197,"section_id":"england","section":{}}}
my_test/chunks/england.c-5.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-5","content":"The earliest attested reference to the Angles occurs in the 1st-century work by Tacitus, Germania, in which the Latin word Anglii is used.[19] The etymology of the tribal name itself is disputed by scholars; it has been suggested that it derives from the shape of the Angeln peninsula, an angular shape.[20] How and why a term derived from the name of this tribe, rather than others such as the Saxons, came to be used for the entire country is not known, but it seems this is related to the custom of calling the Germanic people in Britain Angli Saxones or English Saxons to distinguish them from continental Saxons (Eald-Seaxe) of Old Saxony in Germany.[21] In Scottish Gaelic, the Saxon tribe gave their name to the word for England (Sasunn);[22] similarly, the Welsh name for the English language is Saesneg. A romantic name for England is Loegria, related to the Welsh word for England, Lloegr, and made popular by its use in Arthurian legend. Albion is also applied to England in a more poetic capacity,[23] though its original meaning is the island of Britain as a whole.","order_int":5,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":250,"section_id":"england","section":{}}}
my_test/chunks/england.c-6.json ADDED
@@ -0,0 +1 @@
 
 
+ {"id":"england.c-6","content":"History\nMain article: History of England\nFor a chronological guide, see Timeline of English history.\nPrehistory\nMain article: Prehistoric Britain\nSun shining through row of upright standing stones with other stones horizontally on the top.\nStonehenge, a Neolithic monument\nThe earliest known evidence of human presence in the area now known as England was that of Homo antecessor, dating to about 780,000 years ago. The oldest proto-human bones discovered in England date from 500,000 years ago.[24] Modern humans are known to have inhabited the area during the Upper Paleolithic period, though permanent settlements were only established within the last 6,000 years.[25] After the last ice age only large mammals such as mammoths, bison and woolly rhinoceros remained. Roughly 11,000 years ago, when the ice sheets began to recede, humans repopulated the area; genetic research suggests they came from the northern part of the Iberian Peninsula.[26] The sea level was lower than the present day and Britain was connected by land bridge to Ireland and Eurasia.[27] As the seas rose, it was separated from Ireland 10,000 years ago and from Eurasia two millennia later.","order_int":6,"metadata":{"splitter_name":"RecursiveCharacterTextSplitter","length":245,"section_id":"england","section":{}}}