software-si committed on
Commit
f8951e3
·
verified ·
1 Parent(s): e94337e

Update README.md

Files changed (1)
  1. README.md +65 -207
README.md CHANGED
@@ -10,29 +10,77 @@ base_model: dbmdz/bert-base-italian-uncased
10
  pipeline_tag: text-classification
11
  library_name: sentence-transformers
12
  ---
 
13
 
14
- # CrossEncoder based on dbmdz/bert-base-italian-uncased
15
 
16
- This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) on the json dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text pair classification.
17
 
18
- ## Model Details
19
 
20
- ### Model Description
21
- - **Model Type:** Cross Encoder
22
- - **Base model:** [dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) <!-- at revision 55058d75cf3bc75a67a412584491b774cb99d68a -->
23
- - **Maximum Sequence Length:** 512 tokens
24
- - **Number of Output Labels:** 3 labels
25
- - **Training Dataset:**
26
- - json
27
- <!-- - **Language:** Unknown -->
28
- <!-- - **License:** Unknown -->
29
 
30
- ### Model Sources
31
 
32
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
33
- - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
34
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
35
- - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
36
 
37
  ## Usage
38
 
@@ -63,42 +111,6 @@ print(scores.shape)
63
  # (5, 3)
64
  ```
65
 
66
- <!--
67
- ### Direct Usage (Transformers)
68
-
69
- <details><summary>Click to see the direct usage in Transformers</summary>
70
-
71
- </details>
72
- -->
73
-
74
- <!--
75
- ### Downstream Usage (Sentence Transformers)
76
-
77
- You can finetune this model on your own dataset.
78
-
79
- <details><summary>Click to expand</summary>
80
-
81
- </details>
82
- -->
83
-
84
- <!--
85
- ### Out-of-Scope Use
86
-
87
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
88
- -->
89
-
90
- <!--
91
- ## Bias, Risks and Limitations
92
-
93
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
94
- -->
95
-
96
- <!--
97
- ### Recommendations
98
-
99
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
100
- -->
101
-
102
  ## Training Details
103
 
104
  ### Training Dataset
@@ -141,142 +153,6 @@ You can finetune this model on your own dataset.
141
  | <code>modulo cucina misure 70 cm di profondità, forno alimentato elettricamente, dispone di 6 fuochi,</code> | <code>la cucina ha un forno</code> | <code>1</code> |
142
  * Loss: [<code>CrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#crossentropyloss)
143
 
144
- ### Training Hyperparameters
145
- #### Non-Default Hyperparameters
146
-
147
- - `eval_strategy`: steps
148
- - `per_device_train_batch_size`: 32
149
- - `per_device_eval_batch_size`: 32
150
- - `learning_rate`: 1e-05
151
- - `num_train_epochs`: 1
152
- - `warmup_steps`: 47424
153
- - `bf16`: True
154
- - `load_best_model_at_end`: True
155
-
156
- #### All Hyperparameters
157
- <details><summary>Click to expand</summary>
158
-
159
- - `overwrite_output_dir`: False
160
- - `do_predict`: False
161
- - `eval_strategy`: steps
162
- - `prediction_loss_only`: True
163
- - `per_device_train_batch_size`: 32
164
- - `per_device_eval_batch_size`: 32
165
- - `per_gpu_train_batch_size`: None
166
- - `per_gpu_eval_batch_size`: None
167
- - `gradient_accumulation_steps`: 1
168
- - `eval_accumulation_steps`: None
169
- - `torch_empty_cache_steps`: None
170
- - `learning_rate`: 1e-05
171
- - `weight_decay`: 0.0
172
- - `adam_beta1`: 0.9
173
- - `adam_beta2`: 0.999
174
- - `adam_epsilon`: 1e-08
175
- - `max_grad_norm`: 1.0
176
- - `num_train_epochs`: 1
177
- - `max_steps`: -1
178
- - `lr_scheduler_type`: linear
179
- - `lr_scheduler_kwargs`: {}
180
- - `warmup_ratio`: 0.0
181
- - `warmup_steps`: 47424
182
- - `log_level`: passive
183
- - `log_level_replica`: warning
184
- - `log_on_each_node`: True
185
- - `logging_nan_inf_filter`: True
186
- - `save_safetensors`: True
187
- - `save_on_each_node`: False
188
- - `save_only_model`: False
189
- - `restore_callback_states_from_checkpoint`: False
190
- - `no_cuda`: False
191
- - `use_cpu`: False
192
- - `use_mps_device`: False
193
- - `seed`: 42
194
- - `data_seed`: None
195
- - `jit_mode_eval`: False
196
- - `bf16`: True
197
- - `fp16`: False
198
- - `fp16_opt_level`: O1
199
- - `half_precision_backend`: auto
200
- - `bf16_full_eval`: False
201
- - `fp16_full_eval`: False
202
- - `tf32`: None
203
- - `local_rank`: 0
204
- - `ddp_backend`: None
205
- - `tpu_num_cores`: None
206
- - `tpu_metrics_debug`: False
207
- - `debug`: []
208
- - `dataloader_drop_last`: False
209
- - `dataloader_num_workers`: 0
210
- - `dataloader_prefetch_factor`: None
211
- - `past_index`: -1
212
- - `disable_tqdm`: False
213
- - `remove_unused_columns`: True
214
- - `label_names`: None
215
- - `load_best_model_at_end`: True
216
- - `ignore_data_skip`: False
217
- - `fsdp`: []
218
- - `fsdp_min_num_params`: 0
219
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
220
- - `fsdp_transformer_layer_cls_to_wrap`: None
221
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
222
- - `parallelism_config`: None
223
- - `deepspeed`: None
224
- - `label_smoothing_factor`: 0.0
225
- - `optim`: adamw_torch_fused
226
- - `optim_args`: None
227
- - `adafactor`: False
228
- - `group_by_length`: False
229
- - `length_column_name`: length
230
- - `project`: huggingface
231
- - `trackio_space_id`: trackio
232
- - `ddp_find_unused_parameters`: None
233
- - `ddp_bucket_cap_mb`: None
234
- - `ddp_broadcast_buffers`: False
235
- - `dataloader_pin_memory`: True
236
- - `dataloader_persistent_workers`: False
237
- - `skip_memory_metrics`: True
238
- - `use_legacy_prediction_loop`: False
239
- - `push_to_hub`: False
240
- - `resume_from_checkpoint`: None
241
- - `hub_model_id`: None
242
- - `hub_strategy`: every_save
243
- - `hub_private_repo`: None
244
- - `hub_always_push`: False
245
- - `hub_revision`: None
246
- - `gradient_checkpointing`: False
247
- - `gradient_checkpointing_kwargs`: None
248
- - `include_inputs_for_metrics`: False
249
- - `include_for_metrics`: []
250
- - `eval_do_concat_batches`: True
251
- - `fp16_backend`: auto
252
- - `push_to_hub_model_id`: None
253
- - `push_to_hub_organization`: None
254
- - `mp_parameters`:
255
- - `auto_find_batch_size`: False
256
- - `full_determinism`: False
257
- - `torchdynamo`: None
258
- - `ray_scope`: last
259
- - `ddp_timeout`: 1800
260
- - `torch_compile`: False
261
- - `torch_compile_backend`: None
262
- - `torch_compile_mode`: None
263
- - `include_tokens_per_second`: False
264
- - `include_num_input_tokens_seen`: no
265
- - `neftune_noise_alpha`: None
266
- - `optim_target_modules`: None
267
- - `batch_eval_metrics`: False
268
- - `eval_on_start`: False
269
- - `use_liger_kernel`: False
270
- - `liger_kernel_config`: None
271
- - `eval_use_gather_object`: False
272
- - `average_tokens_across_devices`: True
273
- - `prompts`: None
274
- - `batch_sampler`: batch_sampler
275
- - `multi_dataset_batch_sampler`: proportional
276
- - `router_mapping`: {}
277
- - `learning_rate_mapping`: {}
278
-
279
- </details>
280
 
281
  ### Training Logs
282
  | Epoch | Step | Training Loss | Validation Loss |
@@ -315,21 +191,3 @@ You can finetune this model on your own dataset.
315
  url = "https://arxiv.org/abs/1908.10084",
316
  }
317
  ```
318
-
319
- <!--
320
- ## Glossary
321
-
322
- *Clearly define terms in order to be accessible across audiences.*
323
- -->
324
-
325
- <!--
326
- ## Model Card Authors
327
-
328
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
329
- -->
330
-
331
- <!--
332
- ## Model Card Contact
333
-
334
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
335
- -->
 
10
  pipeline_tag: text-classification
11
  library_name: sentence-transformers
12
  ---
13
+ # 🍳 Horeca Industrial Cooking Ranges: Specialized NLI Model (Italian)
14
 
15
+ ## 📌 Overview
16
 
17
+ This NLI (Natural Language Inference) model is **the first open-source model specialized exclusively in the semantic analysis of technical data sheets for INDUSTRIAL COOKING RANGES and COOKTOPS**.
18
 
19
+ It is not a general-purpose model.
20
 
21
+ The model can determine whether a feature **is present (entailment), absent (contradiction), or not mentioned (neutral)** in a product's technical description.
22
 
23
+ ---
24
+
25
+ ## 🎯 Model Goal
26
+
27
+ Enable search, Q&A, or RAG systems to:
28
+
29
+ - Genuinely understand product sheets for professional cooking ranges
30
+ - Verify whether a given feature is present
31
+ - Extract information through NLI-based reasoning
32
+
33
+ ---
34
+
35
+ ## ✅ Features Analyzed
36
+
37
+ The model was trained to recognize and validate the following features:
38
+
39
+ - Number of cooking zones (e.g., 4 burners, 6 burners, 4 hotplates)
40
+ - Cooking type (gas, electric, induction)
41
+ - Configuration (on oven, countertop, top)
42
+ - Dimensions (width, depth)
43
+ - Structure / accessories (oven compartment, open compartment, closed cabinet, backsplash, etc.)
44
+
45
+ ---
46
+
47
+ ## 💡 Why Is It Unique?
48
+
49
+ | Model | Specialization | Language | Real-world applicability |
50
+ |--------|------------------|--------|--------------------|
51
+ | GPT / Llama / general-purpose models | ❌ No | 🌐 Multilingual | ❓ Limited |
52
+ | Classic NLI models | ❌ No | ❓ Often EN | ❌ Do not understand the technical domain |
53
+ | **THIS MODEL** | ✅ ONLY cooking ranges and cooktops | ✅ Italian | ✅ Designed for the Ho.Re.Ca industry |
54
+
55
+ ---
56
+
57
+ ## 🔧 Architecture
58
+
59
+ - **Base model:** `dbmdz/bert-base-italian-uncased`
60
+ - **Task:** Natural Language Inference (3 classes: entailment, contradiction, neutral)
61
+ - **Language:** Italian
62
+ - **Domain:** Technical data sheets for industrial cooking ranges
63
+
64
+ ---
65
+
66
+ ## 🧾 Input / Output Format
67
+
68
+ This is an NLI model: it takes a **premise + hypothesis** pair as input.
69
+
70
+ **Example:**
71
+
72
+ - Premise:
73
+ `Cucina a gas 4 fuochi con vano forno statico` (4-burner gas range with a static oven compartment)
74
 
75
+ - Hypothesis:
76
+ `la cucina ha un forno` (the range has an oven)
77
+
78
+ - Output:
79
+ `entailment`
80
+
81
+ ---
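The premise + hypothesis format described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the model card: the helper name `build_pairs` and the second hypothesis are assumptions introduced here. The resulting list of `(premise, hypothesis)` tuples is the shape that `CrossEncoder.predict` in sentence-transformers accepts.

```python
def build_pairs(premise, hypotheses):
    """Pair one product description (premise) with each claim to verify (hypothesis)."""
    return [(premise, hypothesis) for hypothesis in hypotheses]


# Premise and first hypothesis come from the example above.
premise = "Cucina a gas 4 fuochi con vano forno statico"
hypotheses = [
    "la cucina ha un forno",
    "la cucina ha 6 fuochi",  # hypothetical extra claim, added for illustration
]

pairs = build_pairs(premise, hypotheses)
for p, h in pairs:
    print(f"premise={p!r} | hypothesis={h!r}")
```

Checking several features of one product then amounts to one premise paired with one hypothesis per feature, scored in a single batch.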
82
+
83
+ ## 🧪 Usage Example (Python)
84
 
85
  ## Usage
86
 
 
111
  # (5, 3)
112
  ```
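The usage snippet ends with a score array of shape `(5, 3)`: one row per pair, one column per class. A minimal sketch of turning such rows into label names follows. The label order is an assumption: the training sample in this card suggests that `1` means entailment, but the indices for contradiction and neutral are guesses and should be checked against the model's configuration before use. The scores in the example are made up.

```python
# ASSUMED label order: only "1 = entailment" is suggested by the training
# sample in this card; indices 0 and 2 are guesses to be verified.
ASSUMED_LABELS = {0: "contradiction", 1: "entailment", 2: "neutral"}


def scores_to_labels(scores, id2label=ASSUMED_LABELS):
    """Map each row of per-class scores to the label with the highest score."""
    labels = []
    for row in scores:
        row = list(row)
        best = max(range(len(row)), key=lambda i: row[i])  # argmax over classes
        labels.append(id2label[best])
    return labels


# Made-up scores for two premise/hypothesis pairs (not real model output).
example_scores = [[-1.2, 3.4, 0.1], [2.0, -0.5, 0.3]]
print(scores_to_labels(example_scores))
```

The function accepts any iterable of rows, so a NumPy array returned by a predict call can be passed directly.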
113
 
114
  ## Training Details
115
 
116
  ### Training Dataset
 
153
  | <code>modulo cucina misure 70 cm di profondità, forno alimentato elettricamente, dispone di 6 fuochi,</code> | <code>la cucina ha un forno</code> | <code>1</code> |
154
  * Loss: [<code>CrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#crossentropyloss)
155
 
156
 
157
  ### Training Logs
158
  | Epoch | Step | Training Loss | Validation Loss |
 
191
  url = "https://arxiv.org/abs/1908.10084",
192
  }
193
  ```