kemonito233 committed on
Commit a6517b1 · verified · 1 Parent(s): cf4a703

Upload folder using huggingface_hub

README.md CHANGED
@@ -5,11 +5,11 @@ tags:
  - text-classification
  - generated_from_setfit_trainer
  widget:
- - text: si, ¿que necesito saber?
- - text: salio a la tienda, no tarda
- - text: se nos murio
- - text: cuentame mas
- - text: por el momento no, muchas gracias
+ - text: soy quien busca
+ - text: adios, buenas tardes
+ - text: no se encuentra
+ - text: yo le puedo pasar el mensaje
+ - text: quizas funcione
  metrics:
  - accuracy
  pipeline_tag: text-classification
@@ -28,7 +28,7 @@ model-index:
  split: test
  metrics:
  - type: accuracy
- value: 0.9523809523809523
+ value: 0.9111111111111111
  name: Accuracy
  ---
 
@@ -48,7 +48,7 @@ The model has been trained using an efficient few-shot learning technique that i
  - **Sentence Transformer body:** [hiiamsid/sentence_similarity_spanish_es](https://huggingface.co/hiiamsid/sentence_similarity_spanish_es)
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
  - **Maximum Sequence Length:** 512 tokens
- - **Number of Classes:** 13 classes
+ - **Number of Classes:** 18 classes
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
  <!-- - **Language:** Unknown -->
  <!-- - **License:** Unknown -->
@@ -60,28 +60,33 @@ The model has been trained using an efficient few-shot learning technique that i
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
  ### Model Labels
- | Label | Examples |
- |:------|:--------------------------------------------------------------------------------------------------------------------------|
- | 9 | <ul><li>'no, ahorita no'</li><li>'no autorizo que me llamen'</li><li>'no, ahorita no, oiga'</li></ul> |
- | 6 | <ul><li>'ya dejen de estar chingando'</li><li>'callate'</li><li>'voy a reportar este numero a la profeco'</li></ul> |
- | 11 | <ul><li>'dame 5 minutos y te regreso la llamada'</li><li>'estoy comiendo, provecho'</li><li>'estoy trabajando'</li></ul> |
- | 7 | <ul><li>'si, me puede explicar'</li><li>'cuenteme, por favor'</li><li>'expliqueme un poco mas'</li></ul> |
- | 1 | <ul><li>'¿quien busca?'</li><li>'¿a quien le llamo?'</li><li>'¿con quien quiere hablar usted?'</li></ul> |
- | 10 | <ul><li>'no se encuentra'</li><li>'no se encuentra, salio a trabajar'</li><li>'no esta, ¿gusta dejarle recado?'</li></ul> |
- | 5 | <ul><li>'murio hace tiempo'</li><li>'ya fallecio'</li><li>'fallecio ayer'</li></ul> |
- | 12 | <ul><li>'¿para que es?'</li><li>'¿de donde llaman?'</li><li>'¿que empresa es?'</li></ul> |
- | 3 | <ul><li>'numero equivocado'</li><li>'no vive aqui'</li><li>'esta equivocada senorita'</li></ul> |
- | 2 | <ul><li>'hasta luego'</li><li>'gracias, bye'</li><li>'adios'</li></ul> |
- | 4 | <ul><li>'deja te comunico con el'</li><li>'permiteme un segundo, no me cuelgues'</li><li>'aguantame tantito'</li></ul> |
- | 8 | <ul><li>'¿como?'</li><li>'mande'</li><li>'hable mas fuerte que no le oigo'</li></ul> |
- | 0 | <ul><li>'si, el habla'</li><li>'servidor'</li><li>'con el'</li></ul> |
+ | Label | Examples |
+ |:------|:-------------------------------------------------------------------------------------------------------------------------------------|
+ | 14 | <ul><li>'tengo otro prestamo activo'</li><li>'mi historial esta mal'</li><li>'tengo credito con otra financiera'</li></ul> |
+ | 11 | <ul><li>'hable mas fuerte'</li><li>'se oye muy lejos'</li><li>'se corta'</li></ul> |
+ | 15 | <ul><li>'ahorita voy manejando, hablame luego'</li><li>'ahorita no puedo atenderte, estoy ocupado'</li><li>'voy manejando'</li></ul> |
+ | 7 | <ul><li>'ya fallecio'</li><li>'ya no esta con nosotros'</li><li>'el ya no vive'</li></ul> |
+ | 4 | <ul><li>'adios, buenas noches'</li><li>'bueno, gracias, adios'</li><li>'listo, hasta luego'</li></ul> |
+ | 10 | <ul><li>'si, quiero saber'</li><li>'si, digame rapido'</li><li>'te escucho'</li></ul> |
+ | 12 | <ul><li>'no, joven, muchas gracias'</li><li>'no, oiga, gracias'</li><li>'no, por ahora paso, gracias'</li></ul> |
+ | 17 | <ul><li>'bueno, diga'</li><li>'si'</li><li>'si, diga'</li></ul> |
+ | 3 | <ul><li>'si, a ver de que se trata'</li><li>'tal vez si'</li><li>'esta bien, envialo'</li></ul> |
+ | 5 | <ul><li>'no corresponde ese numero'</li><li>'esta llamando al numero equivocado'</li><li>'aqui no vive esa persona'</li></ul> |
+ | 8 | <ul><li>'¿me da la direccion de sus oficinas?'</li><li>'yo no les di mi telefono'</li><li>'yo no le di mis datos a nadie'</li></ul> |
+ | 0 | <ul><li>'soy su hermana'</li><li>'esta bajo tratamiento'</li><li>'se siente mal'</li></ul> |
+ | 16 | <ul><li>'¿quien me llama?'</li><li>'¿de que empresa llaman?'</li><li>'¿quien es?'</li></ul> |
+ | 1 | <ul><li>'habla el senor'</li><li>'con ella habla'</li><li>'si aqui habla'</li></ul> |
+ | 6 | <ul><li>'un momento por favor'</li><li>'deja le hablo'</li><li>'permiteme un segundo, no me cuelgues'</li></ul> |
+ | 2 | <ul><li>'¿con quien quiere hablar?'</li><li>'¿quien busca?'</li><li>'¿a quien esta buscando?'</li></ul> |
+ | 9 | <ul><li>'no esten chingando'</li><li>'es la quinta vez que me marcan hoy'</li><li>'¡que no entiendes que no!'</li></ul> |
+ | 13 | <ul><li>'salio a la tienda, no tarda'</li><li>'ahorita no esta, anda de viaje'</li><li>'anda trabajando'</li></ul> |
 
  ## Evaluation
 
  ### Metrics
  | Label | Accuracy |
  |:--------|:---------|
- | **all** | 0.9524 |
+ | **all** | 0.9111 |
 
  ## Uses
 
@@ -101,7 +106,7 @@ from setfit import SetFitModel
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("setfit_model_id")
  # Run inference
- preds = model("se nos murio")
+ preds = model("soy quien busca")
  ```
 
  <!--
@@ -133,23 +138,28 @@ preds = model("se nos murio")
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:-------|:----|
- | Word count | 1 | 3.7514 | 11 |
+ | Word count | 1 | 3.9018 | 11 |
 
  | Label | Training Sample Count |
  |:------|:----------------------|
- | 0 | 7 |
- | 1 | 11 |
- | 2 | 5 |
- | 3 | 6 |
- | 4 | 9 |
- | 5 | 12 |
- | 6 | 11 |
- | 7 | 33 |
- | 8 | 13 |
- | 9 | 48 |
- | 10 | 8 |
+ | 0 | 32 |
+ | 1 | 18 |
+ | 2 | 11 |
+ | 3 | 18 |
+ | 4 | 18 |
+ | 5 | 22 |
+ | 6 | 9 |
+ | 7 | 12 |
+ | 8 | 40 |
+ | 9 | 11 |
+ | 10 | 33 |
  | 11 | 13 |
- | 12 | 5 |
+ | 12 | 48 |
+ | 13 | 8 |
+ | 14 | 36 |
+ | 15 | 13 |
+ | 16 | 18 |
+ | 17 | 37 |
 
  ### Training Hyperparameters
  - batch_size: (16, 16)
@@ -174,26 +184,36 @@ preds = model("se nos murio")
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:------:|:----:|:-------------:|:---------------:|
- | 0.0022 | 1 | 0.25 | - |
- | 0.1104 | 50 | 0.1543 | - |
- | 0.2208 | 100 | 0.0482 | - |
- | 0.3311 | 150 | 0.03 | - |
- | 0.4415 | 200 | 0.0137 | - |
- | 0.5519 | 250 | 0.0122 | - |
- | 0.6623 | 300 | 0.0057 | - |
- | 0.7726 | 350 | 0.0036 | - |
- | 0.8830 | 400 | 0.0031 | - |
- | 0.9934 | 450 | 0.005 | - |
- | 1.0 | 453 | - | 0.0190 |
+ | 0.0010 | 1 | 0.3888 | - |
+ | 0.0504 | 50 | 0.211 | - |
+ | 0.1007 | 100 | 0.1344 | - |
+ | 0.1511 | 150 | 0.0742 | - |
+ | 0.2014 | 200 | 0.0484 | - |
+ | 0.2518 | 250 | 0.0387 | - |
+ | 0.3021 | 300 | 0.0264 | - |
+ | 0.3525 | 350 | 0.0183 | - |
+ | 0.4028 | 400 | 0.0135 | - |
+ | 0.4532 | 450 | 0.0115 | - |
+ | 0.5035 | 500 | 0.0082 | - |
+ | 0.5539 | 550 | 0.0083 | - |
+ | 0.6042 | 600 | 0.0073 | - |
+ | 0.6546 | 650 | 0.009 | - |
+ | 0.7049 | 700 | 0.0067 | - |
+ | 0.7553 | 750 | 0.0075 | - |
+ | 0.8056 | 800 | 0.0085 | - |
+ | 0.8560 | 850 | 0.0073 | - |
+ | 0.9063 | 900 | 0.0065 | - |
+ | 0.9567 | 950 | 0.0076 | - |
+ | 1.0 | 993 | - | 0.0437 |
 
  ### Framework Versions
  - Python: 3.12.12
  - SetFit: 1.1.3
  - Sentence Transformers: 5.2.2
- - Transformers: 4.57.6
+ - Transformers: 4.44.2
  - PyTorch: 2.9.0+cu126
  - Datasets: 4.0.0
- - Tokenizers: 0.22.2
+ - Tokenizers: 0.19.1
 
  ## Citation
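Reviewer note: the figures changed in this README diff can be cross-checked with a little arithmetic. The sketch below is illustrative only. It copies the accuracy values, per-label sample counts, and final step counts from the diff, and every variable and helper name in it is hypothetical. It shows that both accuracy values reduce to simple fractions (consistent with, though not proof of, small test sets), totals the per-label training samples, and checks that the Epoch column matches the step divided by the final step.

```python
from fractions import Fraction

# Values copied from the README diff; all names are illustrative only.
old_accuracy = 0.9523809523809523
new_accuracy = 0.9111111111111111

old_counts = [7, 11, 5, 6, 9, 12, 11, 33, 13, 48, 8, 13, 5]
new_counts = [32, 18, 11, 18, 18, 22, 9, 12, 40, 11, 33, 13, 48, 8, 36, 13, 18, 37]


def simplest_fraction(value: float, max_denominator: int = 1000) -> Fraction:
    """Return the closest fraction whose denominator is at most max_denominator."""
    return Fraction(value).limit_denominator(max_denominator)


def epoch_at(step: int, total_steps: int) -> float:
    """Fraction of the single training epoch completed at a given step."""
    return round(step / total_steps, 4)


print(simplest_fraction(old_accuracy))       # 20/21
print(simplest_fraction(new_accuracy))       # 41/45
print(len(old_counts), sum(old_counts))      # 13 181
print(len(new_counts), sum(new_counts))      # 18 397
print(epoch_at(50, 453), epoch_at(50, 993))  # 0.1104 0.0504
```

The label counts (13 and 18) agree with the "Number of Classes" lines, and the step-50 epochs agree with the second row of each Training Results table.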
 
config.json CHANGED
@@ -1,10 +1,10 @@
  {
+ "_name_or_path": "hiiamsid/sentence_similarity_spanish_es",
  "architectures": [
  "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
- "dtype": "float32",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
@@ -19,7 +19,8 @@
  "output_past": true,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
- "transformers_version": "4.57.6",
+ "torch_dtype": "float32",
+ "transformers_version": "4.44.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 31002
config_sentence_transformers.json CHANGED
@@ -1,7 +1,7 @@
  {
  "__version__": {
  "sentence_transformers": "5.2.2",
- "transformers": "4.57.6",
+ "transformers": "4.44.2",
  "pytorch": "2.9.0+cu126"
  },
  "model_type": "SentenceTransformer",
config_setfit.json CHANGED
@@ -1,4 +1,4 @@
  {
- "normalize_embeddings": false,
- "labels": null
+ "labels": null,
+ "normalize_embeddings": false
  }
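Reviewer note: the config_setfit.json change only swaps the order of two keys. A minimal sketch, using just the two JSON bodies shown in this diff, of why a JSON parser sees both versions as the same configuration. The variable names are illustrative and not part of SetFit.

```python
import json

# The two versions of config_setfit.json, written on one line each.
old_text = '{"normalize_embeddings": false, "labels": null}'
new_text = '{"labels": null, "normalize_embeddings": false}'

old_config = json.loads(old_text)
new_config = json.loads(new_text)

# Dict equality ignores key order, so the parsed settings are identical.
same_settings = old_config == new_config
print(same_settings)  # True

# The new file's key order is simply alphabetical.
keys_sorted = list(new_config) == sorted(new_config)
print(keys_sorted)  # True
```

Any loader that reads this file through a standard JSON parser should therefore behave the same before and after the commit for these two settings.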
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:974fa8bccbf172e0d06c5f65a7b152e46bc76e0ace6385c0a35cac54d6cc98ba
+ oid sha256:69618ae50cf5d9badf705bfa0d831bafedcfe359c65440434d337da90ae5caff
  size 439425888
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f7f44c823415a45b366f3e81b81531a109ab18797bc982c971d50b960988f605
- size 80927
+ oid sha256:546fe46dd5fed7c63dfe7f1ca36775a3710f7f6b3c9ed2b93bdaa3aa461e94ad
+ size 111719
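Reviewer note: the growth of model_head.pkl is roughly what five extra classes would add to a logistic-regression head. The sketch below is a back-of-the-envelope check, not a description of the pickle's actual layout. The file sizes and class counts come from this diff, while the 768-dimensional embedding and 8-byte (float64) coefficients are assumptions that the diff does not state.

```python
# Sizes in bytes, copied from the Git LFS pointers in this diff.
old_size, new_size = 80927, 111719
old_classes, new_classes = 13, 18

# ASSUMPTIONS (not stated in the diff): embedding width and weight precision.
embedding_dim = 768      # assumed hidden size of the sentence-transformer body
bytes_per_weight = 8     # assumed float64 coefficients


def coef_bytes(n_classes: int) -> int:
    """Bytes needed for an (n_classes x embedding_dim) coefficient matrix."""
    return n_classes * embedding_dim * bytes_per_weight


actual_growth = new_size - old_size
expected_growth = coef_bytes(new_classes) - coef_bytes(old_classes)

print(actual_growth)                    # 30792
print(expected_growth)                  # 30720
print(actual_growth - expected_growth)  # 72
```

Under these assumptions the coefficient matrix accounts for all but 72 bytes of the increase; the remainder would plausibly be per-class bookkeeping such as intercepts and class labels, though that is not verified here.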
tokenizer_config.json CHANGED
@@ -41,11 +41,10 @@
  "special": true
  }
  },
- "clean_up_tokenization_spaces": false,
+ "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": false,
- "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "max_length": 512,
  "model_max_length": 512,