histlearn committed on
Commit 76f9a5f · verified · 1 Parent(s): 233b2df

feat: enable full ensemble with calibration (Platt scaling)

README.md CHANGED
@@ -20,7 +20,7 @@ note* in Portuguese, returns the probability of it being classified as "useful
  (`label_binary_strict = 1`), along with an optional reading of the contribution of
  each word.
 
- Architecture: **bge-m3 (568M params) + LoRA + linear head**, identical to
+ Architecture: **bge-m3 (568M params) + LoRA + linear head (Ensemble and Calibration)**, identical to
  `predict_from_text` from the FT-Solo notebook in faithful mode
  (fold 01).
 
@@ -195,3 +195,4 @@ and `curl` examples for the two endpoints.
 
  Based on the pipeline and the explainability notebook of the Notinhas project.
  The code here is the working prototype of the `predict_from_text` function turned into a service.
+ \n## Calibration and Ensemble\nThis Space loads multiple folds as an **ensemble**, averaging the\nprobabilities of 5 versions of the adapted model (LoRA + linear head) to increase\nthe robustness of the classification. In addition, the probabilities go through\n**Platt scaling** based on the parameters in `config.py` to improve calibration.\n
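The ensemble-and-calibration step described in the added README section could look roughly like the sketch below. It is not the Space's actual code: the function names, the choice to apply Platt scaling to the logit of the averaged probability, and the values of `a` and `b` are all assumptions made for illustration. The real parameters live in `config.py`, which this diff does not show.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float, eps: float = 1e-7) -> float:
    # Clamp away from 0 and 1 so the logarithm stays finite.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def ensemble_mean(fold_probs: list[float]) -> float:
    """Average the per-fold probabilities (one value per fold model)."""
    if not fold_probs:
        raise ValueError("need at least one fold probability")
    return sum(fold_probs) / len(fold_probs)


def platt_scale(p: float, a: float, b: float) -> float:
    """Platt scaling: a logistic function fitted on a score.

    Assumption: the score is the logit of the ensemble probability.
    """
    return sigmoid(a * logit(p) + b)


# Hypothetical outputs of the 5 fold models for one note.
fold_probs = [0.62, 0.71, 0.58, 0.66, 0.69]
p_mean = ensemble_mean(fold_probs)

# Hypothetical calibration parameters; a=1, b=0 leaves p_mean unchanged.
p_calibrated = platt_scale(p_mean, a=1.0, b=0.0)
print(round(p_mean, 4), round(p_calibrated, 4))
```

With `a > 1` the calibrated probability moves away from 0.5, and with `a < 1` it moves toward 0.5; `b` shifts the decision point.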
app.py CHANGED
@@ -270,7 +270,7 @@ INTRO_MD = """
  # Notinhas: utility endpoint (FT-Solo)
 
  Utility classifier for **community notes in Portuguese**, based on
- **bge-m3 (568M params) + LoRA + linear head** (faithful mode of FT-Solo, fold 01).
+ **bge-m3 (568M params) + LoRA + linear head** (Ensemble of 5 calibrated folds).
 
  - **Predict**: score + label + confidence band.
  - **Explain**: the same + contribution of each word via leave-one-out.
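The leave-one-out word contribution mentioned in the last bullet can be illustrated with a short sketch. This is an assumption about the general technique, not the code in `app.py`: `leave_one_out` and `toy_predict` are hypothetical names, and the toy scorer stands in for the real model.

```python
from typing import Callable


def leave_one_out(text: str, predict: Callable[[str], float]) -> list[tuple[str, float]]:
    """Score each word by how much the prediction drops when it is removed."""
    words = text.split()
    base = predict(text)
    contributions = []
    for i, word in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        # Positive value: removing the word lowers the score.
        contributions.append((word, base - predict(ablated)))
    return contributions


def toy_predict(text: str) -> float:
    # Hypothetical stand-in for the classifier: share of words equal to "fonte".
    words = text.split()
    if not words:
        return 0.0
    return sum(w == "fonte" for w in words) / len(words)


for word, delta in leave_one_out("a fonte", toy_predict):
    print(word, delta)
```

Each word costs one extra forward pass, so a note with N words needs N + 1 predictions.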
artifacts/fold_01_adapter/adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:93d21f9a247eb8ce530e04b1f85055f7e405f5d0875ef646d6914de0d2a234a5
+ oid sha256:2ed8cdd935ced165fda8970fed3e43011c6a9c9a9ab31672cbf6aaf3050301e8
  size 28482384
artifacts/fold_01_head.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:67ae73baff19fd870815c742171fe57d174bd984ccfd7f58751a37b44bbbda9c
+ oid sha256:d1f955b9a02c7a925504cdfea082b4e2ac52ea2426ff64f8a23237f0bf9b3365
  size 6093
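The weight files in this commit are stored as Git LFS pointers, so each hunk shows only a `version`, an `oid`, and a `size` line. A minimal sketch of reading such a pointer and checking a downloaded blob against it follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and only the `sha256` oid form seen in this diff is handled.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, oid (hex digest) and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(pointer: dict, blob: bytes) -> bool:
    """True when the blob has the byte size and SHA-256 digest the pointer records."""
    return (
        len(blob) == pointer["size"]
        and hashlib.sha256(blob).hexdigest() == pointer["oid"]
    )


# New pointer for artifacts/fold_01_head.pt, copied from the hunk above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d1f955b9a02c7a925504cdfea082b4e2ac52ea2426ff64f8a23237f0bf9b3365
size 6093
"""

new_head = parse_lfs_pointer(POINTER)
print(new_head["size"], new_head["oid"][:8])
```

Because the size is unchanged in both `fold_01` hunks, the `oid` is the only line that shows the weights were replaced.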
artifacts/fold_02_adapter/README.md ADDED
@@ -0,0 +1,206 @@
+ ---
+ base_model: BAAI/bge-m3
+ library_name: peft
+ tags:
+ - base_model:adapter:BAAI/bge-m3
+ - lora
+ - transformers
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+ ### Framework versions
+
+ - PEFT 0.18.1
artifacts/fold_02_adapter/adapter_config.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "alora_invocation_tokens": null,
+   "alpha_pattern": {},
+   "arrow_config": null,
+   "auto_mapping": {
+     "base_model_class": "XLMRobertaModel",
+     "parent_library": "transformers.models.xlm_roberta.modeling_xlm_roberta"
+   },
+   "base_model_name_or_path": "BAAI/bge-m3",
+   "bias": "none",
+   "corda_config": null,
+   "ensure_weight_tying": false,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.1,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "peft_version": "0.18.1",
+   "qalora_group_size": 16,
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "key",
+     "query",
+     "value",
+     "dense"
+   ],
+   "target_parameters": null,
+   "task_type": null,
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_qalora": false,
+   "use_rslora": false
+ }
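To make the `r`, `lora_alpha`, and `use_rslora` fields above concrete, the sketch below derives the LoRA scaling factor from them and applies a low-rank update to a toy layer. The matrices and helper names are illustrative assumptions (a rank-1 toy instead of the real rank 16), and this is a hand-rolled illustration of the LoRA formula rather than PEFT's own code.

```python
import json
import math

# Only the fields used here, with the values from the config above.
CONFIG_JSON = '{"r": 16, "lora_alpha": 32, "use_rslora": false}'


def lora_scaling(cfg: dict) -> float:
    """Standard LoRA scales the update by alpha / r; rank-stabilized LoRA uses alpha / sqrt(r)."""
    if cfg.get("use_rslora", False):
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


def matvec(matrix: list[list[float]], vec: list[float]) -> list[float]:
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def lora_forward(x, W, A, B, scaling):
    """h = W x + scaling * B (A x): frozen weight plus the low-rank correction."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scaling * l for b, l in zip(base, low_rank)]


cfg = json.loads(CONFIG_JSON)
scaling = lora_scaling(cfg)  # 32 / 16

# Toy 2x2 layer with a rank-1 adapter, purely for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight
A = [[1.0, 0.0]]              # down-projection, shape (r=1, in=2)
B = [[0.0], [1.0]]            # up-projection, shape (out=2, r=1)
print(scaling, lora_forward([3.0, 4.0], W, A, B, scaling))
```

In the real adapter the same update is attached to every module named in `target_modules` (`key`, `query`, `value`, `dense`).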
artifacts/fold_02_adapter/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29002717cb6a73275585551c51bc4f4ea2769461b49375cac759b827ec877123
+ size 28482384
artifacts/fold_02_head.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:39d64ffe4d8b3d9ede6342b134d6abb34f1806b2b7d7927541f8e4b4bea6bb66
+ size 6093
artifacts/fold_03_adapter/README.md ADDED
@@ -0,0 +1,206 @@
+ (206 added lines, identical to artifacts/fold_02_adapter/README.md)
artifacts/fold_03_adapter/adapter_config.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "alora_invocation_tokens": null,
+   "alpha_pattern": {},
+   "arrow_config": null,
+   "auto_mapping": {
+     "base_model_class": "XLMRobertaModel",
+     "parent_library": "transformers.models.xlm_roberta.modeling_xlm_roberta"
+   },
+   "base_model_name_or_path": "BAAI/bge-m3",
+   "bias": "none",
+   "corda_config": null,
+   "ensure_weight_tying": false,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.1,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "peft_version": "0.18.1",
+   "qalora_group_size": 16,
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "key",
+     "query",
+     "value",
+     "dense"
+   ],
+   "target_parameters": null,
+   "task_type": null,
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_qalora": false,
+   "use_rslora": false
+ }
artifacts/fold_03_adapter/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:12f3b602ac3df75dc2c55b5f5c56042d68c59079d60226aad005519681c0120a
+ size 28482384
artifacts/fold_03_head.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d84b64c684255223b70f0b8b3323e92033e5c1167ed606b475d8612bce3e9cd6
+ size 6093
artifacts/fold_04_adapter/README.md ADDED
@@ -0,0 +1,206 @@
+ (206 added lines, identical to artifacts/fold_02_adapter/README.md)
artifacts/fold_04_adapter/adapter_config.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "alora_invocation_tokens": null,
+   "alpha_pattern": {},
+   "arrow_config": null,
+   "auto_mapping": {
+     "base_model_class": "XLMRobertaModel",
+     "parent_library": "transformers.models.xlm_roberta.modeling_xlm_roberta"
+   },
+   "base_model_name_or_path": "BAAI/bge-m3",
+   "bias": "none",
+   "corda_config": null,
+   "ensure_weight_tying": false,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.1,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "peft_version": "0.18.1",
+   "qalora_group_size": 16,
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "key",
+     "query",
+     "value",
+     "dense"
+   ],
+   "target_parameters": null,
+   "task_type": null,
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_qalora": false,
+   "use_rslora": false
+ }
artifacts/fold_04_adapter/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:93d21f9a247eb8ce530e04b1f85055f7e405f5d0875ef646d6914de0d2a234a5
+ size 28482384
artifacts/fold_04_head.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:67ae73baff19fd870815c742171fe57d174bd984ccfd7f58751a37b44bbbda9c
+ size 6093
artifacts/fold_05_adapter/README.md ADDED
@@ -0,0 +1,206 @@
+ ---
+ base_model: BAAI/bge-m3
+ library_name: peft
+ tags:
+ - base_model:adapter:BAAI/bge-m3
+ - lora
+ - transformers
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
158
+
159
+ ### Model Architecture and Objective
160
+
161
+ [More Information Needed]
162
+
163
+ ### Compute Infrastructure
164
+
165
+ [More Information Needed]
166
+
167
+ #### Hardware
168
+
169
+ [More Information Needed]
170
+
171
+ #### Software
172
+
173
+ [More Information Needed]
174
+
175
+ ## Citation [optional]
176
+
177
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
178
+
179
+ **BibTeX:**
180
+
181
+ [More Information Needed]
182
+
183
+ **APA:**
184
+
185
+ [More Information Needed]
186
+
187
+ ## Glossary [optional]
188
+
189
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
190
+
191
+ [More Information Needed]
192
+
193
+ ## More Information [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Authors [optional]
198
+
199
+ [More Information Needed]
200
+
201
+ ## Model Card Contact
202
+
203
+ [More Information Needed]
204
+ ### Framework versions
205
+
206
+ - PEFT 0.18.1
artifacts/fold_05_adapter/adapter_config.json ADDED
@@ -0,0 +1,46 @@
1
+ {
2
+ "alora_invocation_tokens": null,
3
+ "alpha_pattern": {},
4
+ "arrow_config": null,
5
+ "auto_mapping": {
6
+ "base_model_class": "XLMRobertaModel",
7
+ "parent_library": "transformers.models.xlm_roberta.modeling_xlm_roberta"
8
+ },
9
+ "base_model_name_or_path": "BAAI/bge-m3",
10
+ "bias": "none",
11
+ "corda_config": null,
12
+ "ensure_weight_tying": false,
13
+ "eva_config": null,
14
+ "exclude_modules": null,
15
+ "fan_in_fan_out": false,
16
+ "inference_mode": true,
17
+ "init_lora_weights": true,
18
+ "layer_replication": null,
19
+ "layers_pattern": null,
20
+ "layers_to_transform": null,
21
+ "loftq_config": {},
22
+ "lora_alpha": 32,
23
+ "lora_bias": false,
24
+ "lora_dropout": 0.1,
25
+ "megatron_config": null,
26
+ "megatron_core": "megatron.core",
27
+ "modules_to_save": null,
28
+ "peft_type": "LORA",
29
+ "peft_version": "0.18.1",
30
+ "qalora_group_size": 16,
31
+ "r": 16,
32
+ "rank_pattern": {},
33
+ "revision": null,
34
+ "target_modules": [
35
+ "key",
36
+ "query",
37
+ "value",
38
+ "dense"
39
+ ],
40
+ "target_parameters": null,
41
+ "task_type": null,
42
+ "trainable_token_indices": null,
43
+ "use_dora": false,
44
+ "use_qalora": false,
45
+ "use_rslora": false
46
+ }
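The adapter configuration above sets `r` to 16 and `lora_alpha` to 32 with `use_rslora` disabled. Under the standard LoRA formulation, each adapted weight therefore receives a low-rank update scaled by `lora_alpha / r`. A minimal sketch with toy matrices follows; the helper names `matmul` and `lora_delta` and the tiny shapes are illustrative assumptions, not PEFT internals.

```python
# Toy illustration of the LoRA update: delta_W = (lora_alpha / r) * B @ A.
# Shapes are tiny on purpose; the adapter in the config uses r = 16.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_delta(b_mat, a_mat, lora_alpha, r):
    """Return the scaled low-rank update (lora_alpha / r) * B @ A."""
    scale = lora_alpha / r
    return [[scale * v for v in row] for row in matmul(b_mat, a_mat)]


# Values taken from adapter_config.json
LORA_ALPHA = 32
R = 16
SCALE = LORA_ALPHA / R  # scaling factor applied to B @ A

# Hypothetical rank-1 factors for a 2x2 weight, only to show the arithmetic
B = [[1.0], [0.5]]   # shape (2, 1)
A = [[0.2, -0.4]]    # shape (1, 2)
delta = lora_delta(B, A, lora_alpha=2, r=1)  # same ratio as 32 / 16
```

With `lora_dropout` at 0.1 and `inference_mode` true, dropout plays no role when the adapter is served; only the scaled update matters.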
artifacts/fold_05_adapter/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c5cedd1f3f107034209d47e93d02c3394225b521b6fc2ad8a3cca690e83fd802
3
+ size 28482384
artifacts/fold_05_head.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ee5aaf2a544bfa90a7969e342a194aef0fef716465e4b7de89922a8cac6271fe
3
+ size 6093
config.py CHANGED
@@ -1,4 +1,4 @@
1
- """Constantes compartilhadas pelo Space (bge-m3 FT-Solo)."""
2
  from __future__ import annotations
3
 
4
  import os
@@ -17,13 +17,32 @@ TASK_PROMPT = None
17
  # Paths
18
  ROOT = Path(__file__).resolve().parent
19
  ARTIFACTS_DIR = ROOT / "artifacts"
20
- ADAPTER_PATH = ARTIFACTS_DIR / "fold_01_adapter"
21
- HEAD_PATH = ARTIFACTS_DIR / "fold_01_head.pt"
22
 
23
  # Classificação
24
  THRESHOLD_UTIL = 0.5
25
  CONFIDENCE_BOUNDS_ALTA = (0.10, 0.90)
26
  CONFIDENCE_BOUNDS_MEDIA = (0.30, 0.70)
27
 
 
 
 
 
 
 
 
 
28
  # Secret opcional
29
  HF_TOKEN = os.environ.get("HF_TOKEN")
 
1
+ """Constants shared across the Space (calibrated bge-m3 ensemble)."""
2
  from __future__ import annotations
3
 
4
  import os
 
17
  # Paths
18
  ROOT = Path(__file__).resolve().parent
19
  ARTIFACTS_DIR = ROOT / "artifacts"
20
+
21
+ # Every fold available to the ensemble.
22
+ MODEL_FOLDS = [
23
+ "fold_01",
24
+ "fold_02",
25
+ "fold_03",
26
+ "fold_04",
27
+ "fold_05"
28
+ ]
29
+
30
+ # Name templates for each fold's head file and adapter directory. Adjust them if the naming pattern changes.
31
+ HEAD_FILENAME = "{fold}_head.pt"
32
+ ADAPTER_DIRNAME = "{fold}_adapter"
33
 
34
  # Classificação
35
  THRESHOLD_UTIL = 0.5
36
  CONFIDENCE_BOUNDS_ALTA = (0.10, 0.90)
37
  CONFIDENCE_BOUNDS_MEDIA = (0.30, 0.70)
38
 
39
+ # Platt scaling parameters: P_calib = 1 / (1 + exp(-(a * logit + b))); a = 1.0 and b = 0.0 leave the probabilities unchanged.
40
+ # Fit these values on a calibration set.
41
+ CALIB_A = 1.0
42
+ CALIB_B = 0.0
43
+
44
+ # Temperature scaling parameter: logits are divided by TEMPERATURE; 1.0 disables the scaling.
45
+ TEMPERATURE = 1.0
46
+
47
  # Secret opcional
48
  HF_TOKEN = os.environ.get("HF_TOKEN")
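The calibration constants above can be read as two small transformations of a fold's output. The sketch below assumes the sign convention under which `a = 1.0` and `b = 0.0` act as the identity (that is, `sigmoid(a * z + b)`), which matches the default values; the function names `platt_calibrate` and `temperature_scale` are hypothetical and only `math` from the standard library is used.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def platt_calibrate(p: float, a: float = 1.0, b: float = 0.0, eps: float = 1e-6) -> float:
    """Map a raw probability through sigmoid(a * logit(p) + b).

    Assumed convention: a = 1.0 and b = 0.0 return p unchanged.
    """
    p = min(max(p, eps), 1.0 - eps)   # keep the logit finite
    z = math.log(p / (1.0 - p))       # logit of the raw probability
    return sigmoid(a * z + b)


def temperature_scale(logit: float, temperature: float = 1.0) -> float:
    """Divide the logit by the temperature before the sigmoid."""
    return sigmoid(logit / temperature)


identity = platt_calibrate(0.8)        # defaults: probability is unchanged
softened = platt_calibrate(0.8, a=0.5) # a < 1 pulls the score toward 0.5
```

A slope below 1 (or a temperature above 1) makes the scores less extreme, which is the usual remedy for an over-confident classifier; the actual values would have to be fitted on labelled calibration data.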
inference.py CHANGED
@@ -1,13 +1,16 @@
1
- """Carregamento do modelo e inferência.
2
 
3
- Serve o FT-Solo com base BAAI/bge-m3 + LoRA do fold 01 + cabeça linear.
4
- Pooling: mean sobre tokens reais (attention_mask). Sem prompt de instrução.
 
 
 
5
  """
6
  from __future__ import annotations
7
 
8
  import logging
9
  from functools import lru_cache
10
- from typing import Iterable
11
 
12
  import numpy as np
13
  import torch
@@ -17,75 +20,59 @@ from peft import PeftModel
17
  from transformers import AutoModel, AutoTokenizer
18
 
19
  from config import (
20
- ADAPTER_PATH,
21
  BATCH_SIZE,
22
- HEAD_PATH,
 
 
 
 
23
  HF_TOKEN,
24
  MAX_LENGTH,
25
  MODEL_NAME,
 
26
  )
27
 
28
  logger = logging.getLogger(__name__)
29
 
30
  # ---------------------------------------------------------------------------
31
- # Dispositivo e dtype — lógica direta do notebook
32
  # ---------------------------------------------------------------------------
33
  DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
34
 
35
  if DEVICE == "cuda":
36
  AMP_DTYPE = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
37
  else:
38
- # Em CPU usamos float16 nos pesos para caber em RAM. As operações em CPU
39
- # rodam em fp32 via upcast automático; o dtype aqui só controla armazenamento.
40
- # O autocast fica desligado (enabled=False abaixo) — fp16 ativo em CPU é instável.
41
  AMP_DTYPE = torch.float16
42
 
43
 
44
- # ---------------------------------------------------------------------------
45
- # Utilitários
46
- # ---------------------------------------------------------------------------
47
  def build_instruction_text(text: str) -> str:
48
- """bge-m3 takes no instruction prompt; returns the raw text."""
49
  return text if isinstance(text, str) else ""
50
 
51
 
52
- def mean_pool(
53
- last_hidden_states: torch.Tensor, attention_mask: torch.Tensor
54
- ) -> torch.Tensor:
55
  """Mean pooling sobre os tokens reais (mascara padding)."""
56
  mask = attention_mask.unsqueeze(-1).float()
57
  return (last_hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
58
 
59
 
60
- # ---------------------------------------------------------------------------
61
- # Carregamento preguiçoso e cacheado
62
- # ---------------------------------------------------------------------------
63
  @lru_cache(maxsize=1)
64
- def load_model():
65
- """Retorna (tokenizer, encoder, head). Carregado uma única vez por processo."""
66
- if not ADAPTER_PATH.exists():
67
- raise FileNotFoundError(
68
- f"Adapter LoRA não encontrado em {ADAPTER_PATH}. "
69
- "Suba a pasta fold_01_adapter/ em artifacts/ antes de iniciar o Space."
70
- )
71
- if not HEAD_PATH.exists():
72
- raise FileNotFoundError(
73
- f"Cabeça classificadora não encontrada em {HEAD_PATH}. "
74
- "Suba o fold_01_head.pt em artifacts/ antes de iniciar o Space."
75
- )
76
 
77
  logger.info("Carregando tokenizer de %s", MODEL_NAME)
78
- tokenizer = AutoTokenizer.from_pretrained(
79
- MODEL_NAME, padding_side="right", token=HF_TOKEN
80
- )
81
  if tokenizer.pad_token is None:
82
  tokenizer.pad_token = tokenizer.eos_token
83
 
84
  logger.info(
85
- "Carregando encoder base %s (dtype=%s, device=%s)",
86
- MODEL_NAME,
87
- AMP_DTYPE,
88
- DEVICE,
89
  )
90
  base_encoder = AutoModel.from_pretrained(
91
  MODEL_NAME,
@@ -94,113 +81,112 @@ def load_model():
94
  token=HF_TOKEN,
95
  ).to(DEVICE)
96
 
97
- logger.info("Anexando adapter LoRA de %s", ADAPTER_PATH)
98
- encoder = PeftModel.from_pretrained(
99
- base_encoder, str(ADAPTER_PATH), is_trainable=False
100
- ).to(DEVICE)
101
- encoder.eval()
102
-
103
- logger.info("Carregando cabeça linear de %s", HEAD_PATH)
104
- head_payload = torch.load(HEAD_PATH, map_location="cpu")
105
- # Suporta tanto {"state_dict": {...}} quanto o state_dict direto.
106
- head_state = (
107
- head_payload["state_dict"]
108
- if isinstance(head_payload, dict) and "state_dict" in head_payload
109
- else head_payload
110
- )
111
- in_feat = int(head_state["weight"].shape[1])
112
- head = nn.Linear(in_feat, 1)
113
- head.load_state_dict(head_state)
114
- head = head.to(DEVICE).eval()
115
 
116
- logger.info("Modelo pronto. In_features da cabeça: %d", in_feat)
117
- return tokenizer, encoder, head
 
 
118
 
119
 
120
  def warmup() -> None:
121
- """Força o carregamento agora. Útil para que o primeiro request não pague cold-start."""
122
- load_model()
123
 
124
 
125
- # ---------------------------------------------------------------------------
126
- # Predição — lógica do predict_from_text do notebook, preservada
127
- # ---------------------------------------------------------------------------
128
  @torch.no_grad()
129
- def predict_batch(
130
- texts: Iterable[str], batch_size: int = BATCH_SIZE
131
- ) -> np.ndarray:
132
- """Probabilidade de 'útil' para cada texto. Retorna np.array de shape (N,)."""
133
- tokenizer, encoder, head = load_model()
134
-
135
  if isinstance(texts, str):
136
  texts = [texts]
137
  texts = list(texts)
138
  if not texts:
139
  return np.zeros(0, dtype=np.float64)
140
 
141
- preds = []
 
 
 
142
  autocast_device = "cuda" if DEVICE == "cuda" else "cpu"
143
 
144
- for i in range(0, len(texts), batch_size):
145
- batch = texts[i : i + batch_size]
146
- instr = [build_instruction_text(t) for t in batch]
147
- toks = tokenizer(
148
- instr,
149
- padding=True,
150
- truncation=True,
151
- max_length=MAX_LENGTH,
152
- return_tensors="pt",
153
- ).to(DEVICE)
154
-
155
- with torch.inference_mode(), torch.autocast(
156
- device_type=autocast_device,
157
- dtype=AMP_DTYPE,
158
- enabled=(DEVICE == "cuda"),
159
- ):
160
- out = encoder(**toks)
161
- emb = mean_pool(out.last_hidden_state, toks["attention_mask"])
162
- emb = F.normalize(emb, p=2, dim=1)
163
- # Em CPU sem autocast, o encoder sai em fp16 e a head permanece em fp32 →
164
- # F.linear recusa. Igualar ao dtype da head resolve (inofensivo em GPU).
165
- logits = head(emb.to(head.weight.dtype)).squeeze(-1)
166
- p = torch.sigmoid(logits).float().cpu().numpy()
167
- preds.append(p)
168
-
169
- # Clip nos mesmos limites usados no notebook (evita proba exatamente 0 ou 1).
170
- return np.clip(np.concatenate(preds).astype(np.float64), 1e-6, 1 - 1e-6)
171
 
172
 
173
  def predict_one(text: str) -> float:
174
- """Atalho: retorna a probabilidade escalar para um único texto."""
175
  return float(predict_batch([text])[0])
176
 
177
 
178
- # ---------------------------------------------------------------------------
179
- # Explicação — occlusion word-level (leave-one-out)
180
- # ---------------------------------------------------------------------------
181
  def explain_occlusion(text: str, batch_size: int = BATCH_SIZE) -> dict:
182
- """Importância por palavra via deixar-uma-fora.
183
-
184
- Para cada palavra separada por espaço: calcula Δ = P(texto) − P(texto sem a palavra).
185
- Δ > 0 → a palavra estava puxando para 'útil'
186
- Δ < 0 → a palavra estava puxando para 'não-útil'
187
-
188
- Custo: (N + 1) forward passes — ~metade do SHAP Partition do notebook,
189
- resultado visual comparável para notas curtas.
190
  """
191
  words = text.split()
192
  if not words:
193
  p = predict_one(text)
194
  return {"proba_full": p, "tokens": [], "contributions": []}
195
-
196
  variants = [" ".join(words[:i] + words[i + 1 :]) for i in range(len(words))]
197
  all_texts = [text] + variants
198
  probs = predict_batch(all_texts, batch_size=batch_size)
199
  p_full = float(probs[0])
200
  contribs = (p_full - probs[1:]).tolist()
201
-
202
- return {
203
- "proba_full": p_full,
204
- "tokens": words,
205
- "contributions": contribs,
206
- }
 
1
+ """Model loading and inference (calibrated ensemble).
2
 
3
+ Serves the calibrated ensemble built on BAAI/bge-m3 + LoRA.
4
+ Loads every fold listed in config.MODEL_FOLDS, runs inference
5
+ with each one, and averages the resulting probabilities.
6
+ Each fold's logits go through temperature scaling, and its
7
+ probabilities through Platt scaling, before the average is taken.
8
  """
9
  from __future__ import annotations
10
 
11
  import logging
12
  from functools import lru_cache
13
+ from typing import Iterable, List, Tuple
14
 
15
  import numpy as np
16
  import torch
 
20
  from transformers import AutoModel, AutoTokenizer
21
 
22
  from config import (
23
+ ARTIFACTS_DIR,
24
  BATCH_SIZE,
25
+ CALIB_A,
26
+ CALIB_B,
27
+ HEAD_FILENAME,
28
+ MODEL_FOLDS,
29
+ ADAPTER_DIRNAME,
30
  HF_TOKEN,
31
  MAX_LENGTH,
32
  MODEL_NAME,
33
+ TEMPERATURE,
34
  )
35
 
36
  logger = logging.getLogger(__name__)
37
 
38
  # ---------------------------------------------------------------------------
39
+ # Dispositivo e dtype
40
  # ---------------------------------------------------------------------------
41
  DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
42
 
43
  if DEVICE == "cuda":
44
  AMP_DTYPE = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
45
  else:
 
 
 
46
  AMP_DTYPE = torch.float16
47
 
48
 
 
 
 
49
  def build_instruction_text(text: str) -> str:
50
+ """Return the text unchanged (bge-m3 takes no instruction prompt)."""
51
  return text if isinstance(text, str) else ""
52
 
53
 
54
+ def mean_pool(last_hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
 
 
55
  """Mean pooling sobre os tokens reais (mascara padding)."""
56
  mask = attention_mask.unsqueeze(-1).float()
57
  return (last_hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
58
 
59
 
 
 
 
60
  @lru_cache(maxsize=1)
61
+ def load_models() -> List[Tuple[AutoTokenizer, PeftModel, nn.Module]]:
62
+ """
63
+ Loads every (tokenizer, encoder, head) combination listed in
64
+ config.MODEL_FOLDS and returns them as a list of tuples.
65
+ The tokenizer and base encoder are shared across folds to save memory.
66
+ """
67
+ models = []
 
 
 
 
 
68
 
69
  logger.info("Carregando tokenizer de %s", MODEL_NAME)
70
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, padding_side="right", token=HF_TOKEN)
 
 
71
  if tokenizer.pad_token is None:
72
  tokenizer.pad_token = tokenizer.eos_token
73
 
74
  logger.info(
75
+ "Carregando encoder base %s (dtype=%s, device=%s)", MODEL_NAME, AMP_DTYPE, DEVICE
 
 
 
76
  )
77
  base_encoder = AutoModel.from_pretrained(
78
  MODEL_NAME,
 
81
  token=HF_TOKEN,
82
  ).to(DEVICE)
83
 
84
+ for fold in MODEL_FOLDS:
85
+ adapter_dir = ARTIFACTS_DIR / ADAPTER_DIRNAME.format(fold=fold)
86
+ head_path = ARTIFACTS_DIR / HEAD_FILENAME.format(fold=fold)
87
+ if not adapter_dir.exists() or not head_path.exists():
88
+ raise FileNotFoundError(
89
+ f"Artifacts do fold '{fold}' não encontrados em {adapter_dir} e {head_path}"
90
+ )
91
+ logger.info("Anexando adapter LoRA de %s", adapter_dir)
92
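+ # NOTE: every fold wraps the same base_encoder object, and PEFT injects LoRA layers in place; check that the adapters stay independent across folds.
+ encoder = PeftModel.from_pretrained(base_encoder, str(adapter_dir), is_trainable=False).to(DEVICE)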
93
+ encoder.eval()
94
+
95
+ logger.info("Carregando cabeça linear de %s", head_path)
96
+ head_payload = torch.load(head_path, map_location="cpu")
97
+ head_state = head_payload.get("state_dict", head_payload) if isinstance(head_payload, dict) else head_payload
98
+ in_feat = int(head_state["weight"].shape[1])
99
+ head = nn.Linear(in_feat, 1)
100
+ head.load_state_dict(head_state)
101
+ head = head.to(DEVICE).eval()
102
 
103
+ models.append((tokenizer, encoder, head))
104
+
105
+ logger.info("%d modelos de ensemble carregados.", len(models))
106
+ return models
107
 
108
 
109
  def warmup() -> None:
110
+ """Load every model up front so the first request pays no cold-start cost."""
111
+ load_models()
112
 
113
 
 
 
 
114
  @torch.no_grad()
115
+ def predict_batch(texts: Iterable[str], batch_size: int = BATCH_SIZE) -> np.ndarray:
116
+ """Return the calibrated probability of 'útil' (useful) for each text, averaged across folds."""
 
 
 
 
117
  if isinstance(texts, str):
118
  texts = [texts]
119
  texts = list(texts)
120
  if not texts:
121
  return np.zeros(0, dtype=np.float64)
122
 
123
+ # Predictions collected per fold
124
+ fold_preds: List[np.ndarray] = []
125
+ models = load_models()
126
+ # Device type passed to torch.autocast
127
  autocast_device = "cuda" if DEVICE == "cuda" else "cpu"
128
 
129
+ for tokenizer, encoder, head in models:
130
+ preds = []
131
+ for i in range(0, len(texts), batch_size):
132
+ batch = texts[i : i + batch_size]
133
+ instr = [build_instruction_text(t) for t in batch]
134
+ toks = tokenizer(
135
+ instr,
136
+ padding=True,
137
+ truncation=True,
138
+ max_length=MAX_LENGTH,
139
+ return_tensors="pt",
140
+ ).to(DEVICE)
141
+ with torch.inference_mode(), torch.autocast(
142
+ device_type=autocast_device, dtype=AMP_DTYPE, enabled=(DEVICE == "cuda")
143
+ ):
144
+ out = encoder(**toks)
145
+ emb = mean_pool(out.last_hidden_state, toks["attention_mask"])
146
+ emb = F.normalize(emb, p=2, dim=1)
147
+ logits = head(emb.to(head.weight.dtype)).squeeze(-1)
148
+ # Temperature scaling: divide the logits by TEMPERATURE
149
+ if TEMPERATURE != 1.0:
150
+ logits = logits / TEMPERATURE
151
+ # Sigmoid of the logits gives p (before Platt calibration)
152
+ p = torch.sigmoid(logits).float().cpu().numpy()
153
+ preds.append(p)
154
+ preds_full = np.concatenate(preds).astype(np.float64)
155
+ # Clip so that p is never exactly 0 or 1
156
+ preds_full = np.clip(preds_full, 1e-6, 1 - 1e-6)
157
+ # Convert p back to a logit z = log(p / (1 - p)) and apply Platt scaling: sigmoid(CALIB_A * z + CALIB_B)
158
+ if CALIB_A != 1.0 or CALIB_B != 0.0:
159
+ logits_np = np.log(preds_full / (1.0 - preds_full))
160
+ calibrated = 1.0 / (1.0 + np.exp(-(CALIB_A * logits_np + CALIB_B)))
161
+ else:
162
+ calibrated = preds_full
163
+ fold_preds.append(calibrated)
164
+
165
+ # Ensemble mean
166
+ if len(fold_preds) > 1:
167
+ final = np.mean(fold_preds, axis=0)
168
+ else:
169
+ final = fold_preds[0]
170
+ return final
171
 
172
 
173
  def predict_one(text: str) -> float:
174
+ """Return the calibrated probability for a single text."""
175
  return float(predict_batch([text])[0])
176
 
177
 
 
 
 
178
  def explain_occlusion(text: str, batch_size: int = BATCH_SIZE) -> dict:
179
+ """
180
+ Word-level leave-one-out explanation based on the calibrated ensemble mean.
181
+ Δ = P(full text) − P(text without the word).
 
 
 
 
 
182
  """
183
  words = text.split()
184
  if not words:
185
  p = predict_one(text)
186
  return {"proba_full": p, "tokens": [], "contributions": []}
 
187
  variants = [" ".join(words[:i] + words[i + 1 :]) for i in range(len(words))]
188
  all_texts = [text] + variants
189
  probs = predict_batch(all_texts, batch_size=batch_size)
190
  p_full = float(probs[0])
191
  contribs = (p_full - probs[1:]).tolist()
192
+ return {"proba_full": p_full, "tokens": words, "contributions": contribs}
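The leave-one-out logic in `explain_occlusion` can be tried without loading any model. The sketch below swaps the ensemble for a hypothetical `toy_score` function (an assumption made purely for illustration; it is not the classifier from this Space) and keeps the same output keys.

```python
def toy_score(text: str) -> float:
    """Hypothetical scorer: share of words found in a small 'evidence' vocabulary."""
    evidence = {"fonte", "dados", "link"}
    words = text.split()
    if not words:
        return 0.0
    return sum(w.lower() in evidence for w in words) / len(words)


def leave_one_out(text: str, score) -> dict:
    """Contribution of each word: score(full text) - score(text without that word)."""
    words = text.split()
    p_full = score(text)
    contributions = []
    for i in range(len(words)):
        variant = " ".join(words[:i] + words[i + 1:])  # drop the i-th word
        contributions.append(p_full - score(variant))
    return {"proba_full": p_full, "tokens": words, "contributions": contributions}


result = leave_one_out("veja a fonte oficial", toy_score)
```

A positive contribution means the word pushed the score up, a negative one means the score rises when the word is removed. With the real ensemble, each of the N + 1 texts is scored by every fold, so the cost grows with both the note length and the number of folds.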