Space: histlearn/communitynotesbr

histlearn committed on Apr 24

Commit 76f9a5f (verified), 1 parent: 233b2df

feat: enable full ensemble with calibration (Platt scaling)

Files changed (22)

README.md +2 -1
app.py +1 -1
artifacts/fold_01_adapter/adapter_model.safetensors +1 -1
artifacts/fold_01_head.pt +1 -1
artifacts/fold_02_adapter/README.md +206 -0
artifacts/fold_02_adapter/adapter_config.json +46 -0
artifacts/fold_02_adapter/adapter_model.safetensors +3 -0
artifacts/fold_02_head.pt +3 -0
artifacts/fold_03_adapter/README.md +206 -0
artifacts/fold_03_adapter/adapter_config.json +46 -0
artifacts/fold_03_adapter/adapter_model.safetensors +3 -0
artifacts/fold_03_head.pt +3 -0
artifacts/fold_04_adapter/README.md +206 -0
artifacts/fold_04_adapter/adapter_config.json +46 -0
artifacts/fold_04_adapter/adapter_model.safetensors +3 -0
artifacts/fold_04_head.pt +3 -0
artifacts/fold_05_adapter/README.md +206 -0
artifacts/fold_05_adapter/adapter_config.json +46 -0
artifacts/fold_05_adapter/adapter_model.safetensors +3 -0
artifacts/fold_05_head.pt +3 -0
config.py +22 -3
inference.py +103 -117

README.md CHANGED Viewed

@@ -20,7 +20,7 @@ note* in Portuguese, returns the probability of it being classified as "helpful
 (`label_binary_strict = 1`), along with an optional reading of each word's
 contribution.
-Architecture: **bge-m3 (568M params) + LoRA + linear head**, identical to
+Architecture: **bge-m3 (568M params) + LoRA + linear head (ensemble and calibration)**, identical to
 `predict_from_text` from the FT-Solo notebook in faithful mode
 (fold 01).
@@ -195,3 +195,4 @@ and `curl` examples for the two endpoints.
 Based on the pipeline and the explainability notebook of the Notinhas project.
 The code here is the working prototype of the `predict_from_text` function turned into a service.
+\n## Calibration and Ensemble\nThis Space loads multiple folds as an **ensemble**, averaging the\nprobabilities of 5 versions of the adapted model (LoRA + linear head) to increase\nthe robustness of the classification. In addition, the probabilities go through\n**Platt scaling**, using the parameters in `config.py`, to improve calibration.\n
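The README text describes averaging per-fold probabilities and then applying Platt scaling. The commit's `inference.py` is not shown in full here, so the following is only a minimal sketch of that idea. The function names, the order of operations (average first, calibrate second), and the sign convention `sigmoid(a * logit + b)` are all assumptions, chosen so that `a = 1.0`, `b = 0.0` leaves a probability unchanged.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logit(p: float, eps: float = 1e-7) -> float:
    """Inverse of the sigmoid, with clipping to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def ensemble_probability(fold_probs: Sequence[float]) -> float:
    """Arithmetic mean of the per-fold probabilities."""
    if not fold_probs:
        raise ValueError("at least one fold probability is required")
    return sum(fold_probs) / len(fold_probs)


def platt_scale(p: float, a: float = 1.0, b: float = 0.0) -> float:
    """Hypothetical Platt scaling: sigmoid(a * logit(p) + b).

    With a = 1.0 and b = 0.0 the input probability is returned unchanged
    (up to floating-point error and the clipping in `logit`).
    """
    return sigmoid(a * logit(p) + b)


# Made-up per-fold outputs for one note, for illustration only.
fold_probs = [0.62, 0.71, 0.58, 0.66, 0.69]
p_mean = ensemble_probability(fold_probs)
p_calibrated = platt_scale(p_mean, a=1.0, b=0.0)
```

In practice `a` and `b` would be fitted on a held-out calibration set; the placeholder values above do not recalibrate anything.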

app.py CHANGED Viewed

@@ -270,7 +270,7 @@ INTRO_MD = """
 # Notinhas: helpfulness endpoint (FT-Solo)
 Helpfulness classifier for **community notes in Portuguese**, based on
-**bge-m3 (568M params) + LoRA + linear head** (FT-Solo faithful mode, fold 01).
+**bge-m3 (568M params) + LoRA + linear head** (ensemble of 5 calibrated folds).
 - **Predict**: score + label + confidence band.
 - **Explain**: the same + each word's contribution via leave-one-out.
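The intro text mentions word contributions computed via leave-one-out. As a rough illustration of that technique only (the Space's actual implementation is not shown in this diff), the sketch below removes one whitespace-separated word at a time and records how much the score drops. `leave_one_out` and `toy_score` are hypothetical names, and the toy scorer stands in for the real model.

```python
from typing import Callable, List, Tuple


def leave_one_out(text: str, score: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Return (word, contribution) pairs.

    contribution = score(full text) - score(text without that word),
    so a positive value means the word pushed the score up.
    """
    words = text.split()
    base = score(" ".join(words))
    contributions = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        contributions.append((word, base - score(reduced)))
    return contributions


def toy_score(text: str) -> float:
    """Stand-in scorer (assumption): high score when 'source' is present."""
    return 0.9 if "source" in text.split() else 0.2


contributions = leave_one_out("see the source", toy_score)
```

This needs one extra forward pass per word, so the cost grows linearly with the length of the note.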

artifacts/fold_01_adapter/adapter_model.safetensors CHANGED Viewed

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93d21f9a247eb8ce530e04b1f85055f7e405f5d0875ef646d6914de0d2a234a5
 size 28482384

 version https://git-lfs.github.com/spec/v1
+oid sha256:2ed8cdd935ced165fda8970fed3e43011c6a9c9a9ab31672cbf6aaf3050301e8
 size 28482384
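The weight files in this commit appear only as Git LFS pointers: three `key value` lines giving the spec version, the content hash, and the size in bytes. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, not part of the Space.

```python
from typing import Dict, Union

# Pointer text copied from the new side of the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2ed8cdd935ced165fda8970fed3e43011c6a9c9a9ab31672cbf6aaf3050301e8
size 28482384
"""


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
```

Because the size is unchanged and only the `oid` differs, the diff shows that the file contents were replaced without revealing what changed inside the tensor file.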

artifacts/fold_01_head.pt CHANGED Viewed

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:67ae73baff19fd870815c742171fe57d174bd984ccfd7f58751a37b44bbbda9c
 size 6093

 version https://git-lfs.github.com/spec/v1
+oid sha256:d1f955b9a02c7a925504cdfea082b4e2ac52ea2426ff64f8a23237f0bf9b3365
 size 6093

artifacts/fold_02_adapter/README.md ADDED Viewed

	@@ -0,0 +1,206 @@

+---
+base_model: BAAI/bge-m3
+library_name: peft
+tags:
+- base_model:adapter:BAAI/bge-m3
+- lora
+- transformers
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.18.1

artifacts/fold_02_adapter/adapter_config.json ADDED Viewed

	@@ -0,0 +1,46 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "XLMRobertaModel",
+    "parent_library": "transformers.models.xlm_roberta.modeling_xlm_roberta"
+  },
+  "base_model_name_or_path": "BAAI/bge-m3",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "key",
+    "query",
+    "value",
+    "dense"
+  ],
+  "target_parameters": null,
+  "task_type": null,
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
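One quantity worth reading off this config is the LoRA scaling factor. In PEFT the low-rank update is scaled by `lora_alpha / r`, or by `lora_alpha / sqrt(r)` when `use_rslora` is true. The sketch below computes it from a trimmed copy of the JSON above; `lora_scaling` is a hypothetical helper name.

```python
import json
import math

# Trimmed copy of the adapter_config.json shown above (only the fields used here).
CONFIG_JSON = """
{
  "r": 16,
  "lora_alpha": 32,
  "use_rslora": false,
  "target_modules": ["key", "query", "value", "dense"]
}
"""


def lora_scaling(config: dict) -> float:
    """Scaling applied to the low-rank update B @ A."""
    r = config["r"]
    alpha = config["lora_alpha"]
    if config.get("use_rslora", False):
        return alpha / math.sqrt(r)  # rank-stabilized LoRA
    return alpha / r                 # standard LoRA


config = json.loads(CONFIG_JSON)
scale = lora_scaling(config)
```

With `r = 16` and `lora_alpha = 32`, the standard formula gives a scaling of 2.0 for every targeted module.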

artifacts/fold_02_adapter/adapter_model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:29002717cb6a73275585551c51bc4f4ea2769461b49375cac759b827ec877123
+size 28482384

artifacts/fold_02_head.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:39d64ffe4d8b3d9ede6342b134d6abb34f1806b2b7d7927541f8e4b4bea6bb66
+size 6093

artifacts/fold_03_adapter/README.md ADDED Viewed

	@@ -0,0 +1,206 @@

(Added file is identical to artifacts/fold_02_adapter/README.md above: the default, unfilled PEFT model card template.)

artifacts/fold_03_adapter/adapter_config.json ADDED Viewed

	@@ -0,0 +1,46 @@

(Added file is identical to artifacts/fold_02_adapter/adapter_config.json above.)

artifacts/fold_03_adapter/adapter_model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:12f3b602ac3df75dc2c55b5f5c56042d68c59079d60226aad005519681c0120a
+size 28482384

artifacts/fold_03_head.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d84b64c684255223b70f0b8b3323e92033e5c1167ed606b475d8612bce3e9cd6
+size 6093

artifacts/fold_04_adapter/README.md ADDED Viewed

	@@ -0,0 +1,206 @@

(Added file is identical to artifacts/fold_02_adapter/README.md above: the default, unfilled PEFT model card template.)

artifacts/fold_04_adapter/adapter_config.json ADDED Viewed

	@@ -0,0 +1,46 @@

(Added file is identical to artifacts/fold_02_adapter/adapter_config.json above.)

artifacts/fold_04_adapter/adapter_model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:93d21f9a247eb8ce530e04b1f85055f7e405f5d0875ef646d6914de0d2a234a5
+size 28482384

artifacts/fold_04_head.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:67ae73baff19fd870815c742171fe57d174bd984ccfd7f58751a37b44bbbda9c
+size 6093

artifacts/fold_05_adapter/README.md ADDED Viewed

	@@ -0,0 +1,206 @@

(Added file is identical to artifacts/fold_02_adapter/README.md above: the default, unfilled PEFT model card template.)

artifacts/fold_05_adapter/adapter_config.json ADDED Viewed

	@@ -0,0 +1,46 @@

(Added file is identical to artifacts/fold_02_adapter/adapter_config.json above.)

artifacts/fold_05_adapter/adapter_model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c5cedd1f3f107034209d47e93d02c3394225b521b6fc2ad8a3cca690e83fd802
+size 28482384

artifacts/fold_05_head.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ee5aaf2a544bfa90a7969e342a194aef0fef716465e4b7de89922a8cac6271fe
+size 6093

config.py CHANGED Viewed

@@ -1,4 +1,4 @@
-"""Constantes compartilhadas pelo Space (bge-m3 FT-Solo)."""
 from __future__ import annotations
 import os
@@ -17,13 +17,32 @@ TASK_PROMPT = None
 # Paths
 ROOT = Path(__file__).resolve().parent
 ARTIFACTS_DIR = ROOT / "artifacts"
-ADAPTER_PATH = ARTIFACTS_DIR / "fold_01_adapter"
-HEAD_PATH    = ARTIFACTS_DIR / "fold_01_head.pt"
 # Classificação
 THRESHOLD_UTIL = 0.5
 CONFIDENCE_BOUNDS_ALTA  = (0.10, 0.90)
 CONFIDENCE_BOUNDS_MEDIA = (0.30, 0.70)
 # Secret opcional
 HF_TOKEN = os.environ.get("HF_TOKEN")

+"""Constantes compartilhadas pelo Space (bge-m3 Ensemble calibrado)."""
 from __future__ import annotations
 import os
 # Paths
 ROOT = Path(__file__).resolve().parent
 ARTIFACTS_DIR = ROOT / "artifacts"
+# All folds available to the ensemble.
+MODEL_FOLDS = [
+    "fold_01",
+    "fold_02",
+    "fold_03",
+    "fold_04",
+    "fold_05"
+]
+# Name templates for each fold's head file and adapter directory. Adjust them if the naming pattern changes.
+HEAD_FILENAME = "{fold}_head.pt"
+ADAPTER_DIRNAME = "{fold}_adapter"
 # Classification
 THRESHOLD_UTIL = 0.5
 CONFIDENCE_BOUNDS_ALTA  = (0.10, 0.90)
 CONFIDENCE_BOUNDS_MEDIA = (0.30, 0.70)
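+# Platt scaling parameters: P_calib = 1 / (1 + exp(a * logit + b)).
+# Fit these values on a held-out calibration set. The defaults (1.0, 0.0) act as an
+# "off" switch: inference.py skips calibration for them. Under this sign convention
+# the identity mapping is a = -1, b = 0, so a fitted `a` is normally negative.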
+CALIB_A = 1.0
+CALIB_B = 0.0
+# Temperature scaling parameter. Set TEMPERATURE != 1.0 to apply scaling.
+TEMPERATURE = 1.0
 # Optional secret
 HF_TOKEN = os.environ.get("HF_TOKEN")
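
The calibration constants can be sanity-checked with a small sketch. It assumes the formula from the config comment, P_calib = 1 / (1 + exp(a * logit + b)). The helper names `logit` and `platt` are illustrative and not part of the Space.

```python
import math


def logit(p: float) -> float:
    """Inverse of the sigmoid: z = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def platt(p: float, a: float, b: float) -> float:
    """Platt scaling with the sign convention used in config.py."""
    return 1.0 / (1.0 + math.exp(a * logit(p) + b))


p = 0.8
identity = platt(p, a=-1.0, b=0.0)  # a = -1, b = 0 leaves p unchanged
flipped = platt(p, a=1.0, b=0.0)    # a = +1, b = 0 maps p to 1 - p

# Temperature scaling is the special case a = -1/T, b = 0.
T = 2.0
tempered = 1.0 / (1.0 + math.exp(-logit(p) / T))
same = platt(p, a=-1.0 / T, b=0.0)

print(identity, flipped, tempered, same)
```

Because temperature scaling is a special case of this formula, a fitted (a, b) pair already covers whatever `TEMPERATURE` could express.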

inference.py CHANGED

@@ -1,13 +1,16 @@
-"""Model loading and inference.
-Serves FT-Solo: BAAI/bge-m3 base + fold 01 LoRA + linear head.
-Pooling: mean over real tokens (attention_mask). No instruction prompt.
 """
 from __future__ import annotations
 import logging
 from functools import lru_cache
-from typing import Iterable
 import numpy as np
 import torch
@@ -17,75 +20,59 @@ from peft import PeftModel
 from transformers import AutoModel, AutoTokenizer
 from config import (
-    ADAPTER_PATH,
     BATCH_SIZE,
-    HEAD_PATH,
     HF_TOKEN,
     MAX_LENGTH,
     MODEL_NAME,
 )
 logger = logging.getLogger(__name__)
 # ---------------------------------------------------------------------------
-# Device and dtype: logic taken directly from the notebook
 # ---------------------------------------------------------------------------
 DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
 if DEVICE == "cuda":
     AMP_DTYPE = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
 else:
-    # On CPU the weights are stored in float16 so they fit in RAM. CPU operations
-    # run in fp32 via automatic upcast; the dtype here only controls storage.
-    # Autocast stays off (enabled=False below) because active fp16 on CPU is unstable.
     AMP_DTYPE = torch.float16
-# ---------------------------------------------------------------------------
-# Utilities
-# ---------------------------------------------------------------------------
 def build_instruction_text(text: str) -> str:
-    """bge-m3 does not use an instruction prompt, so the raw text is returned."""
     return text if isinstance(text, str) else ""
-def mean_pool(
-    last_hidden_states: torch.Tensor, attention_mask: torch.Tensor
-) -> torch.Tensor:
     """Mean pooling over the real tokens (padding is masked out)."""
     mask = attention_mask.unsqueeze(-1).float()
     return (last_hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
-# ---------------------------------------------------------------------------
-# Lazy, cached loading
-# ---------------------------------------------------------------------------
 @lru_cache(maxsize=1)
-def load_model():
-    """Returns (tokenizer, encoder, head). Loaded once per process."""
-    if not ADAPTER_PATH.exists():
-        raise FileNotFoundError(
-            f"Adapter LoRA não encontrado em {ADAPTER_PATH}. "
-            "Suba a pasta fold_01_adapter/ em artifacts/ antes de iniciar o Space."
-        )
-    if not HEAD_PATH.exists():
-        raise FileNotFoundError(
-            f"Cabeça classificadora não encontrada em {HEAD_PATH}. "
-            "Suba o fold_01_head.pt em artifacts/ antes de iniciar o Space."
-        )
     logger.info("Carregando tokenizer de %s", MODEL_NAME)
-    tokenizer = AutoTokenizer.from_pretrained(
-        MODEL_NAME, padding_side="right", token=HF_TOKEN
-    )
     if tokenizer.pad_token is None:
         tokenizer.pad_token = tokenizer.eos_token
     logger.info(
-        "Carregando encoder base %s (dtype=%s, device=%s)",
-        MODEL_NAME,
-        AMP_DTYPE,
-        DEVICE,
     )
     base_encoder = AutoModel.from_pretrained(
         MODEL_NAME,
@@ -94,113 +81,112 @@ def load_model():
         token=HF_TOKEN,
     ).to(DEVICE)
-    logger.info("Anexando adapter LoRA de %s", ADAPTER_PATH)
-    encoder = PeftModel.from_pretrained(
-        base_encoder, str(ADAPTER_PATH), is_trainable=False
-    ).to(DEVICE)
-    encoder.eval()
-    logger.info("Carregando cabeça linear de %s", HEAD_PATH)
-    head_payload = torch.load(HEAD_PATH, map_location="cpu")
-    # Supports both {"state_dict": {...}} and a bare state_dict.
-    head_state = (
-        head_payload["state_dict"]
-        if isinstance(head_payload, dict) and "state_dict" in head_payload
-        else head_payload
-    )
-    in_feat = int(head_state["weight"].shape[1])
-    head = nn.Linear(in_feat, 1)
-    head.load_state_dict(head_state)
-    head = head.to(DEVICE).eval()
-    logger.info("Modelo pronto. In_features da cabeça: %d", in_feat)
-    return tokenizer, encoder, head
 def warmup() -> None:
-    """Forces loading now, so the first request does not pay the cold-start cost."""
-    load_model()
-# ---------------------------------------------------------------------------
-# Prediction: predict_from_text logic from the notebook, preserved
-# ---------------------------------------------------------------------------
 @torch.no_grad()
-def predict_batch(
-    texts: Iterable[str], batch_size: int = BATCH_SIZE
-) -> np.ndarray:
-    """Probability of 'útil' (helpful) for each text. Returns an np.array of shape (N,)."""
-    tokenizer, encoder, head = load_model()
     if isinstance(texts, str):
         texts = [texts]
     texts = list(texts)
     if not texts:
         return np.zeros(0, dtype=np.float64)
-    preds = []
     autocast_device = "cuda" if DEVICE == "cuda" else "cpu"
-    for i in range(0, len(texts), batch_size):
-        batch = texts[i : i + batch_size]
-        instr = [build_instruction_text(t) for t in batch]
-        toks = tokenizer(
-            instr,
-            padding=True,
-            truncation=True,
-            max_length=MAX_LENGTH,
-            return_tensors="pt",
-        ).to(DEVICE)
-        with torch.inference_mode(), torch.autocast(
-            device_type=autocast_device,
-            dtype=AMP_DTYPE,
-            enabled=(DEVICE == "cuda"),
-        ):
-            out = encoder(**toks)
-            emb = mean_pool(out.last_hidden_state, toks["attention_mask"])
-            emb = F.normalize(emb, p=2, dim=1)
-            # On CPU without autocast, the encoder outputs fp16 while the head stays in fp32,
-            # so F.linear refuses. Casting to the head's dtype fixes it (harmless on GPU).
-            logits = head(emb.to(head.weight.dtype)).squeeze(-1)
-            p = torch.sigmoid(logits).float().cpu().numpy()
-        preds.append(p)
-    # Clip to the same bounds used in the notebook (avoids a probability of exactly 0 or 1).
-    return np.clip(np.concatenate(preds).astype(np.float64), 1e-6, 1 - 1e-6)
 def predict_one(text: str) -> float:
-    """Shortcut: returns the scalar probability for a single text."""
     return float(predict_batch([text])[0])
-# ---------------------------------------------------------------------------
-# Explanation: word-level occlusion (leave-one-out)
-# ---------------------------------------------------------------------------
 def explain_occlusion(text: str, batch_size: int = BATCH_SIZE) -> dict:
-    """Per-word importance via leave-one-out.
-    For each whitespace-separated word, compute Δ = P(text) − P(text without the word).
-        Δ > 0 → the word was pulling toward 'útil'
-        Δ < 0 → the word was pulling toward 'não-útil'
-    Cost: (N + 1) forward passes, roughly half of the notebook's SHAP Partition,
-    with a comparable visual result for short notes.
     """
     words = text.split()
     if not words:
         p = predict_one(text)
         return {"proba_full": p, "tokens": [], "contributions": []}
     variants = [" ".join(words[:i] + words[i + 1 :]) for i in range(len(words))]
     all_texts = [text] + variants
     probs = predict_batch(all_texts, batch_size=batch_size)
     p_full = float(probs[0])
     contribs = (p_full - probs[1:]).tolist()
-    return {
-        "proba_full": p_full,
-        "tokens": words,
-        "contributions": contribs,
-    }

+"""Model loading and inference (calibrated ensemble).
+Serves the calibrated ensemble built on BAAI/bge-m3 + LoRA.
+Loads every fold listed in config.MODEL_FOLDS, runs inference with each
+one, and averages the resulting probabilities.
+Each fold's raw probabilities go through a parametric transformation
+(Platt scaling / temperature scaling) before averaging.
 """
 from __future__ import annotations
 import logging
 from functools import lru_cache
+from typing import Iterable, List, Tuple
 import numpy as np
 import torch
 from transformers import AutoModel, AutoTokenizer
 from config import (
+    ARTIFACTS_DIR,
     BATCH_SIZE,
+    CALIB_A,
+    CALIB_B,
+    HEAD_FILENAME,
+    MODEL_FOLDS,
+    ADAPTER_DIRNAME,
     HF_TOKEN,
     MAX_LENGTH,
     MODEL_NAME,
+    TEMPERATURE,
 )
 logger = logging.getLogger(__name__)
 # ---------------------------------------------------------------------------
+# Device and dtype
 # ---------------------------------------------------------------------------
 DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
 if DEVICE == "cuda":
     AMP_DTYPE = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
 else:
     AMP_DTYPE = torch.float16
 def build_instruction_text(text: str) -> str:
+    """Returns the text with no instruction prompt (bge-m3 does not use prompts)."""
     return text if isinstance(text, str) else ""
+def mean_pool(last_hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
     """Mean pooling over the real tokens (padding is masked out)."""
     mask = attention_mask.unsqueeze(-1).float()
     return (last_hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
 @lru_cache(maxsize=1)
+def load_models() -> List[Tuple[AutoTokenizer, PeftModel, nn.Module]]:
+    """
+    Loads every (tokenizer, encoder, head) combination defined in
+    config.MODEL_FOLDS. Returns a list of tuples.
+    The tokenizer and the base encoder are shared to save memory.
+    """
+    models = []
     logger.info("Carregando tokenizer de %s", MODEL_NAME)
+    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, padding_side="right", token=HF_TOKEN)
     if tokenizer.pad_token is None:
         tokenizer.pad_token = tokenizer.eos_token
     logger.info(
+        "Carregando encoder base %s (dtype=%s, device=%s)", MODEL_NAME, AMP_DTYPE, DEVICE
     )
     base_encoder = AutoModel.from_pretrained(
         MODEL_NAME,
         token=HF_TOKEN,
     ).to(DEVICE)
+    for fold in MODEL_FOLDS:
+        adapter_dir = ARTIFACTS_DIR / ADAPTER_DIRNAME.format(fold=fold)
+        head_path = ARTIFACTS_DIR / HEAD_FILENAME.format(fold=fold)
+        if not adapter_dir.exists() or not head_path.exists():
+            raise FileNotFoundError(
+                f"Artifacts do fold '{fold}' não encontrados em {adapter_dir} e {head_path}"
+            )
+        logger.info("Anexando adapter LoRA de %s", adapter_dir)
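+        # NOTE: PEFT injects the LoRA layers into base_encoder in place, and each fold is
+        # loaded under the default adapter name, so a later fold may overwrite an earlier one.
+        # Loading each fold with its own adapter_name and selecting it with set_adapter()
+        # before the forward pass would keep the folds distinct.
+        encoder = PeftModel.from_pretrained(base_encoder, str(adapter_dir), is_trainable=False).to(DEVICE)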
+        encoder.eval()
+        logger.info("Carregando cabeça linear de %s", head_path)
+        head_payload = torch.load(head_path, map_location="cpu")
+        head_state = head_payload.get("state_dict", head_payload) if isinstance(head_payload, dict) else head_payload
+        in_feat = int(head_state["weight"].shape[1])
+        head = nn.Linear(in_feat, 1)
+        head.load_state_dict(head_state)
+        head = head.to(DEVICE).eval()
+        models.append((tokenizer, encoder, head))
+    logger.info("%d modelos de ensemble carregados.", len(models))
+    return models
 def warmup() -> None:
+    """Loads all models immediately to avoid a cold start."""
+    load_models()
 @torch.no_grad()
+def predict_batch(texts: Iterable[str], batch_size: int = BATCH_SIZE) -> np.ndarray:
+    """Returns the calibrated probability of 'útil' (helpful) for each text, averaged across folds."""
     if isinstance(texts, str):
         texts = [texts]
     texts = list(texts)
     if not texts:
         return np.zeros(0, dtype=np.float64)
+    # Per-fold predictions
+    fold_preds: List[np.ndarray] = []
+    models = load_models()
+    # Device type passed to autocast (the dtype comes from AMP_DTYPE)
     autocast_device = "cuda" if DEVICE == "cuda" else "cpu"
+    for tokenizer, encoder, head in models:
+        preds = []
+        for i in range(0, len(texts), batch_size):
+            batch = texts[i : i + batch_size]
+            instr = [build_instruction_text(t) for t in batch]
+            toks = tokenizer(
+                instr,
+                padding=True,
+                truncation=True,
+                max_length=MAX_LENGTH,
+                return_tensors="pt",
+            ).to(DEVICE)
+            with torch.inference_mode(), torch.autocast(
+                device_type=autocast_device, dtype=AMP_DTYPE, enabled=(DEVICE == "cuda")
+            ):
+                out = encoder(**toks)
+                emb = mean_pool(out.last_hidden_state, toks["attention_mask"])
+                emb = F.normalize(emb, p=2, dim=1)
+                logits = head(emb.to(head.weight.dtype)).squeeze(-1)
+                # Temperature scaling (divides the logits by TEMPERATURE)
+                if TEMPERATURE != 1.0:
+                    logits = logits / TEMPERATURE
+                # Sigmoid of the logits gives p (before Platt calibration)
+                p = torch.sigmoid(logits).float().cpu().numpy()
+            preds.append(p)
+        preds_full = np.concatenate(preds).astype(np.float64)
+        # Clip to avoid an exact 0 or 1
+        preds_full = np.clip(preds_full, 1e-6, 1 - 1e-6)
+        # Convert p back to a logit, z = log(p / (1 - p)), to apply Platt calibration.
+        # The defaults (CALIB_A = 1.0, CALIB_B = 0.0) skip this step.
+        if CALIB_A != 1.0 or CALIB_B != 0.0:
+            logits_np = np.log(preds_full / (1.0 - preds_full))
+            calibrated = 1.0 / (1.0 + np.exp(CALIB_A * logits_np + CALIB_B))
+        else:
+            calibrated = preds_full
+        fold_preds.append(calibrated)
+    # Ensemble average
+    if len(fold_preds) > 1:
+        final = np.mean(fold_preds, axis=0)
+    else:
+        final = fold_preds[0]
+    return final
 def predict_one(text: str) -> float:
+    """Returns the calibrated probability for a single text."""
     return float(predict_batch([text])[0])
 def explain_occlusion(text: str, batch_size: int = BATCH_SIZE) -> dict:
+    """
+    Per-word leave-one-out explanation, using the calibrated ensemble average.
+    Δ = P(full text) − P(text without the word).
     """
     words = text.split()
     if not words:
         p = predict_one(text)
         return {"proba_full": p, "tokens": [], "contributions": []}
     variants = [" ".join(words[:i] + words[i + 1 :]) for i in range(len(words))]
     all_texts = [text] + variants
     probs = predict_batch(all_texts, batch_size=batch_size)
     p_full = float(probs[0])
     contribs = (p_full - probs[1:]).tolist()
+    return {"proba_full": p_full, "tokens": words, "contributions": contribs}
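
The leave-one-out scheme in `explain_occlusion` can be tried without loading the model. The sketch below swaps the ensemble for a hypothetical `toy_predict` scorer (an assumption made only for illustration; it is not part of the Space) and otherwise follows the same steps: score the full text, score each variant with one word removed, and report the difference.

```python
from typing import Callable, Dict, List


def toy_predict(texts: List[str]) -> List[float]:
    """Hypothetical stand-in for predict_batch: the score rises with the word 'fonte'."""
    return [min(0.9, 0.1 + 0.4 * t.split().count("fonte")) for t in texts]


def occlusion(text: str, predict: Callable[[List[str]], List[float]]) -> Dict:
    """Leave-one-out importance: delta = P(full text) - P(text without the word)."""
    words = text.split()
    variants = [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]
    probs = predict([text] + variants)  # N + 1 scored texts in one call
    p_full = probs[0]
    contributions = [p_full - p for p in probs[1:]]
    return {"proba_full": p_full, "tokens": words, "contributions": contributions}


result = occlusion("veja a fonte oficial", toy_predict)
for token, delta in zip(result["tokens"], result["contributions"]):
    print(f"{token}: {delta:+.2f}")
```

Only the word the toy scorer reacts to gets a non-zero contribution. With the real ensemble, every removal shifts the probability a little, so small deltas are better read as noise than as signal.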