---
license: mit
language:
- pt
library_name: transformers
tags:
- text-classification
- binary-classification
- modernbert
- pytorch
- transformers
datasets:
- tcepi/mbp_pas_dataset
metrics:
- accuracy
- f1
- precision
- recall
- roc_auc
base_model: answerdotai/ModernBERT-base
pipeline_tag: text-classification
model-index:
- name: mbp_pas_model
  results:
  - task:
      type: text-classification
      name: Binary Text Classification
    dataset:
      name: tcepi/mbp_pas_dataset
      type: tcepi/mbp_pas_dataset
      split: test
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9861
    - name: F1
      type: f1
      value: 0.9863
    - name: Precision
      type: precision
      value: 0.9796
    - name: Recall
      type: recall
      value: 0.9931
    - name: ROC-AUC
      type: roc_auc
      value: 0.9988
---

# MBP PAS Classification Model

This model is a fine-tuned version of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) for binary classification, trained on the [tcepi/mbp_pas_dataset](https://huggingface.co/datasets/tcepi/mbp_pas_dataset) dataset.

## Model Description

- Base Model: answerdotai/ModernBERT-base
- Task: Binary Text Classification
- Language: Portuguese (pt)
- Framework: PyTorch + Transformers

## Performance Metrics

### Test Set

| Metric | Value |
|---------|-------|
| Accuracy | 0.9861 |
| F1-Score | 0.9863 |
| Precision | 0.9796 |
| Recall | 0.9931 |
| ROC-AUC | 0.9988 |
| Specificity | 0.9789 |

### Confusion Matrix

| | Predicted Negative | Predicted Positive |
|--|-----------------|-----------------|
| Actual Negative | 139 (TN) | 3 (FP) |
| Actual Positive | 1 (FN) | 144 (TP) |
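The count-based metrics in the test-set table follow directly from these four confusion-matrix counts. A minimal sketch in plain Python that recomputes them, useful as a consistency check between the two tables (ROC-AUC is left out because it needs per-example scores, not counts):

```python
# Counts copied from the confusion matrix above
tn, fp, fn, tp = 139, 3, 1, 144

total = tn + fp + fn + tp
accuracy = (tp + tn) / total          # share of all examples classified correctly
precision = tp / (tp + fp)            # share of predicted positives that are real
recall = tp / (tp + fn)               # share of real positives that were found
specificity = tn / (tn + fp)          # share of real negatives that were found
f1 = 2 * precision * recall / (precision + recall)

for name, value in [
    ("Accuracy", accuracy),
    ("F1-Score", f1),
    ("Precision", precision),
    ("Recall", recall),
    ("Specificity", specificity),
]:
    print(f"{name:<12}{value:.4f}")
```

Rounded to four decimals, the printed values match the test-set table.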

### Classification Report

```
              precision    recall  f1-score   support

    Negative     0.9929    0.9789    0.9858       142
    Positive     0.9796    0.9931    0.9863       145

    accuracy                         0.9861       287
   macro avg     0.9862    0.9860    0.9861       287
weighted avg     0.9862    0.9861    0.9861       287
```

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("tcepi/mbp_pas_model")
model = AutoModelForSequenceClassification.from_pretrained("tcepi/mbp_pas_model")

# Classify a text
text = "Your text here"  # replace with the Portuguese text to classify
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to probabilities and pick the most likely class
predictions = torch.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1).item()

print(f"Predicted class: {model.config.id2label[predicted_class]}")
print(f"Probabilities: {predictions.tolist()}")
```
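To make the post-processing step concrete, the sketch below reproduces in plain Python what `torch.softmax` followed by `torch.argmax` computes for one row of logits. The logit values and the `id2label` mapping are hypothetical, chosen only for illustration; the real label names come from `model.config.id2label`.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values, not produced by the actual model
logits = [-1.2, 2.3]
id2label = {0: "LABEL_0", 1: "LABEL_1"}

probs = softmax(logits)
# argmax: index of the largest probability
predicted_class = max(range(len(probs)), key=probs.__getitem__)

print(f"Predicted class: {id2label[predicted_class]}")
print(f"Probabilities: {[round(p, 4) for p in probs]}")
```

With two classes, taking the argmax is equivalent to predicting the class whose probability exceeds 0.5.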

## Training

### Hyperparameters

- Epochs: 5
- Learning Rate: 2e-5
- Batch Size: 8
- Weight Decay: 0.01
- Warmup Ratio: 0.1
- Mixed Precision: FP16
- Optimizer: AdamW
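As a rough sketch, the settings above can be written as keyword arguments for `transformers.TrainingArguments`. The mapping is an assumption, since the card does not include the training script; `output_dir` and the `optim` value are placeholders chosen for illustration. The helper shows how a warmup ratio translates into warmup steps for a hypothetical number of training examples.

```python
import math

# Keyword names follow transformers.TrainingArguments; values come from the list above
training_kwargs = {
    "output_dir": "mbp_pas_model",        # hypothetical output path
    "num_train_epochs": 5,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,
    "fp16": True,                         # mixed precision; typically requires a GPU
    "optim": "adamw_torch",               # assumed AdamW variant
}


def warmup_steps(num_examples, kwargs):
    """Optimizer steps spent in warmup, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / kwargs["per_device_train_batch_size"])
    total_steps = steps_per_epoch * kwargs["num_train_epochs"]
    return math.ceil(total_steps * kwargs["warmup_ratio"])


# Hypothetical dataset size, for illustration only
print(warmup_steps(2000, training_kwargs))
```

On a machine with a GPU, the dictionary could be passed as `TrainingArguments(**training_kwargs)`.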

### Training Information

- Total Time: 186.64 seconds
- Samples/second: 55.19
- Final Loss: 0.1391
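These figures also give a rough sense of scale. If the throughput is the value the Hugging Face `Trainer` reports as `train_samples_per_second` (an assumption; the card does not say how it was measured), then time multiplied by throughput is the number of examples processed over all epochs, and dividing by the 5 epochs approximates the training-set size:

```python
# Values copied from the training information above
total_seconds = 186.64
samples_per_second = 55.19
epochs = 5

# Examples seen across all epochs, then per epoch (an estimate, not a reported figure)
examples_processed = total_seconds * samples_per_second
approx_train_size = examples_processed / epochs

print(f"Examples processed: {examples_processed:.0f}")
print(f"Approximate training-set size: {approx_train_size:.0f}")
```

The result is only an estimate; the dataset page is the authoritative source for split sizes.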

## Dataset

The model was trained on the [tcepi/mbp_pas_dataset](https://huggingface.co/datasets/tcepi/mbp_pas_dataset) dataset.

## Limitations

- The model was trained specifically for the domain of the MBP/PAS dataset
- Performance may vary on texts from other domains
- Evaluate the model on your own data before using it in production