Tags: Text Classification · Transformers · Safetensors · English · bert · multilabel · classification · finetune · finance · regulatory · text · risk · text-embeddings-inference
How to use yirifiai1/BERT_Regulatory_Text_Classification_01 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="yirifiai1/BERT_Regulatory_Text_Classification_01")

# Or load the model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("yirifiai1/BERT_Regulatory_Text_Classification_01")
model = AutoModelForSequenceClassification.from_pretrained("yirifiai1/BERT_Regulatory_Text_Classification_01")
```
This model is a fine-tuned version of the BERT language model, specifically adapted for multi-label classification tasks in the financial regulatory domain. It is built upon the pre-trained ProsusAI/finbert model, which has been further fine-tuned using a diverse dataset of financial regulatory texts. This allows the model to accurately classify text into multiple relevant categories simultaneously.
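Because the model is multi-label, its per-class scores should be read as independent sigmoid probabilities rather than a softmax distribution (with the pipeline, this typically means passing `top_k=None` and `function_to_apply="sigmoid"`). A minimal sketch of the thresholding step that turns raw logits into a set of labels — the `id2label` mapping below is illustrative only; the real one is stored in `model.config.id2label`:

```python
import math

def logits_to_labels(logits, id2label, threshold=0.5):
    """Apply an independent sigmoid per class and keep every label at or above threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [id2label[i] for i, p in enumerate(probs) if p >= threshold]

# Illustrative label map; in practice use model.config.id2label.
id2label = {0: "finance", 1: "risk", 2: "regulatory"}
print(logits_to_labels([2.0, -3.0, 0.5], id2label))  # → ['finance', 'regulatory']
```

With a 0.5 threshold, a logit of 0 is the decision boundary; raising the threshold trades recall for precision, which matters when, as noted below, the label distribution is imbalanced.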
Model Architecture
- Base Model: BERT
- Pre-trained Model: ProsusAI/finbert
- Task: Multi-label classification
Performance
Performance metrics on the validation set:
- F1 Score: 0.8637
- ROC AUC: 0.9044
- Accuracy: 0.6155
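The card does not state which averaging was used for the F1 score (micro vs. macro), and the comparatively low accuracy is consistent with exact-match (subset) accuracy, which is strict for multi-label targets — both are assumptions here. A sketch of how such metrics are commonly computed with scikit-learn on toy multi-label data:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score, accuracy_score

# Toy multi-label data: rows are samples, columns are classes.
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_prob = np.array([[0.9, 0.2, 0.8], [0.1, 0.7, 0.3], [0.8, 0.6, 0.4]])
y_pred = (y_prob >= 0.5).astype(int)  # sigmoid probabilities thresholded at 0.5

f1 = f1_score(y_true, y_pred, average="micro")    # averaging choice is an assumption
auc = roc_auc_score(y_true, y_prob, average="micro")
acc = accuracy_score(y_true, y_pred)              # exact-match (subset) accuracy
```

Subset accuracy counts a sample as correct only if every label matches, so it is always at or below the per-label metrics.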
Limitations and Ethical Considerations
- This model's performance may vary depending on the specific nature of the text data and label distribution.
- The training dataset exhibits class imbalance, which may skew predictions toward the more frequent labels.
Dataset Information
- Training set: 6,562 samples
- Validation set: 929 samples
- Test set: 1,884 samples
Training Details
- Training Strategy: Fine-tuning BERT with a randomly initialized classification head.
- Optimizer: Adam
- Learning Rate: 1e-4
- Batch Size: 16
- Number of Epochs: 2
- Evaluation Strategy: Epoch
- Weight Decay: 0.01
- Metric for Best Model: F1 Score
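The hyperparameters above map directly onto a `transformers` `TrainingArguments` configuration. A sketch of how the run might have been set up — the output directory and the save/checkpoint settings are assumptions, not taken from the card, and note that `Trainer`'s default optimizer is AdamW rather than plain Adam:

```python
from transformers import TrainingArguments

# Hyperparameters from the card; output_dir and save settings are illustrative.
args = TrainingArguments(
    output_dir="bert-regulatory-multilabel",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    weight_decay=0.01,
    eval_strategy="epoch",        # named evaluation_strategy in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)
```

These arguments would then be passed to a `Trainer` together with the model, tokenizer, datasets, and a metrics function returning an `"f1"` key.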