# Custom BERT Model for Text Classification
## Model Description
This is a custom BERT model fine-tuned for text classification. It was trained on a subset of a publicly available dataset and classifies input text into one of 3 classes.
## Training Details
- **Architecture**: BERT Base Multilingual Cased
- **Training data**: Custom dataset
- **Preprocessing**: Tokenized with BERT's tokenizer, using a maximum sequence length of 80.
- **Fine-tuning**: Trained for 1 epoch with a learning rate of 2e-5, using the AdamW optimizer and cross-entropy loss (a minimal sketch is shown after this list).
- **Evaluation Metrics**: Accuracy on a held-out validation set.
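The exact training pipeline is not published. A minimal sketch of the setup described above, assuming hypothetical `train_texts`/`train_labels` and `val_texts`/`val_labels` lists in place of the actual dataset:
```python
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical placeholder data; the actual dataset is not published.
train_texts = ["example sentence one", "example sentence two"]
train_labels = [0, 2]
val_texts = ["held-out example"]
val_labels = [1]

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=3
)

# Tokenize as described above: BERT tokenizer, max sequence length 80.
enc = tokenizer(train_texts, padding=True, truncation=True, max_length=80,
                return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"],
                        torch.tensor(train_labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = AdamW(model.parameters(), lr=2e-5)

# One epoch; the model computes cross-entropy loss internally when labels are passed.
model.train()
for input_ids, attention_mask, labels in loader:
    optimizer.zero_grad()
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
    outputs.loss.backward()
    optimizer.step()

# Accuracy on the held-out validation set.
model.eval()
val_enc = tokenizer(val_texts, padding=True, truncation=True, max_length=80,
                    return_tensors="pt")
with torch.no_grad():
    preds = model(**val_enc).logits.argmax(dim=-1)
accuracy = (preds == torch.tensor(val_labels)).float().mean().item()
print(f"validation accuracy: {accuracy:.3f}")
```
The batch size shown is an assumption; the card does not specify one.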
## How to Use
### Dependencies
- Transformers 4.x
- Torch 1.x

Install them with `pip install transformers torch`.
### Code Snippet
For classification:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("billfass/my_bert_model")
model = AutoModelForSequenceClassification.from_pretrained("billfass/my_bert_model")

text = "Your example text here."
inputs = tokenizer(text, padding=True, truncation=True, max_length=80, return_tensors="pt")

# Optional labels of shape (batch_size,); when provided, the model also returns a loss.
labels = torch.tensor([1])  # batch size 1

outputs = model(**inputs, labels=labels)
loss = outputs.loss
logits = outputs.logits

# Convert logits to class probabilities.
probs = torch.softmax(logits, dim=-1)
```
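For pure inference the labels are unnecessary; a short sketch that reuses the tokenizer and model loaded above and maps logits to a predicted class:
```python
# Inference only: switch off dropout and gradient tracking.
model.eval()
with torch.no_grad():
    inputs = tokenizer("Your example text here.", padding=True, truncation=True,
                       max_length=80, return_tensors="pt")
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()  # index of the highest-scoring class
print(predicted_class)
```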
## Limitations and Bias
- Trained on a specific dataset, so it may not generalize well to other kinds of text.
- Uses multilingual cased BERT, so it's not optimized for any specific language.
## Authors
- **Fassinou Bile**
- **billfass2010@gmail.com**
## Acknowledgments
Special thanks to Hugging Face for providing the Transformers library that made this project possible.
---