Instructions to use krishjothi/WG_Bert with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use krishjothi/WG_Bert with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("token-classification", model="krishjothi/WG_Bert")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained("krishjothi/WG_Bert")
    model = AutoModelForTokenClassification.from_pretrained("krishjothi/WG_Bert")
- Notebooks
- Google Colab
- Kaggle
Commit caaa61b
1 Parent(s): 73b1e0e
Update README.md
README.md CHANGED
@@ -1 +1,10 @@
+---
+language:
+- en
+metrics:
+- f1
+pipeline_tag: token-classification
+tags:
+- automotive
+---
WG-BERT (Warranty and Goodwill) is a pretrained, encoder-based model for analyzing automotive entities in automotive-related texts. It is trained by continually pretraining the BERT language model on a corpus of automotive (workshop feedback) texts using the masked language modeling (MLM) objective. WG-BERT is then fine-tuned for automotive entity recognition, a subtask of Named Entity Recognition (NER), to extract components and their complaints from automotive texts. The continual-pretraining dataset consists of ~4 million sentences. The fine-tuning dataset consists of ~5,500 sentences gold-annotated by automotive domain experts. BERT-base-uncased serves as the base model.
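To illustrate the MLM objective mentioned above, here is a minimal pure-Python sketch of BERT-style token masking. It assumes the standard BERT recipe (select about 15% of positions; of those, 80% become `[MASK]`, 10% become a random token, 10% stay unchanged). The README does not say which masking settings were used for WG-BERT, so `mask_prob`, the helper name `mask_tokens`, and the sample sentence are all illustrative assumptions, not the authors' training code.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (corrupted_tokens, labels) using BERT-style 80/10/10 masking.

    labels[i] is the original token at selected positions and None elsewhere,
    so only selected positions would contribute to the MLM loss.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for token in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token, no prediction target.
            corrupted.append(token)
            labels.append(None)
            continue
        labels.append(token)  # the model has to recover the original token
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(MASK_TOKEN)         # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            corrupted.append(token)              # 10%: leave unchanged
    return corrupted, labels


# Made-up example sentence; a real setup would work on tokenizer IDs instead.
sentence = "customer reports brake pedal feels soft".split()
vocab = sorted(set(sentence))
corrupted, labels = mask_tokens(sentence, vocab, rng=random.Random(0))
```

In practice this step is usually delegated to a library data collator rather than written by hand; the sketch only shows what the objective asks the model to predict.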
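The extraction of components and their complaints can be pictured as grouping per-token tags into spans. The sketch below assumes a BIO tagging scheme with the hypothetical labels `COMPONENT` and `COMPLAINT`; the README does not list the model's actual label set, so check the model's `id2label` configuration before relying on these names. The tokens and tags are made up for illustration.

```python
def group_entities(tokens, tags):
    """Merge BIO-tagged tokens into a list of (label, text) spans."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # "B-" starts a span; a stray "I-" is treated as a start as well.
            current_label, current_tokens = label, [token]
        else:
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical tagged sentence (labels are assumptions, not the model's real ones).
tokens = ["the", "brake", "pedal", "feels", "soft"]
tags = ["O", "B-COMPONENT", "I-COMPONENT", "B-COMPLAINT", "I-COMPLAINT"]
entities = group_entities(tokens, tags)
# entities == [("COMPONENT", "brake pedal"), ("COMPLAINT", "feels soft")]
```

The Transformers token-classification pipeline can perform similar grouping itself; the hand-written version just makes the span-building logic explicit.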