Model V0 Datacard Update #2
Opened by Tihsrah-CD

README.md CHANGED
@@ -9,13 +9,13 @@ pipeline_tag: text-classification

 # Topic Classifier

-This repository contains the
+This repository contains the Pebblo Classifier model developed by DAXA.AI. The Pebblo Classifier is a machine learning model designed to categorize text documents across various domains, such as corporate documents, financial texts, harmful content, and medical documents.

 ## Model Details

 ### Model Description

-The
+The Pebblo Classifier is a BERT-based model, fine-tuned from the distilbert-base-uncased model. It is intended for categorizing text into specific topics, including "CORPORATE_DOCUMENTS," "FINANCIAL," "HARMFUL," and "MEDICAL." This model streamlines text classification tasks across multiple sectors, making it suitable for various business use cases.

 - **Developed by:** DAXA.AI
 - **Funded by:** Open Source
@@ -26,14 +26,14 @@ The Topic Classifier is a BERT-based model, fine-tuned from the `distilbert-base

 ### Model Sources

-- **Repository:** [https://huggingface.co/daxa-ai/
+- **Repository:** [https://huggingface.co/daxa-ai/pebblo-classifier-v2](https://huggingface.co/daxa-ai/pebblo-classifier-v2)
 - **Demo:** [https://huggingface.co/spaces/daxa-ai/Topic-Classifier-2](https://huggingface.co/spaces/daxa-ai/Topic-Classifier-2)

 ## Usage

 ### How to Get Started with the Model

-To use the
+To use the Pebblo Classifier in your Python project, you can follow the steps below:

 ```python
 # Import necessary libraries
@@ -43,8 +43,8 @@ import joblib
 from huggingface_hub import hf_hub_url, cached_download

 # Load the tokenizer and model
-tokenizer = AutoTokenizer.from_pretrained("daxa-ai/
-model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/
+tokenizer = AutoTokenizer.from_pretrained("daxa-ai/pebblo-classifier-v2")
+model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/pebblo-classifier-v2")

 # Example text
 text = "Please enter your text here."
@@ -58,7 +58,7 @@ probabilities = torch.nn.functional.softmax(output.logits, dim=-1)
 predicted_label = torch.argmax(probabilities, dim=-1)

 # URL of your Hugging Face model repository
-REPO_NAME = "daxa-ai/
+REPO_NAME = "daxa-ai/pebblo-classifier-v2"

 # Path to the label encoder file in the repository
 LABEL_ENCODER_FILE = "label_encoder.joblib"
@@ -161,6 +161,6 @@ def predict_fn(data, model_and_tokenizer):

 ## Conclusion

-The
+The Pebblo Classifier achieves high accuracy, precision, recall, and F1-score, making it a reliable model for categorizing text across the domains of corporate documents, financial content, harmful content, and medical texts. The model is optimized for immediate deployment and works efficiently in real-world applications.

 For more information or to try the model yourself, check out the public space [here](https://huggingface.co/spaces/daxa-ai/Topic-Classifier-2).
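
The README snippet touched by this change turns model logits into a topic with a softmax followed by an argmax, then maps the winning index to a name through `label_encoder.joblib`. The sketch below reproduces that decoding step in plain Python so it can be read without downloading the model. It is an illustration under assumptions, not the repository's code: the `LABELS` order, the `decode` helper, and the example logits are all hypothetical, and the real index-to-label mapping is whatever the label encoder file in the repository defines.

```python
import math

# Assumed label order, for illustration only. The actual mapping is stored
# in label_encoder.joblib in the model repository and may differ.
LABELS = ["CORPORATE_DOCUMENTS", "FINANCIAL", "HARMFUL", "MEDICAL"]


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax over classes
    return labels[best], probs[best]


# Hypothetical logits for a single input text; not real model output.
example_logits = [0.2, 3.1, -1.0, 0.5]
label, confidence = decode(example_logits)
print(label, round(confidence, 3))
```

With the real model, `example_logits` would come from `output.logits` for one input, and `labels` would be taken from the downloaded label encoder rather than hard-coded.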