Update README.md
README.md
CHANGED
@@ -27,10 +27,13 @@ model-index:
 ---


-# Information Content Classification using SetFit with Base sentence-transformers/paraphrase-mpnet-base-v2
+# Onyx Information Content Classification using SetFit with Base sentence-transformers/paraphrase-mpnet-base-v2

-
-contains information that could be useful by itself to answer a RAG-type question.
+The model is for use by the [Onyx Enterprise Search](https://github.com/onyx-dot-app/onyx) system to identify whether a short
+text segment contains information that could be useful by itself to answer a RAG-type question.
+
+It is based on the [SetFit](https://github.com/huggingface/setfit) approach, using [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model.
+A trained [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

 The model has been trained using an efficient few-shot learning technique that involves:

@@ -54,18 +57,14 @@ The model has been trained using an efficient few-shot learning technique that i
 - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
 - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

-### Model Labels
-| Label | Examples |
-|:------|:----------------------------------------------------------------------------------------------|
-| 1 | <ul><li>'Paris is in France'</li><li>'Time != Money'</li><li>'TBA - to be announced'</li></ul> |
-| 0 | <ul><li>'Food delivery'</li><li>'She was not aware of the birds'</li><li>'The Eiffel Tower'</li></ul> |
-

 ## Uses

-###
+### Use for Inference
+
+The model is for use by the Onyx Enterprise Search system.

-
+To test it locally, first install the SetFit library:

 ```bash
 pip install setfit
@@ -77,9 +76,11 @@ Then you can load this model and run inference.
 from setfit import SetFitModel

 # Download from the 🤗 Hub
-model = SetFitModel.from_pretrained(
+model = SetFitModel.from_pretrained("onyx-dot-app/information-content-model")
 # Run inference
 preds = model("Paris is in France")
+or:
+pred_probability = model.predict_proba("Paris is in France")
 ```

 ### Framework Versions
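The updated README's inference snippet yields either a predicted label or class probabilities. As a minimal sketch of how a caller might use those probabilities to keep only informative text segments, the helper below thresholds the score for label 1. Everything here is illustrative: `filter_informative`, `stub_predict_proba`, the 0.5 threshold, and the hard-coded scores are hypothetical and are not part of SetFit or Onyx. It also assumes that label 1 marks an informative segment (as in the label table this change removes) and that the probabilities are indexable by label.

```python
from typing import Callable, Dict, List, Sequence


def filter_informative(
    segments: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep segments whose probability for label 1 is at least `threshold`.

    Assumption: index 1 of the returned probabilities is the
    "contains useful information" class.
    """
    kept = []
    for segment in segments:
        probabilities = predict_proba(segment)
        if float(probabilities[1]) >= threshold:
            kept.append(segment)
    return kept


# Made-up scores standing in for a real model; these are NOT model outputs.
_FAKE_SCORES: Dict[str, List[float]] = {
    "Paris is in France": [0.1, 0.9],
    "Food delivery": [0.8, 0.2],
    "The Eiffel Tower": [0.7, 0.3],
}


def stub_predict_proba(text: str) -> List[float]:
    """Hypothetical scorer so the sketch runs without downloading a model."""
    return _FAKE_SCORES.get(text, [0.5, 0.5])


if __name__ == "__main__":
    examples = ["Paris is in France", "Food delivery", "The Eiffel Tower"]
    print(filter_informative(examples, stub_predict_proba))
```

With the real model loaded as in the README, passing `model.predict_proba` in place of `stub_predict_proba` should work the same way, provided the single-string call returns one probability per label.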