This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
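The model was trained with an efficient few-shot learning technique that involves two steps:
1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head on features from the fine-tuned Sentence Transformer.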
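To illustrate the contrastive step, the sketch below builds labeled text pairs from a small labeled set: pairs that share a label get target 1.0, and pairs with different labels get target 0.0. This is a simplified sketch, not the SetFit implementation. It enumerates every pair, whereas the library samples a subset, and the function name `make_contrastive_pairs` and the toy texts are hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label, else 0.0.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data (made up for this example, not taken from the training set).
toy_texts = ["tube", "pipe", "bolt", "screw"]
toy_labels = [0, 0, 1, 1]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)

for text_a, text_b, target in toy_pairs:
    print(f"{text_a!r} / {text_b!r} -> {target}")
```

Pairs like these are what the embedding model is fine-tuned on, so that texts with the same label end up close together before the classification head is fitted.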
The model predicts one of 119 labels, numbered 0 to 118. The example texts for each label are not included here.
| Label | Accuracy |
|---|---|
| all | 0.5493 |
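Accuracy is the fraction of evaluation texts whose predicted label matches the reference label. A minimal sketch of that computation follows; the toy label lists are hypothetical and are not the model's evaluation data.

```python
def accuracy(predicted, reference):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical labels: three of the four predictions match.
toy_predicted = [107, 37, 35, 50]
toy_reference = [107, 37, 19, 50]
toy_accuracy = accuracy(toy_predicted, toy_reference)
print(toy_accuracy)
```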
First install the SetFit library:

    pip install setfit

Then you can load this model and run inference.

    from setfit import SetFitModel

    # Download from the 🤗 Hub
    model = SetFitModel.from_pretrained("kaustubhgap/kaustubh_setfit_1iteration")
    # Run inference
    preds = model("tube")
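Calling the model on one string classifies a single input. For several inputs, `SetFitModel.predict` also accepts a list of strings. The sketch below pairs each text with its predicted label. The helper `label_texts` and the stub model are hypothetical, added so the example runs without downloading anything. It also assumes integer labels, as on this card.

```python
def label_texts(model, texts):
    """Pair each input text with the integer label predicted by `model`.

    `model` is any object with a `predict(list_of_str)` method, such as a
    loaded SetFitModel. Hypothetical helper, not part of the SetFit API.
    """
    texts = list(texts)
    preds = model.predict(texts)
    return [(text, int(pred)) for text, pred in zip(texts, preds)]


class StubModel:
    """Stand-in for a real model: returns each text's position as its label."""

    def predict(self, texts):
        return list(range(len(texts)))


stub_result = label_texts(StubModel(), ["tube", "pipe"])
print(stub_result)

# With the real model (requires `pip install setfit` and network access):
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("kaustubhgap/kaustubh_setfit_1iteration")
#   print(label_texts(model, ["tube", "pipe"]))
```

Training texts are very short (one to six words), so inputs of similar length are likely to be the best match for this model.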
| Training set | Min | Median | Max |
|---|---|---|---|
| Word count | 1 | 1.7047 | 6 |
| Label | Training Sample Count |
|---|---|
| 0 | 2 |
| 1 | 5 |
| 2 | 12 |
| 3 | 2 |
| 4 | 6 |
| 5 | 3 |
| 6 | 2 |
| 7 | 12 |
| 8 | 16 |
| 9 | 2 |
| 10 | 2 |
| 11 | 11 |
| 12 | 4 |
| 13 | 2 |
| 14 | 2 |
| 15 | 2 |
| 16 | 2 |
| 17 | 6 |
| 18 | 9 |
| 19 | 63 |
| 20 | 8 |
| 21 | 31 |
| 22 | 6 |
| 23 | 2 |
| 24 | 13 |
| 25 | 5 |
| 26 | 2 |
| 27 | 2 |
| 28 | 3 |
| 29 | 2 |
| 30 | 13 |
| 31 | 3 |
| 32 | 7 |
| 33 | 22 |
| 34 | 12 |
| 35 | 102 |
| 36 | 2 |
| 37 | 119 |
| 38 | 34 |
| 39 | 32 |
| 40 | 6 |
| 41 | 2 |
| 42 | 13 |
| 43 | 17 |
| 44 | 5 |
| 45 | 10 |
| 46 | 6 |
| 47 | 2 |
| 48 | 10 |
| 49 | 2 |
| 50 | 91 |
| 51 | 13 |
| 52 | 2 |
| 53 | 2 |
| 54 | 2 |
| 55 | 12 |
| 56 | 4 |
| 57 | 7 |
| 58 | 17 |
| 59 | 2 |
| 60 | 2 |
| 61 | 7 |
| 62 | 9 |
| 63 | 3 |
| 64 | 14 |
| 65 | 53 |
| 66 | 3 |
| 67 | 6 |
| 68 | 41 |
| 69 | 41 |
| 70 | 33 |
| 71 | 5 |
| 72 | 5 |
| 73 | 4 |
| 74 | 7 |
| 75 | 49 |
| 76 | 2 |
| 77 | 23 |
| 78 | 11 |
| 79 | 12 |
| 80 | 2 |
| 81 | 5 |
| 82 | 33 |
| 83 | 33 |
| 84 | 2 |
| 85 | 2 |
| 86 | 17 |
| 87 | 2 |
| 88 | 2 |
| 89 | 10 |
| 90 | 29 |
| 91 | 2 |
| 92 | 8 |
| 93 | 21 |
| 94 | 2 |
| 95 | 3 |
| 96 | 5 |
| 97 | 10 |
| 98 | 5 |
| 99 | 6 |
| 100 | 6 |
| 101 | 12 |
| 102 | 13 |
| 103 | 2 |
| 104 | 10 |
| 105 | 28 |
| 106 | 2 |
| 107 | 321 |
| 108 | 2 |
| 109 | 10 |
| 110 | 2 |
| 111 | 2 |
| 112 | 2 |
| 113 | 15 |
| 114 | 4 |
| 115 | 2 |
| 116 | 5 |
| 117 | 2 |
| 118 | 2 |
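The counts above are heavily imbalanced, which is useful context when reading the accuracy figure. The sketch below summarizes the table: total samples, the largest class and its share, and how many labels sit at the minimum count. The list is copied from the table; the variable names are illustrative.

```python
# Training sample counts for labels 0..118, copied from the table above.
TRAIN_COUNTS = [
    2, 5, 12, 2, 6, 3, 2, 12, 16, 2,          # labels 0-9
    2, 11, 4, 2, 2, 2, 2, 6, 9, 63,           # labels 10-19
    8, 31, 6, 2, 13, 5, 2, 2, 3, 2,           # labels 20-29
    13, 3, 7, 22, 12, 102, 2, 119, 34, 32,    # labels 30-39
    6, 2, 13, 17, 5, 10, 6, 2, 10, 2,         # labels 40-49
    91, 13, 2, 2, 2, 12, 4, 7, 17, 2,         # labels 50-59
    2, 7, 9, 3, 14, 53, 3, 6, 41, 41,         # labels 60-69
    33, 5, 5, 4, 7, 49, 2, 23, 11, 12,        # labels 70-79
    2, 5, 33, 33, 2, 2, 17, 2, 2, 10,         # labels 80-89
    29, 2, 8, 21, 2, 3, 5, 10, 5, 6,          # labels 90-99
    6, 12, 13, 2, 10, 28, 2, 321, 2, 10,      # labels 100-109
    2, 2, 2, 15, 4, 2, 5, 2, 2,               # labels 110-118
]

total = sum(TRAIN_COUNTS)
largest_label = max(range(len(TRAIN_COUNTS)), key=TRAIN_COUNTS.__getitem__)
largest_share = TRAIN_COUNTS[largest_label] / total
smallest = min(TRAIN_COUNTS)
labels_at_minimum = sum(1 for count in TRAIN_COUNTS if count == smallest)

print(f"{len(TRAIN_COUNTS)} labels, {total} training samples")
print(f"largest class: label {largest_label} ({largest_share:.1%} of samples)")
print(f"{labels_at_minimum} labels have only {smallest} samples")
```

Label 107 alone accounts for roughly 18% of the training samples, while 39 labels have just two samples each. If the evaluation set follows a similar distribution (an assumption; the card does not say), always predicting label 107 would score about 18% accuracy.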
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.0004 | 1 | 0.2895 | - |
| 0.0225 | 50 | 0.2059 | - |
| 0.0449 | 100 | 0.1794 | - |
| 0.0674 | 150 | 0.1994 | - |
| 0.0898 | 200 | 0.2708 | - |
| 0.1123 | 250 | 0.1355 | - |
| 0.1347 | 300 | 0.0695 | - |
| 0.1572 | 350 | 0.117 | - |
| 0.1796 | 400 | 0.0601 | - |
| 0.2021 | 450 | 0.0873 | - |
| 0.2245 | 500 | 0.07 | - |
| 0.2470 | 550 | 0.0805 | - |
| 0.2694 | 600 | 0.0204 | - |
| 0.2919 | 650 | 0.1059 | - |
| 0.3143 | 700 | 0.1178 | - |
| 0.3368 | 750 | 0.1804 | - |
| 0.3592 | 800 | 0.0979 | - |
| 0.3817 | 850 | 0.1597 | - |
| 0.4041 | 900 | 0.1215 | - |
| 0.4266 | 950 | 0.0188 | - |
| 0.4490 | 1000 | 0.0738 | - |
| 0.4715 | 1050 | 0.0635 | - |
| 0.4939 | 1100 | 0.1439 | - |
| 0.5164 | 1150 | 0.0684 | - |
| 0.5388 | 1200 | 0.0732 | - |
| 0.5613 | 1250 | 0.0401 | - |
| 0.5837 | 1300 | 0.1223 | - |
| 0.6062 | 1350 | 0.1044 | - |
| 0.6286 | 1400 | 0.0717 | - |
| 0.6511 | 1450 | 0.0413 | - |
| 0.6736 | 1500 | 0.0544 | - |
| 0.6960 | 1550 | 0.1419 | - |
| 0.7185 | 1600 | 0.0284 | - |
| 0.7409 | 1650 | 0.0484 | - |
| 0.7634 | 1700 | 0.0049 | - |
| 0.7858 | 1750 | 0.0229 | - |
| 0.8083 | 1800 | 0.0739 | - |
| 0.8307 | 1850 | 0.0371 | - |
| 0.8532 | 1900 | 0.0213 | - |
| 0.8756 | 1950 | 0.0753 | - |
| 0.8981 | 2000 | 0.0359 | - |
| 0.9205 | 2050 | 0.0232 | - |
| 0.9430 | 2100 | 0.0507 | - |
| 0.9654 | 2150 | 0.0258 | - |
| 0.9879 | 2200 | 0.0606 | - |
| 1.0 | 2227 | - | 0.2105 |
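The Epoch column is the logged step divided by the total number of steps in the single training epoch (2227, the last row). A short check of that relationship against a few rows of the table; the function name `epoch_fraction` is illustrative.

```python
TOTAL_STEPS = 2227  # final step in the table, logged at epoch 1.0


def epoch_fraction(step, total_steps=TOTAL_STEPS):
    """Fraction of the epoch completed at `step`, rounded as in the table."""
    return round(step / total_steps, 4)


# (step, epoch) pairs copied from the table above.
logged_rows = [(1, 0.0004), (50, 0.0225), (1500, 0.6736), (2200, 0.9879)]

for step, epoch in logged_rows:
    print(step, epoch_fraction(step), epoch)
```

The training loss is logged every 50 steps on individual batches, which is why it fluctuates from row to row. The validation loss is reported once, at the end of the epoch.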
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}