diff --git "a/README.md" "b/README.md" new file mode 100644--- /dev/null +++ "b/README.md" @@ -0,0 +1,341 @@ +--- +base_model: sentence-transformers/all-roberta-large-v1 +library_name: setfit +metrics: +- accuracy +pipeline_tag: text-classification +tags: +- setfit +- sentence-transformers +- text-classification +- generated_from_setfit_trainer +widget: +- text: "USER: Please rewrite this poem in paragraph form (One paragraph is fine).\ + \ Keep the tone the same; since it is no longer a poem, make sure it is not rhyming\ + \ or in the rhythm of a poem.\n\nAt the Zoo\nFirst I saw the white bear, then\ + \ I saw the black;\nThen I saw the camel with a hump upon his back;\nThen I saw\ + \ the grey wolf, with mutton in his maw;\nThen I saw the wombat waddle in the\ + \ straw;\nThen I saw the elephant a-waving of his trunk;\nThen I saw the monkeys—mercy,\ + \ how unpleasantly they smelt!\n\nhttps://poets.org/poem/zoo\n\nASSISTANT: Please\ + \ see the rewritten poem \"At the Zoo\" below, in a paragraph form.\n\nWhen we\ + \ went to the zoo, I saw a bunch of animals. I saw many animals with different\ + \ colors and attributes. \nIn the beginning, I saw both a white and a black bear.\ + \ Afterward, there was a camel that had a hump. The next group of animals was\ + \ a grey wolf, a wombat hanging out in the straw, and an elephant waving his trunk\ + \ around. The last animals that I saw were the stinky smelly monkeys!" +- text: "USER: Please write a short poem about ducks.\n\nASSISTANT: Ducks one of the\ + \ luckiest birds.\nThey come in all shapes and love all grapes \nThey are better\ + \ than pups when it comes to deluxe fluff\nDucks amazing at flying and diving\n\ + Ducks on of the luckiest birds." +- text: "USER: Tell me how to get more Instagram followers please, as I want to become\ + \ Instagram famous! Ten tips would be ideal. I deleted everything from my feed,\ + \ so you can write it as if I'm starting from scratch! 
\n\nASSISTANT: Good luck!\ + \ Here are ten ways to get more Instagram followers:\n\n1. Create an Aesthetic\ + \ Instagram Feed: Instagram is all about aesthetics, so having a strong aesthetic\ + \ on your Instagram feed is key to attracting more followers. The first thing\ + \ people see when they look at your page is your collection of photos, so having\ + \ pictures that blend well together will go down nicely with potential new followers.\n\ + \n2. Engage With Followers: Don't forget to engage with your current followers\ + \ to build your reputation and make sure people will recommend you. You can do\ + \ this by liking and responding to comments, as well as following people back\ + \ who seem to have similar interests or content to you. \n\n3. Optimize Your Instagram\ + \ Bio: Make sure your Instagram bio makes it clear what your Instagram is about,\ + \ and focus on a niche if possible. For example, your Instagram might revolve\ + \ around your pets and your make-up, or it might be more about travel. You can\ + \ post a variety of things, but focusing primarily on one can be extremely helpful\ + \ in building your follower base and drawing in people with similar interests.\n\ + \n4. Find the Best Hashtags for Your Followers: Hashtags are key to bringing an\ + \ audience who might never have come across your feed otherwise. Look up what\ + \ hashtags are trending that relate to your niche and make sure you use them in\ + \ your post. Don't be tempted to use irrelevant hashtags just because they're\ + \ trending though, or people may be turned away from your page.\n\n5. Create a\ + \ Consistent Instagram Posting Schedule: No one wants to be spammed with content\ + \ all day every day, but they also don't want to be left without it. 
Creating\ + \ a consistent posting schedule, so people know when to expect new content and\ + \ how much they're going to get it when they follow you, can be extremely helpful\ + \ in attracting the right audience who will enjoy what you post.\n\n6. Make Sure\ + \ Your Instagram Content is Accessible: Accessibility is key for online spaces.\ + \ For example, you could include descriptions of the picture in each caption so\ + \ that blind audiences can also enjoy your content. Make sure to avoid any videos\ + \ with flashing lights, and avoid posting triggering content without a warning.\ + \ Caring about your followers will go a long way in building brand loyalty in\ + \ an audience.\n\n7. Work With Other Brands, Creators, and Influencers: You can\ + \ also work with other Instagram brands and creators, as this will help draw people\ + \ into your page from theirs and you can offer them the same thing. Brands tend\ + \ to reach out to larger pages to offer sponsorships, but there's no reason you\ + \ can't reach out to them first and see what you can build together.\n\n8. Start\ + \ Creating Instagram Reels: Although Instagram has always been known as a picture\ + \ platform, it's starting to become popular in the video space. Making reels with\ + \ trending audio can help draw people in that might not have seen your page otherwise.\n\ + \n9. Create Memes and Other Shareable Content: Making content that's shareable\ + \ is another sure way to bring in audiences. For example, posting memes that people\ + \ might relate to means they might share it on their story, exposing your Instagram\ + \ to their followers. Make sure the memes are authentic to your page and the message\ + \ you're trying to send.\n\n10. Run an Instagram Giveaway: You could also run\ + \ an Instagram giveaway, mandating that people must comment on your post and follow\ + \ your page to enter. 
Some giveaways also dictate that entrants must share the\ + \ post, exposing the giveaway—and therefore, the account—to even more people.\n\ + \nBuilding a following on Instagram is all about being authentic and true to yourself\ + \ while creating content that your followers will love." +- text: 'USER: Write a haiku about a kid who hates doing homework, but doesn''t have + a choice. Make it funny but keep it to the point. Keep it vivid even though it''s + a haiku. + + + ASSISTANT: My pen-covered hands + + slave away at my homework + + enslaved until done.' +- text: 'USER: I would like a quick description of the basic creation of a statistical + learning model. This should be simple enough that the average person could understand + it. The description should be a paragraph of 5 sentences. + + + ASSISTANT: When we look to create a model, we are attempting to predict a single + feature from all the other features of something. This starts by looking at a + set of instances that all have data points for the same features. A key assumption + is that these features all have some distribution even if we know nothing about + it. We then run them through a processing component in order to create that prediction + component, after assuming there is a completely correct version of this function. + If we do not have complete data or knowledge of the distributions, the prediction + function can rarely be completely perfect.' +inference: true +model-index: +- name: SetFit with sentence-transformers/all-roberta-large-v1 + results: + - task: + type: text-classification + name: Text Classification + dataset: + name: Unknown + type: unknown + split: test + metrics: + - type: accuracy + value: 0.9282511210762332 + name: Accuracy +--- + +# SetFit with sentence-transformers/all-roberta-large-v1 + +This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. 
This SetFit model uses [sentence-transformers/all-roberta-large-v1](https://huggingface.co/sentence-transformers/all-roberta-large-v1) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. + +The model has been trained using an efficient few-shot learning technique that involves: + +1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning. +2. Training a classification head with features from the fine-tuned Sentence Transformer. + +## Model Details + +### Model Description +- **Model Type:** SetFit +- **Sentence Transformer body:** [sentence-transformers/all-roberta-large-v1](https://huggingface.co/sentence-transformers/all-roberta-large-v1) +- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance +- **Maximum Sequence Length:** 256 tokens +- **Number of Classes:** 10 classes + + + + +### Model Sources + +- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit) +- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055) +- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit) + +### Model Labels +| Label | Examples | 
+|:-----------|:---------| +| Brainstorm | | +| Rewrite | | +| Coding | | +| Summarize | | +| Closed QA | | +| Open QA | | +| Math | | +| Generation | | +| Classify | | +| Extract | | + +## Evaluation + +### Metrics +| Label | Accuracy | 
+|:--------|:---------| +| **all** | 0.9283 | + +## Uses + +### Direct Use for Inference + +First install the SetFit library: + +```bash +pip install setfit +``` + +Then you can load this model and run inference. + +```python +from setfit import SetFitModel + +# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for this model's Hub ID) +model = SetFitModel.from_pretrained("setfit_model_id") +# Run inference (triple-quoted because the input text spans several lines) +preds = model("""USER: Write a haiku about a kid who hates doing homework, but doesn't have a choice. Make it funny but keep it to the point. Keep it vivid even though it's a haiku. + +ASSISTANT: My pen-covered hands +slave away at my homework +enslaved until done.""") +``` + + + + + + + + + +## Training Details + +### Training Set Metrics +| Training set | Min | Median | Max | +|:-------------|:----|:---------|:-----| +| Word count | 20 | 216.0099 | 2635 | + +| Label | Training Sample Count | +|:-----------|:----------------------| +| Coding | 50 | +| Rewrite | 50 | +| Open QA | 54 | +| Generation | 50 | +| Classify | 50 | +| Extract | 50 | +| Summarize | 50 | +| Closed QA | 51 | +| Math | 50 | +| Brainstorm | 50 | + +### Training Hyperparameters +- batch_size: (24, 24) +- num_epochs: (1, 1) +- max_steps: 2000 +- sampling_strategy: oversampling +- body_learning_rate: (2e-05, 1e-05) +- head_learning_rate: 0.01 +- loss: CosineSimilarityLoss +- distance_metric: cosine_distance +- margin: 0.25 +- end_to_end: False +- use_amp: False +- warmup_proportion: 0.1 +- seed: 42 +- eval_max_steps: -1 +- load_best_model_at_end: True + +### Training Results +| Epoch | Step | Training Loss | Validation Loss | +|:----------:|:--------:|:-------------:|:---------------:| +| 0.0001 | 1 | 0.3244 | - | +| 0.0052 | 50 | 0.129 | - | +| 0.0105 | 100 | 0.0488 | - | +| 0.0157 | 150 | 0.0162 | - | +| 0.0209 | 200 | 0.0013 | - | +| 0.0261 | 250 | 0.001 | - | +| 0.0314 | 300 | 0.0004 | - | +| 0.0366 | 350 | 0.0002 | - | +| 0.0418 | 400 | 0.0002 | - | +| 0.0471 | 450 | 0.0002 | - | +| 0.0523 | 500 | 0.0002 | - | +| 0.0575 | 550 | 0.0002 | - | 
+| 0.0627 | 600 | 0.0001 | - | +| 0.0680 | 650 | 0.0002 | - | +| 0.0732 | 700 | 0.0001 | - | +| 0.0784 | 750 | 0.0001 | - | +| 0.0837 | 800 | 0.0 | - | +| 0.0889 | 850 | 0.0001 | - | +| 0.0941 | 900 | 0.0001 | - | +| 0.0993 | 950 | 0.0001 | - | +| 0.1046 | 1000 | 0.0001 | - | +| 0.1098 | 1050 | 0.0 | - | +| 0.1150 | 1100 | 0.0001 | - | +| 0.1203 | 1150 | 0.0001 | - | +| 0.1255 | 1200 | 0.0 | - | +| 0.1307 | 1250 | 0.0001 | - | +| 0.1359 | 1300 | 0.0 | - | +| 0.1412 | 1350 | 0.0 | - | +| 0.1464 | 1400 | 0.0 | - | +| 0.1516 | 1450 | 0.0 | - | +| 0.1569 | 1500 | 0.0 | - | +| 0.1621 | 1550 | 0.0 | - | +| 0.1673 | 1600 | 0.0 | - | +| 0.1725 | 1650 | 0.0 | - | +| 0.1778 | 1700 | 0.0 | - | +| 0.1830 | 1750 | 0.0001 | - | +| 0.1882 | 1800 | 0.0 | - | +| 0.1935 | 1850 | 0.0 | - | +| 0.1987 | 1900 | 0.0 | - | +| 0.2039 | 1950 | 0.0 | - | +| **0.2091** | **2000** | **0.0** | **0.063** | + +* The bold row denotes the saved checkpoint. +### Framework Versions +- Python: 3.10.14 +- SetFit: 1.0.3 +- Sentence Transformers: 3.0.1 +- Transformers: 4.40.0 +- PyTorch: 2.4.1+cu121 +- Datasets: 2.21.0 +- Tokenizers: 0.19.1 + +## Citation + +### BibTeX +```bibtex +@article{https://doi.org/10.48550/arxiv.2209.11055, + doi = {10.48550/ARXIV.2209.11055}, + url = {https://arxiv.org/abs/2209.11055}, + author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren}, + keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences}, + title = {Efficient Few-Shot Learning Without Prompts}, + publisher = {arXiv}, + year = {2022}, + copyright = {Creative Commons Attribution 4.0 International} +} +``` + + + + + + \ No newline at end of file
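The widget examples and the inference snippet in this README all pass the classifier one string per conversation, formatted as `USER: <prompt>`, a blank line, then `ASSISTANT: <response>`. A minimal sketch of building such inputs follows. `format_turn` and `LABELS` are hypothetical names introduced only for illustration (the label names are copied from the Model Labels table), and the commented-out call assumes `SetFitModel` accepts a list of strings, as it does in SetFit 1.0.

```python
# Hypothetical helpers, not part of SetFit: they only reproduce the
# "USER: ... / ASSISTANT: ..." input format seen in this README's examples.

# Label names copied from the Model Labels table in this README.
LABELS = [
    "Brainstorm", "Rewrite", "Coding", "Summarize", "Closed QA",
    "Open QA", "Math", "Generation", "Classify", "Extract",
]


def format_turn(prompt: str, response: str) -> str:
    """Join a prompt and a response into one classifier input string."""
    return f"USER: {prompt}\n\nASSISTANT: {response}"


# Build a small batch of inputs from (prompt, response) pairs.
texts = [
    format_turn(
        "Please write a short poem about ducks.",
        "Ducks one of the luckiest birds.",
    ),
    format_turn(
        "Write a haiku about a kid who hates doing homework.",
        "My pen-covered hands\nslave away at my homework\nenslaved until done.",
    ),
]

# Assumed usage with SetFit installed; "setfit_model_id" is the README's placeholder.
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("setfit_model_id")
# preds = model(texts)  # one predicted label per input string
```

Keeping the formatting in one small function makes it less likely that inference inputs drift from the format the model saw during training.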