# SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
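The second step can be illustrated in isolation: once the Sentence Transformer body is fine-tuned, the head is an ordinary scikit-learn LogisticRegression fit on sentence embeddings. Below is a minimal sketch of that idea; the random vectors are stand-ins for real encoder output, not part of this model's actual training pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Stand-in features: in the real pipeline these would be 768-dimensional
# embeddings produced by the fine-tuned BAAI/bge-base-en-v1.5 body.
X_train = rng.normal(size=(20, 768))
y_train = np.array([0, 1] * 10)  # two classes, matching this model's label set

# SetFit's second stage: fit a LogisticRegression head on the embeddings.
head = LogisticRegression(max_iter=1000)
head.fit(X_train, y_train)

preds = head.predict(X_train[:4])
```

Because the contrastive fine-tuning pulls same-label sentences together in embedding space, even this simple linear head separates the classes well with few labeled examples.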
## Model Details
### Model Description
### Model Sources
### Model Labels
| Label | Examples |
|:------|:---------|
| 0 | <ul><li>"The answer correctly identifies that the primary reasons behind the Nuggets' offensive outburst in January are the team's increased comfort and effectiveness, as well as Coach Brian Shaw's strategy of encouraging the team to push the ball after makes and misses and to take the first available shot in the rhythm of the offense. However, the mention of a new training technique involving virtual reality is not supported by the provided document.\n\nReasoning:\n1. Context Grounding: The majority of the answer is well-supported by the document, but the part about virtual reality training is not mentioned in the provided text.\n2. Relevance: The answer is mostly relevant to the question, but the inclusion of virtual reality training deviates from the information in the document.\n3. Conciseness: The answer could be clearer and more concise by excluding the irrelevant information about virtual reality training.\n\nThe final evaluation:"</li><li>"Reasoning:\n\n1. Context Grounding: The answer is generally well-grounded in the document but contains some inaccuracies. The document discusses that film over-exposes better, not under-exposes better. The answer also mentions 5MP sensors, while the document refers to 10MP. \n\n2. Relevance: The answer is relevant to the question, addressing the differences between film and digital photography based on the author's experience.\n\n3. Conciseness: The answer is concise and to the point, which is good. However, inaccuracies in the details affect its quality.\n\nFinal Result:"</li><li>'Reasoning:\nThe provided answer does not address the question asked. The question seeks information about the main conflict in the third book of the Arcana Chronicles by Kresley Cole, while the answer given only discusses the results of a mixed martial arts event and the performance of fighters in various bouts. This answer is neither relevant to the question nor grounded in the correct context.\n\nFinal evaluation:'</li></ul> |
| 1 | <ul><li>'The answer provided effectively outlines best practices for web designers, detailing practices such as understanding client needs, signing detailed contracts, and maintaining clear communication. These are directly rooted in the provided document and address the specified question accurately.\n\n1. Context Grounding: \n - The answer is well-supported by the document, specifically referencing getting to know the client, maintaining a contract, and explaining the importance of communication as outlined in the text.\n\n2. **Relevance:**\n - The answer is highly relevant to the question, focusing precisely on best practices for web designers to avoid unnecessary revisions and conflicts.\n\n3. **Conciseness:**\n - The answer is clear, concise, and avoids extraneous details.\n\nFinal evaluation:'</li><li>"Reasoning:\n\n1. Context Grounding: The answer is well-supported by the provided document. The author does emphasize that using the author's own experiences, especially those involving pain and emotion, makes the story genuine and relatable, thereby creating a connection between the reader and the characters.\n \n2. Relevance: The answer directly addresses the specific question asked about the key to creating a connection between the reader and the characters in a story.\n\n3. Conciseness: The answer is clear and to the point, without including unnecessary information.\n\nFinal result:"</li><li>'Reasoning:\n1. Context Grounding: The answer is directly supported by the provided document, which mentions that Mauro Rubin is the CEO of JoinPad and that he spoke during the event at Talent Garden Calabiana, Milan.\n2. Relevance: The answer is relevant to the question asked, directly addressing the identity of the CEO during the specified event.\n3. Conciseness: The answer is clear, to the point, and does not include unnecessary information.\n\nFinal result:'</li></ul> |
## Uses
### Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download the model from the Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_gpt-4o_cot-few_shot-instructions_remove_final_evaluation_e1_one_big_model")
# Run inference
preds = model("The answer accurately states that \"Allan Cox's First Class Delivery was launched on a H128-10W for his Level 1 certification flight,\" which is directly supported by the provided document.\nFinal evaluation:")
```
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 11 | 75.9730 | 196 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 199 |
| 1 | 209 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
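The hyperparameters above correspond to fields of `setfit.TrainingArguments`. As a sketch (assuming setfit >= 1.0 is installed), they would be passed to a `Trainer` roughly as follows; tuple values give the (embedding fine-tuning, classifier head) phases, and `loss`/`distance_metric` are the library defaults (CosineSimilarityLoss with cosine distance):

```python
from setfit import TrainingArguments

# Sketch only: mirrors the hyperparameter list above.
args = TrainingArguments(
    batch_size=(16, 16),            # (body phase, head phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```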
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0010 | 1 | 0.2081 | - |
| 0.0490 | 50 | 0.2494 | - |
| 0.0980 | 100 | 0.2031 | - |
| 0.1471 | 150 | 0.1212 | - |
| 0.1961 | 200 | 0.0675 | - |
| 0.2451 | 250 | 0.0656 | - |
| 0.2941 | 300 | 0.0487 | - |
| 0.3431 | 350 | 0.0341 | - |
| 0.3922 | 400 | 0.0232 | - |
| 0.4412 | 450 | 0.0232 | - |
| 0.4902 | 500 | 0.0147 | - |
| 0.5392 | 550 | 0.0078 | - |
| 0.5882 | 600 | 0.0075 | - |
| 0.6373 | 650 | 0.0058 | - |
| 0.6863 | 700 | 0.0048 | - |
| 0.7353 | 750 | 0.0061 | - |
| 0.7843 | 800 | 0.0047 | - |
| 0.8333 | 850 | 0.0044 | - |
| 0.8824 | 900 | 0.0047 | - |
| 0.9314 | 950 | 0.0042 | - |
| 0.9804 | 1000 | 0.0044 | - |
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```