# PubGuard Review Classifier
A SetFit text classifier that distinguishes literature review articles from original research papers based on abstract text. Designed as a supplementary filter for PubGuard, adding finer-grained document type detection beyond standard metadata-based filtering.
## Purpose
Metadata-based review detection (e.g., OpenAlex type:review or PubMed PublicationType tags) misses many review-like publications that are tagged as regular articles. This classifier operates on abstract text to catch:
- Narrative and scoping reviews not tagged as reviews
- Meta-analyses and systematic reviews with ambiguous metadata
- Survey papers comparing existing methods
- Clinical guidelines and consensus statements
- Comprehensive overviews disguised as research articles
Intended as an additional granularity layer on top of PubGuard's existing document type classification, specifically targeting the review/non-review boundary where metadata filters underperform.
## Performance
Evaluated on a held-out test set of 9,000 abstracts (4,500 per class):
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Research Paper | 0.916 | 0.822 | 0.867 |
| Literature Review | 0.839 | 0.925 | 0.880 |
| Macro Average | 0.877 | 0.873 | 0.873 |
Accuracy: 87.3%
Confusion matrix (rows = true, columns = predicted):
| | Pred: Research | Pred: Review |
|---|---|---|
| True: Research | 3,700 | 800 |
| True: Review | 339 | 4,161 |
The model favors recall on the review class (92.5%) over precision (83.9%), which is appropriate for filtering applications where missing a review is more costly than occasionally flagging a borderline research paper.
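The reported per-class metrics can be recomputed directly from the confusion matrix above:

```python
# Recompute the reported metrics from the confusion matrix
# (rows = true class, columns = predicted class).
tp_res, fn_res = 3700, 800   # true research: predicted research / review
fp_res, tp_rev = 339, 4161   # true review:   predicted research / review

precision_res = tp_res / (tp_res + fp_res)   # 3700 / 4039
recall_res = tp_res / (tp_res + fn_res)      # 3700 / 4500
precision_rev = tp_rev / (tp_rev + fn_res)   # 4161 / 4961
recall_rev = tp_rev / (tp_rev + fp_res)      # 4161 / 4500
accuracy = (tp_res + tp_rev) / 9000

print(round(precision_res, 3), round(recall_res, 3))  # 0.916 0.822
print(round(precision_rev, 3), round(recall_rev, 3))  # 0.839 0.925
print(round(accuracy, 3))                             # 0.873
```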
## Training
- Method: SetFit (few-shot contrastive learning + logistic regression head)
- Base model: BAAI/bge-base-en-v1.5 (768-dim)
- Contrastive phase: 256 samples per class, 20 iterations, 2 epochs, batch size 64
- Head training: Logistic regression on full training set, 3 epochs
- Training time: ~9 minutes on NVIDIA RTX PRO 6000
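The second phase fits a plain scikit-learn logistic regression head on sentence embeddings. A minimal sketch of that head-training step, with synthetic 768-dim vectors standing in for the contrastively fine-tuned BGE embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for embedded abstracts (768-dim, like bge-base-en-v1.5).
# Two well-separated clusters play the role of the two classes.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.5, 1.0, (100, 768)),   # "review-like" embeddings
    rng.normal(-0.5, 1.0, (100, 768)),  # "research-like" embeddings
])
y = np.array([1] * 100 + [0] * 100)     # 1 = literature_review, 0 = research_paper

# The classification head: logistic regression over the embeddings.
head = LogisticRegression(max_iter=1000).fit(X, y)
print(head.predict(X[:2]))  # both samples are in the review cluster -> [1 1]
```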
## Dataset
60,000 abstracts balanced across two classes (30,000 each):
Literature Review (positive class):
- 15,000 abstracts from OpenAlex `type:review` articles
- 15,000 abstracts from publications with review-indicating titles (systematic review, meta-analysis, scoping review, narrative review, comprehensive review, critical review, state of the art, survey of methods, comparison of methods, overview of approaches), sourced from a 230M-publication PostgreSQL database
Research Paper (negative class):
- 20,000 abstracts from Immunology & Microbiology publications with >10 citations, excluding titles containing review/meta-analysis/survey keywords
- 10,000 abstracts from general scientific publications with >5 citations, same title exclusions
## Usage

```python
from sentence_transformers import SentenceTransformer
import joblib

# Load the embedding model and the logistic regression head
st = SentenceTransformer("jimnoneill/pubguard-review-classifier")
head = joblib.load("model_head.pkl")  # from the repo files

# Predict
abstracts = [
    "We systematically reviewed 47 studies on gut microbiome interventions...",
    "Here we report a novel bacteriophage that specifically lyses carbapenem-resistant Klebsiella...",
]
embeddings = st.encode(abstracts)
predictions = head.predict(embeddings)
# 0 = research_paper, 1 = literature_review
```
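Since the head is a plain scikit-learn `LogisticRegression`, you can push review recall even higher by thresholding `predict_proba` instead of using the default 0.5 cutoff. A minimal sketch with a small synthetic stand-in head (in practice, use the head loaded via joblib and real sentence embeddings):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in head and features, for illustration only.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1, 1, (50, 8)), rng.normal(-1, 1, (50, 8))])
y = np.array([1] * 50 + [0] * 50)
head = LogisticRegression(max_iter=1000).fit(X, y)

# Column 1 of predict_proba is P(label == 1), i.e. P(literature_review).
proba_review = head.predict_proba(X)[:, 1]

# Lowering the cutoff below 0.5 trades review precision for review recall.
preds = (proba_review >= 0.3).astype(int)
```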
Or with the SetFit API (if loading works with your version):

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("jimnoneill/pubguard-review-classifier")
predictions = model.predict(abstracts)
```
## Labels

- 0 = research_paper: original research reporting new findings, methods, or data
- 1 = literature_review: reviews, meta-analyses, surveys, guidelines, or comprehensive overviews of existing work
## Limitations
- Trained on English-language abstracts only
- Performs best on biomedical and life sciences text; may underperform on humanities or social sciences
- Short titles without abstract context have lower accuracy (use with full abstracts when possible)
- Some borderline cases (e.g., methods comparison papers that also present new data) may be classified either way
- Not designed to detect other non-research document types (posters, editorials, errata); use PubGuard's full pipeline for comprehensive document type filtering
## Citation
If you use this model, please cite:
```bibtex
@misc{pubguard-review-classifier,
  title={PubGuard Review Classifier: SetFit-based Literature Review Detection for Scientific Abstracts},
  author={O'Neill, James},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/jimnoneill/pubguard-review-classifier}
}
```