ShiWarai/CVC-Panda

@@ -32,35 +32,35 @@ model-index:
       name: Accuracy
 ---
-# SetFit with google/embeddinggemma-300M
-This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [google/embeddinggemma-300M](https://huggingface.co/google/embeddinggemma-300M) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
-The model has been trained using an efficient few-shot learning technique that involves:
-1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
-2. Training a classification head with features from the fine-tuned Sentence Transformer.
-## Model Details
-### Model Description
-- **Model Type:** SetFit
-- **Sentence Transformer body:** [google/embeddinggemma-300M](https://huggingface.co/google/embeddinggemma-300M)
-- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
-- **Maximum Sequence Length:** 2048 tokens
-- **Number of Classes:** 13 classes
-<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
-<!-- - **Language:** Unknown -->
-<!-- - **License:** Unknown -->
-### Model Sources
-- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
-- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
-- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
-### Model Labels
-| Label              | Examples                                                                                                                      |
 |:-------------------|:------------------------------------------------------------------------------------------------------------------------------|
 | unknown            | <ul><li>'чей робот'</li><li>'опять лежать'</li><li>'смотри панда'</li></ul>                                                   |
 | stand_at_attention | <ul><li>'пора выравняться'</li><li>'не хочешь равняться'</li><li>'выравнялся бы'</li></ul>                                    |
@@ -76,66 +76,64 @@ The model has been trained using an efficient few-shot learning technique that i
 | run                | <ul><li>'надо бежать'</li><li>'побеги'</li><li>'хотела бы чтобы панда бежала'</li></ul>                                       |
 | help               | <ul><li>'надо помочь'</li><li>'помог бы'</li><li>'команды'</li></ul>                                                          |
-## Evaluation
-### Metrics
-| Label   | Accuracy |
 |:--------|:---------|
-| **all** | 0.8903   |
-## Uses
-### Direct Use for Inference
-First install the SetFit library:
 ```bash
 pip install setfit
 ```
-Then you can load this model and run inference.
 ```python
 from setfit import SetFitModel
-# Download from the 🤗 Hub
 model = SetFitModel.from_pretrained("tmp4qvbozcq/panda_commands")
-# Run inference
 preds = model("беги бы")
 ```
 <!--
-### Downstream Use
-*List how someone could finetune this model on their own dataset.*
 -->
 <!--
-### Out-of-Scope Use
-*List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 <!--
-## Bias, Risks and Limitations
-*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
 -->
 <!--
-### Recommendations
-*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
 -->
-## Training Details
-### Training Set Metrics
-| Training set | Min | Median | Max |
 |:-------------|:----|:-------|:----|
-| Word count   | 1   | 2.3697 | 7   |
-| Label              | Training Sample Count |
 |:-------------------|:----------------------|
 | bind               | 44                    |
 | dismiss            | 128                   |
@@ -151,7 +149,7 @@ preds = model("беги бы")
 | unbind             | 30                    |
 | unknown            | 381                   |
-### Training Hyperparameters
 - batch_size: (128, 128)
 - num_epochs: (1, 1)
 - max_steps: -1
@@ -170,8 +168,8 @@ preds = model("беги бы")
 - eval_max_steps: -1
 - load_best_model_at_end: False
-### Training Results
-| Epoch  | Step | Training Loss | Validation Loss |
 |:------:|:----:|:-------------:|:---------------:|
 | 0.0025 | 1    | 0.2328        | -               |
 | 0.1253 | 50   | 0.0955        | -               |
@@ -182,7 +180,7 @@ preds = model("беги бы")
 | 0.7519 | 300  | 0.0031        | -               |
 | 0.8772 | 350  | 0.0022        | -               |
-### Framework Versions
 - Python: 3.11.14
 - SetFit: 1.1.3
 - Sentence Transformers: 5.2.2
@@ -191,7 +189,7 @@ preds = model("беги бы")
 - Datasets: 4.5.0
 - Tokenizers: 0.22.2
-## Citation
 ### BibTeX
 ```bibtex
@@ -208,19 +206,17 @@ preds = model("беги бы")
 ```
 <!--
-## Glossary
-*Clearly define terms in order to be accessible across audiences.*
 -->
 <!--
-## Model Card Authors
-*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
 -->
 <!--
-## Model Card Contact
-*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
--->

       name: Accuracy
 ---
+# SetFit based on google/embeddinggemma-300M
+This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [google/embeddinggemma-300M](https://huggingface.co/google/embeddinggemma-300M) as the Sentence Transformer embedding model and a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance for classification.
+The model was trained with an efficient few-shot learning technique that involves:
+1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+2. Training a classification head on features from the fine-tuned Sentence Transformer.
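Step 1 relies on turning a small labeled set into many text pairs. The following is a minimal sketch of that idea, assuming exhaustive pairing. The helper `make_contrastive_pairs` is hypothetical and is not SetFit's actual sampler, which applies its own sampling strategy. The phrases come from the label table in this card.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Sample phrases taken from the label table in this card.
examples = [
    ("надо бежать", "run"),
    ("побеги", "run"),
    ("надо помочь", "help"),
    ("помог бы", "help"),
]

pairs = make_contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

The number of pairs grows quadratically with the number of examples, which is why a few labeled phrases per class can still yield a sizeable contrastive training set.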
+## Model Details
+### Model Description
+- **Model Type:** SetFit
+- **Sentence Transformer body:** [google/embeddinggemma-300M](https://huggingface.co/google/embeddinggemma-300M)
+- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+- **Maximum Sequence Length:** 2048 tokens
+- **Number of Classes:** 13
+<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+- **Blog:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+### Model Labels
+| Label              | Examples                                                                                                                      |
 |:-------------------|:------------------------------------------------------------------------------------------------------------------------------|
 | unknown            | <ul><li>'чей робот'</li><li>'опять лежать'</li><li>'смотри панда'</li></ul>                                                   |
 | stand_at_attention | <ul><li>'пора выравняться'</li><li>'не хочешь равняться'</li><li>'выравнялся бы'</li></ul>                                    |
 | run                | <ul><li>'надо бежать'</li><li>'побеги'</li><li>'хотела бы чтобы панда бежала'</li></ul>                                       |
 | help               | <ul><li>'надо помочь'</li><li>'помог бы'</li><li>'команды'</li></ul>                                                          |
+## Evaluation
+### Metrics
+| Label   | Accuracy |
 |:--------|:---------|
+| **all** | 0.8903   |
+## Uses
+### Direct Use for Inference
+First install the SetFit library:
 ```bash
 pip install setfit
 ```
+Then load the model and run inference.
 ```python
 from setfit import SetFitModel
+# Download from the 🤗 Hub
 model = SetFitModel.from_pretrained("tmp4qvbozcq/panda_commands")
+# Run inference
 preds = model("беги бы")
 ```
 <!--
+### Downstream Use
+*Describe how this model can be fine-tuned on your own dataset.*
 -->
 <!--
+### Out-of-Scope Use
+*Describe foreseeable misuse and what users should not do with the model.*
 -->
 <!--
+## Bias, Risks and Limitations
+*What known or foreseeable issues does this model have? Known failure cases or weaknesses can be listed here.*
 -->
 <!--
+### Recommendations
+*What recommendations follow from the foreseeable issues? For example, filtering explicit content.*
 -->
+## Training Details
+### Training Set Metrics
+| Training set | Min | Median | Max |
 |:-------------|:----|:-------|:----|
+| Word count   | 1   | 2.3697 | 7   |
+| Label              | Training Sample Count |
 |:-------------------|:----------------------|
 | bind               | 44                    |
 | dismiss            | 128                   |
 | unbind             | 30                    |
 | unknown            | 381                   |
+### Training Hyperparameters
 - batch_size: (128, 128)
 - num_epochs: (1, 1)
 - max_steps: -1
 - eval_max_steps: -1
 - load_best_model_at_end: False
+### Training Results
+| Epoch  | Step | Training Loss | Validation Loss |
 |:------:|:----:|:-------------:|:---------------:|
 | 0.0025 | 1    | 0.2328        | -               |
 | 0.1253 | 50   | 0.0955        | -               |
 | 0.7519 | 300  | 0.0031        | -               |
 | 0.8772 | 350  | 0.0022        | -               |
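The Epoch column is the fraction of one epoch completed at each logged step, so the visible rows are enough to estimate the total number of optimizer steps. The sketch below is an inference from the logged values, not a figure stated in the card. It searches for step counts whose rounded ratios match every row. The helper name `infer_total_steps` and the search range are assumptions made for illustration.

```python
# (epoch, step) pairs copied from the rows of the Training Results table above.
logged = [(0.0025, 1), (0.1253, 50), (0.7519, 300), (0.8772, 350)]


def infer_total_steps(rows, search=range(1, 2001)):
    """Return every step count N such that round(step / N, 4) matches each logged epoch."""
    return [
        n
        for n in search
        if all(round(step / n, 4) == epoch for epoch, step in rows)
    ]


candidates = infer_total_steps(logged)
print(candidates)

# Assumption: the first element of batch_size (128) applies to this phase,
# so steps * 128 is an upper bound on the number of pairs seen in one epoch.
max_pairs = candidates[0] * 128
print(max_pairs)
```

Under these assumptions the only consistent value is 399 steps per epoch, which would correspond to at most 51,072 training pairs at a batch size of 128.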
+### Framework Versions
 - Python: 3.11.14
 - SetFit: 1.1.3
 - Sentence Transformers: 5.2.2
 - Datasets: 4.5.0
 - Tokenizers: 0.22.2
+## Citation
 ### BibTeX
 ```bibtex
 ```
 <!--
+## Glossary
+*Give clear definitions of terms so the card is accessible to different audiences.*
 -->
 <!--
+## Model Card Authors
+*List the people who created the model card.*
 -->
 <!--
+## Contact
+*How to contact the model card authors.*
+-->