beyond committed 745db76 (1 parent: 145cd18): Update README.md

Files changed (1): README.md (+49 -27)

README.md CHANGED
@@ -42,13 +42,27 @@ inference:
 - Paper: [coming soon](to_be_added)
 - GitHub: [SEGA](https://github.com/beyondguo/SEGA).
 
- **SEGA** is able to write complete paragraphs given a sketch (or framework), which can be composed of:
- - keywords /key-phrases, like [NLP | AI | computer science]
- - spans, like [Conference on Empirical Methods | submission of research papers]
- - sentences, like [I really like machine learning | I work at Google since last year]
- - all mixup~
 
 ### How to use
 
 ```python
 from transformers import pipeline
 # 1. load the model with the huggingface `pipeline`
@@ -64,34 +78,42 @@ Output:
 'The Conference on Empirical Methods welcomes the submission of research papers. Abstracts should be in the form of a paper or presentation. Please submit abstracts to the following email address: eemml.stanford.edu. The conference will be held at Stanford University on April 1618, 2019. The theme of the conference is Deep Learning.'
 ```
 
- ## Model variations
-
-
- | Model | #params | Language |
- |------------------------|--------------------------------|-------|
- | [`sega-large`](https://huggingface.co/beyond/sega-large) | xM | English |
- | [`sega-base`(coming soon)]() | xM | English |
- | [`sega-large-chinese`(coming soon)]() | xM | Chinese |
- | [`sega-base-chinese`(New!)](https://huggingface.co/beyond/sega-base-chinese) | xM | Chinese |
 
- ## Data Augmentation for Text Classification Tasks:
 - Setting: Low-resource setting, where only n={50,100,200,500,1000} labeled samples are available for training. The results below are averaged over all training sizes.
 - Datasets: [HuffPost](https://huggingface.co/datasets/khalidalt/HuffPost), [BBC](https://huggingface.co/datasets/SetFit/bbc-news), [SST2](https://huggingface.co/datasets/glue), [IMDB](https://huggingface.co/datasets/imdb), [Yahoo](https://huggingface.co/datasets/yahoo_answers_topics), [20NG](https://huggingface.co/datasets/newsgroup).
 - Base classifier: [DistilBERT](https://huggingface.co/distilbert-base-cased)
 
- | Method | HuffPost | BBC | SST2 | IMDB | Yahoo | 20NG | avg. |
- |---------|:------------------:|:------------------:|:----------------------:|:----------------------:|:----------:|:----------:|:----------:|
- | | ID / OOD (BBC) | ID / OOD (Huff) | ID / OOD (IMDB) | ID / OOD (SST2) | | | |
- | none | 79.17 / 62.32 | **96.16** / 62.00 | 76.67 / 73.16 | 77.87 / 74.43 | 45.77 | 46.67 | 69.42 |
- | EDA | 79.63 / 67.48 | 95.11 / 58.92 | 75.52 / 69.46 | 77.88 / 75.88 | 45.10 | 46.15 | 69.11 |
- | STA | 80.74 / 69.31 | 95.64 / 64.82 | 77.80 / 73.66 | 77.88 / 74.77 | 46.96 | 47.27 | 70.88 |
- | Back | 80.48 / 67.75 | 95.28 / 63.10 | 76.96 / 72.23 | 78.35 / 75.96 | 46.10 | 46.61 | 70.28 |
- | MLM | 80.04 / 66.80 | 96.07 / 65.39 | 76.61 / 73.11 | 75.73 / 73.70 | 45.35 | 46.53 | 69.93 |
- | C-MLM | 79.96 / 65.10 | 96.13 / **67.80** | 76.91 / 71.83 | 77.31 / 75.02 | 45.29 | 46.36 | 70.17 |
- | LAMBADA | 81.03 / 68.89 | 93.75 / 52.79 | 77.87 / 74.54 | 77.49 / 74.33 | 50.66 | 47.72 | 69.91 |
- | **SEGA (Ours)** | 81.43 / 74.87 | 95.61 / 67.79 | 77.87 / 72.94 | **79.51** / **76.75** | 49.43 | 50.47 | 72.67 |
- | **SEGA-f (Ours)** | **81.82** / **76.18** | 95.78 / 67.79 | **80.59** / **80.32** | 79.37 / 76.61 | **50.12** | **50.81** | **73.94** |
 
 - Paper: [coming soon](to_be_added)
 - GitHub: [SEGA](https://github.com/beyondguo/SEGA).
 
+ **SEGA** is able to write complete paragraphs given a *sketch*, which can be composed of:
+ - keywords / key-phrases, like "––NLP––AI––computer––science––"
+ - spans, like "Conference on Empirical Methods––submission of research papers––"
+ - sentences, like "I really like machine learning––I have worked at Google since last year––"
+ - or any mixture of the above~
+
+ **Model variations:**
+
+ | Model | #params | Language | Comment |
+ |------------------------|---------|----------|---------|
+ | [`sega-large`](https://huggingface.co/beyond/sega-large) | 406M | English | the version used in the paper |
+ | [`sega-large-k2t`](https://huggingface.co/beyond/sega-large-k2t) | 406M | English | keywords-to-text |
+ | [`sega-base`](https://huggingface.co/beyond/sega-base) | 139M | English | smaller version |
+ | [`sega-base-ps`](https://huggingface.co/beyond/sega-base) | 139M | English | pre-trained on both paragraphs and short sentences |
+ | [`sega-base-chinese`](https://huggingface.co/beyond/sega-base-chinese) | 116M | Chinese | pre-trained on 10 million clean Chinese paragraphs |
+
+ ---
 
 ### How to use
+ #### 1. If you want to generate sentences given a **sketch**
 ```python
 from transformers import pipeline
 # 1. load the model with the huggingface `pipeline`
@@ -64,34 +78,42 @@ Output:
 'The Conference on Empirical Methods welcomes the submission of research papers. Abstracts should be in the form of a paper or presentation. Please submit abstracts to the following email address: eemml.stanford.edu. The conference will be held at Stanford University on April 1618, 2019. The theme of the conference is Deep Learning.'
 ```
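To make the sketch format concrete, here is a minimal, illustrative helper for assembling a sketch string from keywords or spans before handing it to the pipeline. The `<mask>` separator and the `make_sketch` helper are assumptions for illustration only; check the model card and tokenizer for the exact sketch format SEGA expects.

```python
def make_sketch(parts, sep="<mask>"):
    """Join keywords/spans/sentences into a single sketch string.

    `sep` is a placeholder the model is assumed to fill in -- verify
    the real mask token against the model's tokenizer.
    """
    return f" {sep} ".join(parts)

sketch = make_sketch(["Conference on Empirical Methods",
                      "submission of research papers"])
print(sketch)  # Conference on Empirical Methods <mask> submission of research papers

# To actually generate (downloads the model):
# sega = pipeline("text2text-generation", model="beyond/sega-large")
# print(sega(sketch, max_length=100)[0]["generated_text"])
```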
 
+ #### 2. If you want to do **data augmentation** to generate new training samples
+ Please check our GitHub page: [github.com/beyondguo/SEGA](https://github.com/beyondguo/SEGA), where we provide ready-to-run data augmentation scripts for text classification/NER/MRC tasks.
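Conceptually, the augmentation loop in those scripts extracts keywords/spans from each labeled sample, turns them into a sketch, and lets SEGA write a new sample that keeps the same label. A minimal sketch of that loop, where the naive `extract_keywords` heuristic and the `<mask>` separator are illustrative assumptions (the real scripts live in the GitHub repo above):

```python
import re

STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "and", "in", "on", "for"}

def extract_keywords(text, k=4):
    # Naive stand-in for a real keyword extractor: keep the first k
    # non-stopword tokens, preserving their original order.
    words = [w for w in re.findall(r"[A-Za-z']+", text) if w.lower() not in STOPWORDS]
    return words[:k]

def to_sketch(text, sep="<mask>"):
    # Turn a training sample into a sketch the generator can expand.
    return f" {sep} ".join(extract_keywords(text))

sample = ("The submission of research papers to the conference "
          "is open to everyone in machine learning.")
print(to_sketch(sample))  # submission <mask> research <mask> papers <mask> conference

# The generated text would inherit the original sample's label:
# new_text = sega(to_sketch(sample), max_length=100)[0]["generated_text"]
# augmented.append((new_text, label_of_sample))
```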
 
 
 
 
 
 
 
 
+ ---
 
+ ## SEGA as a Strong Data Augmentation Tool
 - Setting: Low-resource setting, where only n={50,100,200,500,1000} labeled samples are available for training. The results below are averaged over all training sizes.
 - Datasets: [HuffPost](https://huggingface.co/datasets/khalidalt/HuffPost), [BBC](https://huggingface.co/datasets/SetFit/bbc-news), [SST2](https://huggingface.co/datasets/glue), [IMDB](https://huggingface.co/datasets/imdb), [Yahoo](https://huggingface.co/datasets/yahoo_answers_topics), [20NG](https://huggingface.co/datasets/newsgroup).
 - Base classifier: [DistilBERT](https://huggingface.co/distilbert-base-cased)
 
+ In-distribution (ID) evaluations:
+
+ | Method | Huff | BBC | Yahoo | 20NG | IMDB | SST2 | avg. |
+ |:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
+ | none | 79.17 | **96.16** | 45.77 | 46.67 | 77.87 | 76.67 | 70.39 |
+ | EDA | 79.20 | 95.11 | 45.10 | 46.15 | 77.88 | 75.52 | 69.83 |
+ | BackT | 80.48 | 95.28 | 46.10 | 46.61 | 78.35 | 76.96 | 70.63 |
+ | MLM | 80.04 | 96.07 | 45.35 | 46.53 | 75.73 | 76.61 | 70.06 |
+ | C-MLM | 80.60 | 96.13 | 45.40 | 46.36 | 77.31 | 76.91 | 70.45 |
+ | LAMBADA | 81.46 | 93.74 | 50.49 | 47.72 | 78.22 | 78.31 | 71.66 |
+ | STA | 80.74 | 95.64 | 46.96 | 47.27 | 77.88 | 77.80 | 71.05 |
+ | **SEGA** | 81.43 | 95.74 | 49.60 | 50.38 | **80.16** | 78.82 | 72.68 |
+ | **SEGA-f** | **81.82** | 95.99 | **50.42** | **50.81** | 79.40 | **80.57** | **73.17** |
+
+ Out-of-distribution (OOD) evaluations:
+
+ | Method | Huff->BBC | BBC->Huff | IMDB->SST2 | SST2->IMDB | avg. |
+ |------------|:----------:|:----------:|:----------:|:----------:|:----------:|
+ | none | 62.32 | 62.00 | 74.37 | 73.11 | 67.95 |
+ | EDA | 67.48 | 58.92 | 75.83 | 69.42 | 67.91 |
+ | BackT | 67.75 | 63.10 | 75.91 | 72.19 | 69.74 |
+ | MLM | 66.80 | 65.39 | 73.66 | 73.06 | 69.73 |
+ | C-MLM | 64.94 | **67.80** | 74.98 | 71.78 | 69.87 |
+ | LAMBADA | 68.57 | 52.79 | 75.24 | 76.04 | 68.16 |
+ | STA | 69.31 | 64.82 | 74.72 | 73.62 | 70.61 |
+ | **SEGA** | 74.87 | 66.85 | 76.02 | 74.76 | 73.13 |
+ | **SEGA-f** | **76.18** | 66.89 | **77.45** | **80.36** | **75.22** |