Update README.md
# About

This model is Lightblue's QLoRA finetune of OpenOrca's [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) model on Japanese fine-tuning datasets.
This model specialises in **Closed Question Answering** in Japanese: input a piece of reference text, ask a question, and the model answers based on the reference text.

We trained on equal samples of the following three datasets:
* [SNOW](https://huggingface.co/datasets/snow_simplified_japanese_corpus)
* [TyDiQA (Ja)](https://huggingface.co/datasets/khalidalt/tydiqa-goldp)
* [XLSum (Ja)](https://huggingface.co/datasets/csebuetnlp/xlsum)

which resulted in a dataset of 13,167 samples total.

These three datasets were chosen as they represent three distinct fine-tuning tasks (text simplification, question answering, and text summarization, respectively), which we hypothesize can help improve the language model's suitability for dealing with Japanese data.
These three datasets make up the model name: STX.
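To make the mixing concrete, here is a minimal sketch of how equal samples could be drawn with the `datasets` library. The config and split names and the seed are assumptions for illustration, not Lightblue's published recipe.

```python
# Illustrative sketch of the equal-sample mixing described above; the
# config/split names and the seed are assumptions, not the published recipe.
from datasets import concatenate_datasets, load_dataset

n_per_source = 13167 // 3  # 4,389 samples from each dataset

snow = load_dataset("snow_simplified_japanese_corpus", "snow_t15", split="train")
tydiqa = load_dataset("khalidalt/tydiqa-goldp", "japanese", split="train")
xlsum = load_dataset("csebuetnlp/xlsum", "japanese", split="train")

# Take the same number of shuffled samples from each source, then re-shuffle.
mixed = concatenate_datasets(
    [d.shuffle(seed=42).select(range(n_per_source)) for d in (snow, tydiqa, xlsum)]
).shuffle(seed=42)
print(mixed)
```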
With these datasets, we achieve the following scores on the JGLUE benchmark:

| Model Name             | Open-Orca/OpenOrcaxOpenChat-Preview2-13B | lightblue/openorca_stx |
|------------------------|------------------------------------------|------------------------|
| jsquad-1.1-0.3         | 0.692                                    | 0.836                  |
| jcommonsenseqa-1.1-0.3 | 0.831                                    | 0.782                  |
| jnli-1.1-0.3           | 0.504                                    | 0.48                   |
| marc_ja-1.1-0.3        | 0.936                                    | 0.959                  |
Our model achieves much better results on the question answering benchmark (JSQuAD) than the base checkpoint, without severe degradation of performance on the multiple-choice benchmarks (JCommonSenseQA, JNLI, MARC-Ja), purely through QLoRA training.
This shows the potential of applying minimal QLoRA fine-tuning with Japanese fine-tuning datasets to strong language models such as [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) to achieve better results at narrow NLP tasks.
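For readers curious what QLoRA training looks like in practice, the following is a hypothetical setup with `peft` and `bitsandbytes`. The rank, alpha, and target modules are illustrative assumptions, not the hyperparameters used for this model.

```python
# A hypothetical QLoRA setup, not Lightblue's actual training script.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

base_id = "Open-Orca/OpenOrcaxOpenChat-Preview2-13B"

# 4-bit NF4 quantization of the frozen base model is the core of QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Small trainable low-rank adapters; rank/alpha/targets here are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```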
# How to use
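The original usage snippet was cut off here, so below is a minimal inference sketch with Hugging Face `transformers`. The prompt layout and generation settings are assumptions for illustration; check the model card for the exact prompt format before relying on it.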
```python
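# NOTE: a minimal inference sketch; the prompt layout and generation
# settings below are illustrative assumptions, not the official usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lightblue/openorca_stx"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumes a GPU with enough memory for a 13B model in fp16
    device_map="auto",
)

# Closed QA: give the model reference text plus a question about that text.
reference_text = "日本の首都は東京です。"  # "The capital of Japan is Tokyo."
question = "日本の首都はどこですか？"  # "What is the capital of Japan?"
prompt = f"{reference_text}\n\n{question}"  # hypothetical prompt layout

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens (the answer).
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```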