Commit bb45fe7 · 1 Parent(s): 57d1ccf
Create README.md
README.md ADDED
@@ -0,0 +1,42 @@
---
license: apache-2.0
library_name: transformers
---

# Flacuna: A Vicuna made of Flan

<img src="flacuna5.png" alt="Image" width="200" height="335">

Flacuna was developed by fine-tuning Vicuna on Flan-mini, an instruction collection that spans a wide range of tasks. Vicuna is already an excellent writing assistant; the goal of Flacuna is to strengthen its problem-solving capabilities. To that end, we curated Flan-mini as a dedicated instruction dataset, drawn from the following sources.

| Dataset Name            | Source                  | Dataset Size |
|-------------------------|-------------------------|--------------|
| Flan2021                | Flan                    | 388K         |
| Public Pool of Prompts  | Flan                    | 320K         |
| Natural instructions v2 | Flan                    | 200K         |
| CoT                     | Flan                    | 100K         |
| Code Search             | husain2019codesearchnet | 100K         |
| Code Contest            | li2022competition       | 50K          |
| Apps                    | hendrycksapps2021       | 50K          |
| GPT4-Alpaca             | GPT-4                   | 52K          |
| Code-Alpaca             | ChatGPT                 | 20K          |
| ShareGPT                | ChatGPT                 | 60K          |
| Total                   | -                       | 1.34M        |

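
As a quick sanity check on the Flan-mini table, the per-source sizes can be summed to confirm the reported total. The sketch below only transcribes the sizes (in thousands of examples) from the table; the names `FLAN_MINI_SIZES_K` and `total_size_millions` are illustrative and not part of any released code.

```python
# Sizes in thousands of examples, transcribed from the Flan-mini table.
# The variable and function names are hypothetical, chosen for this sketch.
FLAN_MINI_SIZES_K = {
    "Flan2021": 388,
    "Public Pool of Prompts": 320,
    "Natural instructions v2": 200,
    "CoT": 100,
    "Code Search": 100,
    "Code Contest": 50,
    "Apps": 50,
    "GPT4-Alpaca": 52,
    "Code-Alpaca": 20,
    "ShareGPT": 60,
}


def total_size_millions(sizes_k):
    """Sum per-source sizes (in thousands) and return the total in millions."""
    return sum(sizes_k.values()) / 1000


if __name__ == "__main__":
    # The ten sources add up to 1340K examples.
    print(f"{total_size_millions(FLAN_MINI_SIZES_K):.2f}M")  # prints 1.34M
```

The four Flan-sourced rows account for 1008K of the 1340K examples, so roughly three quarters of the mixture comes from Flan.
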

After fine-tuning, Flacuna outperforms Vicuna on most problem-solving benchmarks in both few-shot and zero-shot settings. The one exception in the tables below is HumanEval, where it scores slightly lower.

| **Model** | **Size** | **MMLU (5-shot)** | **BBH (3-shot)** | **DROP (3-shot)** | **CRASS (3-shot)** | **HumanEval (0-shot)** | **Avg.** |
| --- | --- | --- | --- | --- | --- | --- | --- |
| StableVicuna | 13B | 49.2 (+3.0) | 37.5 (+0.4) | 34.3 (-1.0) | 67.5 (+8.7) | 15.9 (+2.5) | 40.9 (+2.7) |
| Vicuna | 13B | 50.6 (+4.5) | 37.6 (+0.5) | 32.6 (-3.0) | 60.9 (+2.1) | 11.6 (-1.8) | 38.7 (+0.6) |
| Flacuna | 13B | 51.1 (+5.0) | 39.3 (+2.2) | 43.6 (+8.0) | 74.1 (+15.3) | 11.0 (-2.4) | 43.8 (+5.6) |

| **Model** | **Size** | **MMLU (0-shot)** | **BBH (0-shot)** | **CRASS (0-shot)** |
| --- | --- | --- | --- | --- |
| StableVicuna | 13B | 47.5 | 18.5 | 64.2 |
| Vicuna | 13B | 48.3 | 28.3 | 65.7 |
| Flacuna | 13B | 49.4 | 32.5 | 67.9 |

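
The Avg. column of the few-shot table (the first of the two benchmark tables) appears to be the unweighted mean of the five benchmark scores, rounded to one decimal place. The sketch below checks that reading against the reported numbers. The names `FEW_SHOT`, `REPORTED_AVG`, and `mean_score` are illustrative, and the parenthesized deltas are left out because the README does not say what they are measured against.

```python
# Few-shot scores transcribed from the first benchmark table, in column order:
# MMLU, BBH, DROP, CRASS, HumanEval. All names here are hypothetical.
FEW_SHOT = {
    "StableVicuna": [49.2, 37.5, 34.3, 67.5, 15.9],
    "Vicuna": [50.6, 37.6, 32.6, 60.9, 11.6],
    "Flacuna": [51.1, 39.3, 43.6, 74.1, 11.0],
}

# The Avg. column as reported in the same table.
REPORTED_AVG = {"StableVicuna": 40.9, "Vicuna": 38.7, "Flacuna": 43.8}


def mean_score(scores):
    """Unweighted mean of benchmark scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


if __name__ == "__main__":
    for model, scores in FEW_SHOT.items():
        computed = mean_score(scores)
        matches = abs(computed - REPORTED_AVG[model]) < 1e-9
        print(f"{model}: computed {computed}, reported {REPORTED_AVG[model]}, match={matches}")
```

Under this reading all three rows reproduce the reported averages, which suggests the benchmarks are weighted equally.
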

Flacuna was trained with a maximum input sequence length of 1280 tokens, using LoRA for parameter-efficient fine-tuning.
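
To give a sense of why LoRA is parameter-efficient: rather than updating a full `d_out x d_in` weight matrix, it trains two low-rank factors that together hold `rank * (d_out + d_in)` parameters. The sketch below works through that arithmetic. The rank of 8 and the choice of a single 5120 x 5120 projection (5120 being the hidden size of a 13B LLaMA-style model) are assumptions for illustration only; the README does not state the LoRA hyperparameters or target modules used for Flacuna.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns B (d_out x rank) and A (rank x d_in); the original weight
    stays frozen and the learned update is the product B @ A.
    """
    return rank * (d_out + d_in)


def full_param_count(d_out, d_in):
    """Parameters updated when the whole matrix is fine-tuned directly."""
    return d_out * d_in


# Hypothetical settings, NOT taken from the Flacuna README.
HIDDEN = 5120  # assumed projection size
RANK = 8       # assumed LoRA rank

if __name__ == "__main__":
    lora = lora_param_count(HIDDEN, HIDDEN, RANK)  # 81,920
    full = full_param_count(HIDDEN, HIDDEN)        # 26,214,400
    print(f"LoRA: {lora:,} trainable vs full: {full:,} ({lora / full:.2%})")
```

With these assumed values the adapter trains well under one percent of the parameters in the adapted matrix; the exact fraction for Flacuna depends on the rank and on which layers were adapted.
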