---
license: apache-2.0
---
# Cappy-Large
## Getting Started
Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs.
Cappy takes in an instruction and a candidate response as input, and produces a score between 0 and 1 indicating the estimated correctness of the response with respect to the instruction.
With merely 360 million parameters, Cappy functions either independently on classification tasks or serves as an auxiliary component for LLMs, boosting their performance.
Cappy also enables efficient integration of downstream supervision without finetuning the LLM or accessing its parameters.
Furthermore, Cappy cooperates flexibly with other LLM adaptations, such as finetuning, in-context learning, and prompt tuning, offering additional performance gains.
- **Repository:** [https://github.com/tanyuqian/cappy](https://github.com/tanyuqian/cappy)
- **Paper:** [arxiv.org/abs/2311.06720](https://arxiv.org/abs/2311.06720)
## Uses
Cappy can be loaded either as a Jax/Flax model or a PyTorch model.
### Jax/Flax
```python
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

# Flax models consume NumPy arrays, so request 'np' tensors (not 'pt')
inputs = tokenizer([(instruction, response)], return_tensors='np')
score = cappy(**inputs).logits[0][0].item()
```
### PyTorch
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

inputs = tokenizer([(instruction, response)], return_tensors='pt')
score = cappy(**inputs).logits[0][0].item()
```
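
Because Cappy scores arbitrary (instruction, response) pairs, a single-label classification task can be handled by scoring each candidate label and selecting the highest-scoring one. The snippet below is a minimal sketch of this pattern, reusing the PyTorch `tokenizer`, `cappy`, and `instruction` from above; the `candidate_labels` list is a hypothetical label set (the AG News topics) chosen for illustration.

```python
import torch

# Hypothetical label set for illustration (AG News topics);
# Cappy itself does not define a label set.
candidate_labels = ['World', 'Sports', 'Business', 'Sci/Tech']

# Score every (instruction, label) pair in one batch.
pairs = [(instruction, label) for label in candidate_labels]
inputs = tokenizer(pairs, padding=True, return_tensors='pt')
with torch.no_grad():
    scores = cappy(**inputs).logits[:, 0]

# Pick the label Cappy estimates as most likely to be correct.
prediction = candidate_labels[scores.argmax().item()]
print(prediction)  # 'Business' for the example article above
```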
## Evaluation
We validate Cappy through an extensive suite of held-out tasks distinct from those incorporated in its pretraining.
The overall performance is shown in the figure below.
Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms OPT-IML-30B and OPT-175B, and matches the best of the previous multi-task LLMs. Moreover, on 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin. Furthermore, Cappy offers additional performance enhancement when applied together with finetuning or in-context learning. Our subsequent ablation study confirms the importance of our proposed pretraining and data augmentation strategies.

## Software
Cappy's pretraining uses the code from [this example](https://github.com/tanyuqian/redco/tree/master/examples/classification_regression) in [Red Coast](https://github.com/tanyuqian/redco), a lightweight
toolkit for automating distributed training.
## Citation
```bibtex
@inproceedings{tan2023cappy,
  title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
  author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=Srt1hhQgqa}
}
```