---
license: apache-2.0
---
# Cappy-Large
## Getting Started
Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs.
Cappy takes in an instruction and a candidate response as input, and produces a score between 0 and 1 indicating the estimated correctness of the response with respect to the instruction.
With merely 360 million parameters, Cappy functions either independently on classification tasks or serves as an auxiliary component for LLMs, boosting their performance.
Cappy also enables efficient integration of downstream supervision without finetuning the LLM or accessing its parameters.
Furthermore, Cappy cooperates flexibly with other LLM adaptations, such as finetuning, in-context learning, and prompt tuning, offering additional performance gains.
- **Repository:** [https://github.com/tanyuqian/cappy](https://github.com/tanyuqian/cappy)
- **Paper:** [arxiv.org/abs/2311.06720](https://arxiv.org/abs/2311.06720)
## Uses
Cappy can be loaded either as a Jax/Flax model or a PyTorch model.
### Jax/Flax
```python
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

# Flax models consume NumPy arrays, so request 'np' tensors (not 'pt')
inputs = tokenizer([(instruction, response)], return_tensors='np')
score = cappy(**inputs).logits[0][0].item()
```
### PyTorch
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

instruction = """
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group, which has a reputation for making well-timed and occasionally controversial plays in the defense industry, has quietly placed its bets on another part of the market.
"""
response = 'Business'

inputs = tokenizer([(instruction, response)], return_tensors='pt')
score = cappy(**inputs).logits[0][0].item()
```
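
Because Cappy scores arbitrary (instruction, response) pairs, a single-label classification task can be handled by scoring each candidate label and selecting the highest-scoring one. The snippet below is a minimal sketch of this pattern, reusing the PyTorch `tokenizer`, `cappy`, and `instruction` from above; the `candidate_labels` list is a hypothetical label set (the AG News topics) chosen for illustration.

```python
import torch

# Hypothetical label set for illustration (AG News topics);
# Cappy itself does not define a label set.
candidate_labels = ['World', 'Sports', 'Business', 'Sci/Tech']

# Score every (instruction, label) pair in one batch.
pairs = [(instruction, label) for label in candidate_labels]
inputs = tokenizer(pairs, padding=True, return_tensors='pt')
with torch.no_grad():
    scores = cappy(**inputs).logits[:, 0]

# Pick the label Cappy estimates as most likely to be correct.
prediction = candidate_labels[scores.argmax().item()]
print(prediction)  # 'Business' for the example article above
```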
## Evaluation
We validate Cappy through an extensive suite of held-out tasks distinct from those incorporated in its pretraining.
The overall performance is shown in the figure below.
Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms OPT-IML-30B and OPT-175B, and matches the best of the previous multi-task LLMs. Moreover, on 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin. Furthermore, Cappy offers additional performance enhancement when applied together with finetuning or in-context learning. Our subsequent ablation study confirms the importance of our proposed pretraining and data augmentation strategies.

## Software
Cappy's pretraining uses the code from [this example](https://github.com/tanyuqian/redco/tree/master/examples/classification_regression) in [Red Coast](https://github.com/tanyuqian/redco), a lightweight
toolkit for automating distributed training.
## Citation
```bibtex
@inproceedings{tan2023cappy,
  title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
  author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=Srt1hhQgqa}
}
```