---
tags:
- encoder-decoder
- adapter-transformers
---

# Adapter `leaBroe/Heavy2Light_adapter` for the Heavy2Light EncoderDecoder Model

An [adapter](https://adapterhub.ml) for the `Heavy2Light EncoderDecoder Model (Encoder: HeavyBERTa, Decoder: LightGPT)` model that was trained with data from [OAS](https://opig.stats.ox.ac.uk/webapps/oas/) and [PLAbDab](https://opig.stats.ox.ac.uk/webapps/plabdab/).

This adapter was created for use with the **[Adapters](https://github.com/Adapter-Hub/adapters)** library.

## Usage

First, install `adapters`:

```
pip install -U adapters
```

Now, the adapter can be loaded and activated like this:

```python
from transformers import EncoderDecoderModel, AutoTokenizer, GenerationConfig
from adapters import init

model_path = "leaBroe/Heavy2Light"
subfolder_path = "heavy2light_final_checkpoint"

# Load the base encoder-decoder model
model = EncoderDecoderModel.from_pretrained(model_path)

# The tokenizer is stored in a subfolder of the model repository
tokenizer = AutoTokenizer.from_pretrained(model_path, subfolder=subfolder_path)

# Add adapter support to the model, then load and activate the adapter
init(model)
adapter_name = model.load_adapter("leaBroe/Heavy2Light_adapter", set_active=True)
model.set_active_adapters(adapter_name)
```

Then, the model can be used for inference:

```python
generation_config = GenerationConfig.from_pretrained(model_path)

# example input heavy sequence
heavy_seq = "QLQVQESGPGLVKPSETLSLTCTVSGASSSIKKYYWGWIRQSPGKGLEWIGSIYSSGSTQYNPALGSRVTLSVDTSQTQFSLRLTSVTAADTATYFCARQGADCTDGSCYLNDAFDVWGRGTVVTVSS"

# Tokenize the heavy chain, padding or truncating to 250 tokens
inputs = tokenizer(
    heavy_seq,
    padding="max_length",
    truncation=True,
    max_length=250,
    return_tensors="pt"
)

# Sample one light sequence; token id 4 is excluded from generation
generated_seq = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    num_return_sequences=1,
    output_scores=True,
    return_dict_in_generate=True,
    generation_config=generation_config,
    bad_words_ids=[[4]],
    do_sample=True,
    temperature=1.0,
)

# Decode the first (and only) returned sequence
generated_text = tokenizer.decode(
    generated_seq.sequences[0],
    skip_special_tokens=True,
)

print("Generated light sequence:", generated_text)
```
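
Before tokenizing a heavy chain, it may be useful to check that the input string contains only the 20 standard one-letter amino-acid codes. The following is a minimal sketch of such a check. It is not part of the Heavy2Light model or the Adapters library: `find_nonstandard_residues` and the example strings are hypothetical, and whether the tokenizer accepts other symbols is not stated here, so treating them as problems is an assumption.

```python
from typing import List, Tuple

# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def find_nonstandard_residues(sequence: str) -> List[Tuple[int, str]]:
    """Return (index, character) pairs for every character that is not
    one of the 20 standard amino-acid letters (case-insensitive)."""
    return [
        (index, char)
        for index, char in enumerate(sequence.upper())
        if char not in STANDARD_AMINO_ACIDS
    ]


# Hypothetical inputs for illustration only.
print(find_nonstandard_residues("QLQVQESGPGLVKPSETLSLTCTVSG"))
print(find_nonstandard_residues("QLQVXESG-PGL"))
```

An empty list means every character is a standard residue; otherwise the returned positions show where the input would need to be cleaned up before calling the tokenizer.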