	---
	tags:
	- encoder-decoder
	- adapter-transformers
	---

	# Adapter `leaBroe/Heavy2Light_adapter` for the Heavy2Light EncoderDecoder Model

	An [adapter](https://adapterhub.ml) for the Heavy2Light EncoderDecoder model (encoder: HeavyBERTa, decoder: LightGPT), trained with data from [OAS](https://opig.stats.ox.ac.uk/webapps/oas/) and [PLAbDab](https://opig.stats.ox.ac.uk/webapps/plabdab/).

	This adapter was created for usage with the [Adapters](https://github.com/Adapter-Hub/adapters) library.

	## Usage

	First, install `adapters`:

	```
	pip install -U adapters
	```

	Now, the adapter can be loaded and activated like this:

	```python
	from transformers import EncoderDecoderModel, AutoTokenizer, GenerationConfig
	from adapters import init

	model_path = "leaBroe/Heavy2Light"
	subfolder_path = "heavy2light_final_checkpoint"

	# Load the base encoder-decoder model and its tokenizer
	model = EncoderDecoderModel.from_pretrained(model_path)

	tokenizer = AutoTokenizer.from_pretrained(model_path, subfolder=subfolder_path)

	# Add adapter support to the model, then load and activate the adapter
	init(model)
	adapter_name = model.load_adapter("leaBroe/Heavy2Light_adapter", set_active=True)
	model.set_active_adapters(adapter_name)
	```

	Then, the model can be used for inference:

	```python
	generation_config = GenerationConfig.from_pretrained(model_path)

	# example input heavy sequence
	heavy_seq = "QLQVQESGPGLVKPSETLSLTCTVSGASSSIKKYYWGWIRQSPGKGLEWIGSIYSSGSTQYNPALGSRVTLSVDTSQTQFSLRLTSVTAADTATYFCARQGADCTDGSCYLNDAFDVWGRGTVVTVSS"

	# Tokenize the heavy chain, padding or truncating to 250 tokens
	inputs = tokenizer(
	    heavy_seq,
	    padding="max_length",
	    truncation=True,
	    max_length=250,
	    return_tensors="pt",
	)

	# Sample one light-chain sequence with the adapter active
	generated_seq = model.generate(
	    input_ids=inputs.input_ids,
	    attention_mask=inputs.attention_mask,
	    num_return_sequences=1,
	    output_scores=True,
	    return_dict_in_generate=True,
	    generation_config=generation_config,
	    bad_words_ids=[[4]],  # never generate token ID 4
	    do_sample=True,
	    temperature=1.0,
	)

	# Decode the generated token IDs, dropping special tokens
	generated_text = tokenizer.decode(
	    generated_seq.sequences[0],
	    skip_special_tokens=True,
	)

	print("Generated light sequence:", generated_text)
	```
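
Because the tokenizer call above receives the heavy sequence as a plain string, it can help to sanity-check the input first. The following is a minimal sketch, not part of the Heavy2Light repository: `clean_heavy_sequence` and `STANDARD_AMINO_ACIDS` are hypothetical names, and the assumption that inputs should be uppercase one-letter codes drawn from the 20 standard amino acids is based only on the example sequence shown above.

```python
# Hypothetical helper (an assumption, not part of Heavy2Light or Adapters).
# Assumes the model expects uppercase one-letter codes for the 20 standard
# amino acids, as in the example heavy sequence above.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_heavy_sequence(raw: str) -> str:
    """Strip whitespace, uppercase, and reject non-standard residue codes."""
    # Remove all whitespace (spaces, tabs, line breaks) and normalize case
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("Heavy sequence is empty.")
    # Collect every character that is not a standard amino acid code
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"Non-standard residue codes found: {invalid}")
    return seq
```

The cleaned string could then be passed to the tokenizer in place of the raw input, for example `heavy_seq = clean_heavy_sequence(heavy_seq)`. Note that `truncation=True` in the tokenizer call shortens inputs longer than `max_length` tokens without raising an error, so a check like this does not guard against overly long sequences.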