Model output - What does the model output correspond to?

by tmwalsh - opened Feb 24, 2025

Feb 24, 2025

I am trying to extract FAbCon's final sequence embeddings for a set of amino acid sequences. What do the dimensions of the output correspond to?

justinbarton

Feb 24, 2025

The model call outputs a transformers.modeling_outputs.CausalLMOutputWithCrossAttentions object.

If you want to do the typical thing and use the EOS token embedding as the sequence embedding, you would do something like:

from transformers import PreTrainedTokenizerFast, FalconForCausalLM

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-large")
model = FalconForCausalLM.from_pretrained("alchemab/fabcon-large")

... ## --> Batching and tokenizing your inputs

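# CausalLMOutputWithCrossAttentions has no last_hidden_state field, so request the hidden states explicitly
output = model(**input_batch, output_hidden_states=True)
final_hidden = output.hidden_states[-1]  # final layer, shape (batch, seq_len, hidden_size)

# Index of the last non-padding token per sequence (assumes right padding and that this token is EOS)
last_token_indices = input_batch['attention_mask'].sum(dim=1) - 1
batch_embeddings = final_hidden[range(final_hidden.size(0)), last_token_indices, :].detach().cpu().numpy()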

ideasbyjin

Feb 24, 2025

Adding to Justin’s point above, the hidden-state tensor is of shape

B x L x D

where D corresponds to the model’s hidden size (e.g. FAbCon small has a D of 768), B is the batch size (i.e. the number of sequences), and L is the sequence length, which is typically the length of the longest antibody sequence in the batch because shorter sequences are padded.
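To make the B x L x D layout concrete, here is a minimal sketch that uses a random NumPy array as a stand-in for the hidden states, so it runs without downloading the model. The sizes (3 sequences, padded length 6) and the attention mask are made up for illustration, and right padding is assumed. It shows last-token selection as described above, plus mean pooling over non-padding tokens as one possible alternative, which is not something the thread itself recommends.

```python
import numpy as np

# Hypothetical sizes for illustration: 3 sequences, padded to length 6, hidden size 768
B, L, D = 3, 6, 768
rng = np.random.default_rng(0)
hidden = rng.standard_normal((B, L, D))  # stand-in for the final hidden states

# Right-padded attention mask: 1 for real tokens, 0 for padding
attention_mask = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
])

# Position of the last real token in each sequence (only valid for right padding)
last_token_indices = attention_mask.sum(axis=1) - 1
last_token_embeddings = hidden[np.arange(B), last_token_indices, :]

# Alternative: average over real tokens only, ignoring padded positions
mask = attention_mask[:, :, None]
mean_embeddings = (hidden * mask).sum(axis=1) / mask.sum(axis=1)

print(hidden.shape)                 # (3, 6, 768)
print(last_token_indices)           # [5 3 2]
print(last_token_embeddings.shape)  # (3, 768)
print(mean_embeddings.shape)        # (3, 768)
```

Either way, the result has one D-dimensional vector per input sequence, i.e. shape B x D.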
