# LLM2Vec-Gen: Generative Embeddings from Large Language Models

LLM2Vec-Gen is a recipe for training interpretable, generative embeddings that encode an LLM's potential answer to a query rather than the query itself.

## Installation

```bash
pip install llm2vec-gen
```

## Usage

```python
from llm2vec_gen import LLM2VecGenModel

model = LLM2VecGenModel.from_pretrained("McGill-NLP/LLM2Vec-Gen-Qwen3-8B")
input_text = "Is Montreal located in Canada?"

# Encode the query directly.
enc = model.encode(input_text)

# Generate an answer and also return the hidden state preceding the answer tokens.
answer, enc_before_answer = model.generate(input_text, max_new_tokens=100, get_align_hidden_states=True)

print(answer)
print(enc_before_answer.equal(enc), enc)
```

Output:

```
Yes, Montreal is a city in Canada. It is the second-largest city in the country, located in the province of Quebec. Montreal is known for its rich cultural heritage, historic architecture, and vibrant arts scene.<|end_of_text|>
True tensor([[-0.2393, 0.0280, -0.5078, ..., 0.1270, 0.6484, 0.3574]], device='cuda:0', dtype=torch.bfloat16)
```
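Because each embedding encodes the model's prospective answer, two differently phrased queries that would elicit the same answer should land close together in embedding space. Below is a minimal sketch of comparing such embeddings with cosine similarity; the small vectors are hypothetical stand-ins for real `model.encode(query)` outputs, used only to illustrate the comparison.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical stand-ins for model.encode(...) outputs.
emb_a = [0.2, -0.5, 0.1, 0.6]       # "Is Montreal located in Canada?"
emb_b = [0.21, -0.48, 0.12, 0.58]   # a paraphrase expected to elicit the same answer
emb_c = [-0.6, 0.3, -0.2, -0.1]     # an unrelated query

# Queries sharing an answer should score higher than unrelated ones.
print(cosine(emb_a, emb_b) > cosine(emb_a, emb_c))  # True
```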

## Questions

If you have any questions about the code, feel free to email Parishad (parishad.behnamghader@mila.quebec) and Vaibhav (vaibhav.adlakha@mila.quebec).

## Citation

If you use our code, models, or data, please cite the LLM2Vec-Gen paper.

```bibtex
@article{behnamghader2026llm2vec-gen,
  title={LLM2Vec-Gen: Generative Embeddings from Large Language Models},
  author={BehnamGhader, Parishad and Adlakha, Vaibhav and Schmidt, Fabian David and Chapados, Nicolas and Mosbach, Marius and Reddy, Siva},
  journal={arXiv preprint},
  year={2026}
}
```