Pipeline: Sentence Similarity
Libraries: Transformers, Safetensors
Tags: text-embedding, embeddings, information-retrieval, beir, text-classification, language-model, text-clustering, text-semantic-similarity, text-evaluation, text-reranking, feature-extraction
Datasets: natural_questions, ms_marco, fever, hotpot_qa, mteb, advbenchir, bright
LLM2Vec-Gen: Generative Embeddings from Large Language Models
LLM2Vec-Gen is a recipe for training interpretable, generative embeddings that encode an LLM's potential answer to a query rather than the query itself.
- Repository: https://github.com/McGill-NLP/llm2vec-gen
- Paper: XXX
Installation
pip install llm2vec-gen
Usage
from llm2vec_gen import LLM2VecGenModel

# Load a pretrained LLM2Vec-Gen model
model = LLM2VecGenModel.from_pretrained("McGill-NLP/LLM2Vec-Gen-Qwen3-8B")

input_text = "Is Montreal located in Canada?"

# Encode the query into a generative embedding
enc = model.encode(input_text)

# Generate an answer, also returning the hidden state aligned with the embedding
answer, enc_before_answer = model.generate(input_text, max_new_tokens=100, get_align_hidden_states=True)

print(answer)
# The hidden state right before the answer matches the encoded embedding
print(enc_before_answer.equal(enc), enc)
Yes, Montreal is a city in Canada. It is the second-largest city in the country, located in the province of Quebec. Montreal is known for its rich cultural heritage, historic architecture, and vibrant arts scene.<|end_of_text|>
True tensor([[-0.2393, 0.0280, -0.5078, ..., 0.1270, 0.6484, 0.3574]], device='cuda:0', dtype=torch.bfloat16)
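Because encode returns a dense vector, downstream retrieval reduces to nearest-neighbor search over document embeddings. A minimal sketch of cosine-similarity ranking; the fixed placeholder vectors below are assumptions standing in for real model.encode outputs:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Placeholder embeddings; in practice these would come from model.encode(...)
query = [0.2, 0.9, 0.1]
docs = {
    "Montreal is a city in Quebec, Canada.": [0.25, 0.85, 0.05],
    "The recipe calls for two cups of flour.": [0.9, 0.1, 0.4],
}

# Rank documents by similarity to the query embedding, highest first
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])
```

With real embeddings, the same ranking step applies unchanged; only the vectors come from the model instead of being hard-coded.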
Questions
If you have any questions about the code, feel free to email Parishad (parishad.behnamghader@mila.quebec) and Vaibhav (vaibhav.adlakha@mila.quebec).
Citation
If you use our code, models, or data, please cite the LLM2Vec-Gen paper.
@article{behnamghader2026llm2vec-gen,
title={LLM2Vec-Gen: Generative Embeddings from Large Language Models},
author={BehnamGhader, Parishad and Adlakha, Vaibhav and Schmidt, Fabian David and Chapados, Nicolas and Mosbach, Marius and Reddy, Siva},
journal={arXiv preprint},
year={2026}
}