+---
+license: apache-2.0
+language:
+- en
+- es
+- fr
+- de
+- ru
+- nl
+- vi
+- zh
+- hi
+- id
+- it
+- ja
+- pt
+- pl
+- ar
+- ko
+- uk
+- th
+- ca
+- cs
+- gl
+- tl
+- eu
+- hy
+- ne
+- fa
+- my
+- lo
+- km
+- az
+- tg
+- sv
+- si
+- da
+- tr
+- sw
+- fi
+- ro
+- 'no'
+- hu
+- he
+- el
+- sk
+- bg
+base_model:
+- Qwen/Qwen3-8B
+pipeline_tag: feature-extraction
+library_name: transformers
+tags:
+- sentence-transformers
+---
+# F2LLM-v2-8B-Preview
+**F2LLM-v2-8B-Preview** is a multilingual embedding model trained from Qwen3-8B on a corpus of **27 million samples**, spanning **over 100 languages**. It is a "preview" version trained without instructions and intended to serve as a foundation for downstream embedding tasks and further fine-tuning.
+## Usage
+### With Sentence Transformers
+To encode text with the [Sentence Transformers](https://www.sbert.net/) library:
+```python
+from sentence_transformers import SentenceTransformer
+model = SentenceTransformer("codefuse-ai/F2LLM-v2-8B-Preview", device="cuda:0", model_kwargs={"torch_dtype": "bfloat16"})
+# A sample query and a few documents
+query = "What is F2LLM used for?"
+documents = [
+    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
+    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
+    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
+    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
+]
+# Encode the query and documents separately; encode_query applies the query prompt
+query_embedding = model.encode_query(query)
+document_embeddings = model.encode_document(documents)
+print(query_embedding.shape, document_embeddings.shape)
+# (4096,) (4, 4096)
+# Compute cosine similarity between the query and documents
+similarity = model.similarity(query_embedding, document_embeddings)
+print(similarity)
+# tensor([[0.6329, 0.8003, 0.6361, 0.8267]])
+```
+### With Transformers
+To encode text directly with the [Transformers](https://huggingface.co/docs/transformers/index) library instead:
+```python
+from transformers import AutoModel, AutoTokenizer
+import torch
+import torch.nn.functional as F
+model_path = "codefuse-ai/F2LLM-v2-8B-Preview"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map={'': 0})
+query = "What is F2LLM used for?"
+documents = [
+    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
+    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
+    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
+    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
+]
+@torch.no_grad()  # inference only: skip building the autograd graph
+def encode(sentences):
+    batch_size = len(sentences)
+    # The tokenizer automatically appends the EOS token
+    tokenized_inputs = tokenizer(sentences, padding=True, return_tensors='pt').to(model.device)
+    last_hidden_state = model(**tokenized_inputs).last_hidden_state
+    # Index of the last real token (the EOS) in each row; this assumes right padding
+    eos_positions = tokenized_inputs.attention_mask.sum(dim=1) - 1
+    # Use the hidden state at the EOS position as the embedding
+    embeddings = last_hidden_state[torch.arange(batch_size, device=model.device), eos_positions]
+    # L2-normalize so that a dot product equals cosine similarity
+    embeddings = F.normalize(embeddings, p=2, dim=1)
+    return embeddings
+# Encode the query and documents
+query_embedding = encode([query])
+document_embeddings = encode(documents)
+print(query_embedding.shape, document_embeddings.shape)
+# torch.Size([1, 4096]) torch.Size([4, 4096])
+# Compute cosine similarity between the query and documents
+similarity = query_embedding @ document_embeddings.T
+print(similarity)
+# tensor([[0.6328, 0.8008, 0.6328, 0.8242]], device='cuda:0',
+#        dtype=torch.bfloat16)
+```
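The `encode` helper above locates the last token with `attention_mask.sum(dim=1) - 1`, which is the final real token only when padding is on the right. The following is a minimal pure-Python sketch on toy masks, not part of the model's code: the helper names `last_token_index` and `sum_minus_one` are hypothetical, and the masks are made up for illustration. It compares that shortcut with a scan that works for either padding side.

```python
def last_token_index(mask_row):
    """Return the index of the last position whose attention mask is 1.

    Hypothetical helper for illustration; works for left or right padding.
    """
    for idx in range(len(mask_row) - 1, -1, -1):
        if mask_row[idx] == 1:
            return idx
    raise ValueError("attention mask has no attended positions")


def sum_minus_one(mask_row):
    """The shortcut valid under right padding: number of real tokens minus one."""
    return sum(mask_row) - 1


# Toy attention masks: 1 marks a real token, 0 marks padding.
right_padded = [1, 1, 1, 0, 0]
left_padded = [0, 0, 1, 1, 1]

# Under right padding both approaches agree on the last real token.
print(last_token_index(right_padded), sum_minus_one(right_padded))
# Under left padding the shortcut points at an earlier position instead.
print(last_token_index(left_padded), sum_minus_one(left_padded))
```

If the tokenizer in use pads on the left, the last column of `last_hidden_state` already holds the final token for every row, so it may be worth checking `tokenizer.padding_side` before reusing the shortcut.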
+## Future Releases
+We are committed to the open-source community and will soon release:
+- **The Finetuned Version:** Optimized for downstream tasks, with state-of-the-art performance on MTEB.
+- **The Training Data:** The data used to train F2LLM-v2, released to help advance research on multilingual embeddings.
+Stay tuned for more updates!