---
license: apache-2.0
language:
- en
- es
- fr
- de
- ru
- nl
- vi
- zh
- hi
- id
- it
- ja
- pt
- pl
- ar
- ko
- uk
- th
- ca
- cs
- gl
- tl
- eu
- hy
- ne
- fa
- my
- lo
- km
- az
- tg
- sv
- si
- da
- tr
- sw
- fi
- ro
- 'no'
- hu
- he
- el
- sk
- bg
base_model:
- Qwen/Qwen3-8B
pipeline_tag: feature-extraction
library_name: transformers
tags:
- sentence-transformers
---

# F2LLM-v2-8B-Preview

**F2LLM-v2-8B-Preview** is a multilingual embedding model trained from Qwen3-8B on a corpus of **27 million samples**, spanning **over 100 natural and programming languages**. It is a "preview" version, trained without instructions, and is intended to serve as a foundation for downstream embedding tasks and further fine-tuning.

## Usage

### With Sentence Transformers

To encode text with the [Sentence Transformers](https://www.sbert.net/) library:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "codefuse-ai/F2LLM-v2-8B-Preview",
    device="cuda:0",
    model_kwargs={"torch_dtype": "bfloat16"},
)

# A sample query and a set of documents
query = "What is F2LLM used for?"
documents = [
    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
]

# Encode the query and documents
query_embedding = model.encode(query)
document_embeddings = model.encode(documents)
print(query_embedding.shape, document_embeddings.shape)
# (4096,) (4, 4096)

# Compute cosine similarity between the query and documents
similarity = model.similarity(query_embedding, document_embeddings)
print(similarity)
# tensor([[0.6329, 0.8003, 0.6361, 0.8267]])
```
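The same embeddings can be used directly for semantic search. Below is a minimal sketch of ranking a small corpus against a query; the corpus and query strings are illustrative examples of ours, not part of the model card or its evaluation data:

```python
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "codefuse-ai/F2LLM-v2-8B-Preview",
    device="cuda:0",
    model_kwargs={"torch_dtype": "bfloat16"},
)

# Illustrative corpus; any list of strings works
corpus = [
    "F2LLM embeddings can be used to build retrieval pipelines.",
    "The capital of France is Paris.",
    "Cosine similarity measures the angle between two vectors.",
]
query = "Which city is the capital of France?"

# Encode, then rank the corpus by cosine similarity to the query
corpus_embeddings = model.encode(corpus)
query_embedding = model.encode(query)
scores = model.similarity(query_embedding, corpus_embeddings)[0]

best = int(torch.argmax(scores))
print(f"Best match: {corpus[best]!r} (score: {scores[best].item():.4f})")
```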
### With Transformers

Or directly with the [Transformers](https://huggingface.co/docs/transformers/index) library:

```python
from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn.functional as F

model_path = "codefuse-ai/F2LLM-v2-8B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map={'': 0})

query = "What is F2LLM used for?"
documents = [
    'We present F2LLM, a family of fully open embedding LLMs that achieve a strong balance between model size, training data, and embedding performance.',
    'F2LLM is a model for computing text embeddings that can be used for various NLP tasks such as information retrieval, semantic search, and text classification.',
    'F2LLM 是 CodeFuse 开源的系列嵌入模型。',
    'F2LLM — это модель вычисления встраивания текста, которую можно использовать для различных задач НЛП, таких как поиск информации, семантический поиск и классификация текста.'
]

def encode(sentences):
    batch_size = len(sentences)
    # the tokenizer automatically appends an EOS token to each input
    tokenized_inputs = tokenizer(sentences, padding=True, return_tensors='pt').to(model.device)
    last_hidden_state = model(**tokenized_inputs).last_hidden_state
    # pool by taking the hidden state at each sequence's final (EOS) position;
    # with right padding, that position is (number of real tokens - 1)
    eos_positions = tokenized_inputs.attention_mask.sum(dim=1) - 1
    embeddings = last_hidden_state[torch.arange(batch_size, device=model.device), eos_positions]
    # L2-normalize so that dot products equal cosine similarities
    embeddings = F.normalize(embeddings, p=2, dim=1)
    return embeddings

# Encode the query and documents
query_embedding = encode([query])
document_embeddings = encode(documents)
print(query_embedding.shape, document_embeddings.shape)
# torch.Size([1, 4096]) torch.Size([4, 4096])

# Compute cosine similarity between the query and documents
similarity = query_embedding @ document_embeddings.T
print(similarity)
# tensor([[0.6328, 0.8008, 0.6328, 0.8242]], device='cuda:0',
#        dtype=torch.bfloat16, grad_fn=<MmBackward0>)
```

## Future Releases

We are committed to the open-source community and will soon release:

- **The Fine-tuned Version:** Optimized for downstream tasks, with state-of-the-art performance on MTEB.
- **The Training Data:** The data used to train F2LLM-v2, released to help advance the field of multilingual embeddings.

Stay tuned for more updates!