Geralt-Targaryen committed on
Commit 390bf08 · verified · 1 Parent(s): a5a2f19

Update README.md

Files changed (1): README.md (+15 −9)
README.md CHANGED
```diff
@@ -51,12 +51,27 @@ pipeline_tag: feature-extraction
 library_name: transformers
 tags:
 - sentence-transformers
+datasets:
+- codefuse-ai/F2LLM-v2
 ---
 
 # F2LLM-v2-8B-Preview
 
 **F2LLM-v2-8B-Preview** is a multilingual embedding model trained from Qwen3-8B on a corpus of **27 million samples**, spanning **over 100 natural and programming languages**. It is a "preview" version trained without instructions and intended to serve as a foundation for downstream embedding tasks and further fine-tuning.
 
+F2LLM-v2 is fully open. We release base models in 5 sizes, instruct models in 8 sizes, the training data, the training code, and intermediate checkpoints. The three smallest instruct models are pruned and trained from the 0.6B base model.
+
+| Model | Base                                                                                 | Instruct                                                             |
+| ----- | ------------------------------------------------------------------------------------ | -------------------------------------------------------------------- |
+| 80M   |                                                                                      | [🤗F2LLM-v2-80M](https://huggingface.co/codefuse-ai/F2LLM-v2-80M)    |
+| 160M  |                                                                                      | [🤗F2LLM-v2-160M](https://huggingface.co/codefuse-ai/F2LLM-v2-160M)  |
+| 330M  |                                                                                      | [🤗F2LLM-v2-330M](https://huggingface.co/codefuse-ai/F2LLM-v2-330M)  |
+| 0.6B  | [🤗F2LLM-v2-0.6B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B-Preview)  | [🤗F2LLM-v2-0.6B](https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B)  |
+| 1.7B  | [🤗F2LLM-v2-1.7B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B-Preview)  | [🤗F2LLM-v2-1.7B](https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B)  |
+| 4B    | [🤗F2LLM-v2-4B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-4B-Preview)      | [🤗F2LLM-v2-4B](https://huggingface.co/codefuse-ai/F2LLM-v2-4B)      |
+| 8B    | [🤗F2LLM-v2-8B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-8B-Preview)      | [🤗F2LLM-v2-8B](https://huggingface.co/codefuse-ai/F2LLM-v2-8B)      |
+| 14B   | [🤗F2LLM-v2-14B-Preview](https://huggingface.co/codefuse-ai/F2LLM-v2-14B-Preview)    | [🤗F2LLM-v2-14B](https://huggingface.co/codefuse-ai/F2LLM-v2-14B)    |
+
 ## Usage
 
 ### With Sentence Transformers
@@ -138,12 +153,3 @@ print(similarity)
 ## Intermediate Checkpoints
 
 To facilitate future research, we release intermediate checkpoints in the `intermediate_checkpoints` branch.
-
-## Future Releases
-
-We are committed to the open-source community and will soon release:
-
-- **The Finetuned Version:** Optimized for downstream tasks, with state-of-the-art performance on MTEB.
-- **The Training Data:** We will be releasing the data used to train F2LLM-v2 to help advance the field of multilingual embeddings.
-
-Stay tuned for more updates!
```
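The second hunk's context line ends with `print(similarity)`, the tail of the model card's Sentence Transformers usage snippet, which this diff does not show. A minimal sketch of that flow, assuming the snippet loads the model by its repo id: the `cosine_similarity` helper and the example sentences below are illustrative, not taken from the card, and the expensive 8B model download is left as a commented-out step.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical Sentence Transformers usage (commented out so this sketch
# runs without downloading an 8B checkpoint):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("codefuse-ai/F2LLM-v2-8B-Preview")
# query_emb, doc_emb = model.encode(["What is F2LLM?",
#                                    "F2LLM is a multilingual embedding model."])
# similarity = cosine_similarity(list(query_emb), list(doc_emb))
# print(similarity)

# Self-contained demonstration on toy vectors:
similarity = cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
print(similarity)  # identical vectors give similarity 1.0
```

Embedding models like this one are typically compared with cosine similarity rather than raw dot products, so the score stays in [-1, 1] regardless of embedding norm.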