FronyAI committed on
Commit 4ed1d72 · verified · 1 Parent(s): be7df63

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -43,8 +43,10 @@ Total trained query and document pair is 100,000.<br>
  ### Training Details
  The overall training process was conducted with reference to **snowflake-arctic-2.0**.<br>
  Training was divided into two stages: Pre-training and Post-training.<br>
- In the pre-training stage, the model was trained using in-batch negatives.<br>
- In the post-training stage, we utilized the multilingual-e5-large model to identify hard negatives—specifically, the top 4 samples with a similarity score below a **99% threshold**.<br>
+
+ * In the pre-training stage, the model was trained using in-batch negatives.
+ * In the post-training stage, we utilized the multilingual-e5-large model to identify hard negatives—specifically, the top 4 samples with a similarity score below a **99% threshold**.
+
  Given the increasing prevalence of LLM-generated content, we also converted existing data into Markdown-style passages to improve retrieval performance on such formats.<br>
  The types of data augmentation applied are as follows:<br>
  | Augmentation* | Description |
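
The hard-negative selection described in the post-training bullet of this diff can be sketched as follows. This is a minimal illustration and not the training code from this repository. The README does not say how the **99% threshold** is applied, so the sketch assumes it means "below 0.99 × the positive document's similarity score". The function name `mine_hard_negatives`, its parameters, and the precomputed similarity matrix are hypothetical; in practice the scores would come from an embedding model such as multilingual-e5-large.

```python
import numpy as np


def mine_hard_negatives(sim, pos_idx, k=4, ratio=0.99):
    """Pick up to k hard negatives per query from a similarity matrix.

    sim     : (n_queries, n_docs) similarity scores from a teacher embedder
    pos_idx : (n_queries,) column index of each query's positive document
    k       : number of negatives to keep per query (the README says 4)
    ratio   : ASSUMPTION, a candidate qualifies only if its score is
              below ratio * (score of the positive document)
    """
    sim = np.asarray(sim, dtype=float)
    pos_idx = np.asarray(pos_idx)
    rows = np.arange(sim.shape[0])

    pos_score = sim[rows, pos_idx]
    masked = sim.copy()
    # Drop candidates that score at or above the threshold: they are
    # likely unlabeled positives rather than true negatives.
    masked[sim >= (ratio * pos_score)[:, None]] = -np.inf
    # Never return the positive document itself.
    masked[rows, pos_idx] = -np.inf

    # Highest-scoring remaining candidates come first.
    order = np.argsort(-masked, axis=1)[:, :k]
    return [
        [int(j) for j in order[i] if np.isfinite(masked[i, j])]
        for i in range(order.shape[0])
    ]
```

With a positive scored at 0.90, a candidate scored at 0.895 is skipped as a probable false negative under this interpretation, and the next four highest-scoring documents are returned. If the threshold is instead meant as an absolute score or a percentile, only the masking line needs to change.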