FronyAI/frony-embed-tiny-ko-v1

Sentence Similarity

sentence-transformers

feature-extraction

text-embeddings-inference


FronyAI committed on Apr 21, 2025

Commit 4ed1d72 · verified · 1 Parent(s): be7df63

Update README.md

Files changed (1)

README.md +4 -2


@@ -43,8 +43,10 @@ Total trained query and document pair is 100,000.<br>
 ### Training Details
 The overall training process was conducted with reference to **snowflake-arctic-2.0**.<br>
 Training was divided into two stages: Pre-training and Post-training.<br>
-In the pre-training stage, the model was trained using in-batch negatives.<br>
-In the post-training stage, we utilized the multilingual-e5-large model to identify hard negatives—specifically, the top 4 samples with a similarity score below a **99% threshold**.<br>
 Given the increasing prevalence of LLM-generated content, we also converted existing data into Markdown-style passages to improve retrieval performance on such formats.<br>
 The types of data augmentation applied are as follows:<br>
 | Augmentation* | Description |

 ### Training Details
 The overall training process was conducted with reference to **snowflake-arctic-2.0**.<br>
 Training was divided into two stages: Pre-training and Post-training.<br>
+* In the pre-training stage, the model was trained using in-batch negatives.
+* In the post-training stage, we utilized the multilingual-e5-large model to identify hard negatives—specifically, the top 4 samples with a similarity score below a **99% threshold**.
 Given the increasing prevalence of LLM-generated content, we also converted existing data into Markdown-style passages to improve retrieval performance on such formats.<br>
 The types of data augmentation applied are as follows:<br>
 | Augmentation* | Description |
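
The hard-negative selection described in the post-training bullet can be illustrated with a minimal sketch. This is not the authors' code. The README does not say whether the 99% threshold is an absolute similarity or a fraction of the positive pair's score, so the sketch takes the cutoff as a plain parameter (`cutoff`, a hypothetical name). The vectors stand in for embeddings from a teacher model such as multilingual-e5-large; `cosine` and `mine_hard_negatives` are illustrative helpers, not part of any library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mine_hard_negatives(query_vec, doc_vecs, positive_idx, cutoff=0.99, top_k=4):
    """Return indices of the top_k most similar documents scoring below cutoff.

    Assumption: cutoff is applied to the raw cosine similarity. Candidates at
    or above it are dropped as likely false negatives (near-duplicates of the
    positive document).
    """
    scored = [
        (cosine(query_vec, vec), idx)
        for idx, vec in enumerate(doc_vecs)
        if idx != positive_idx  # the labeled positive is never a negative
    ]
    kept = [(score, idx) for score, idx in scored if score < cutoff]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # hardest negatives first
    return [idx for _, idx in kept[:top_k]]


# Toy embeddings: index 0 is the positive, index 1 is a near-duplicate of it.
query = [1.0, 0.0]
docs = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.8, 0.6],
    [0.6, 0.8],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.96, 0.28],
]
hard_negatives = mine_hard_negatives(query, docs, positive_idx=0)
```

In this toy run the near-duplicate at index 1 is filtered out by the cutoff, and the four remaining candidates with the highest similarity are returned in descending order of score.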