Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ base_model:
 
 # FronyAI Embedding (medium)
 This is a lightweight and efficient embedding model designed specifically for the Korean language.<br>
-It has been trained on a diverse set of data sources, including
+It has been trained on a diverse set of data sources, including AI 허브, to ensure robust performance in a wide range of retrieval tasks.<br>
 The model demonstrates strong retrieval capabilities across:<br>
 
 * Korean–Korean
@@ -56,7 +56,7 @@ The types of data augmentation applied are as follows:<br>
 
 ### Evaluation
 The evaluation consists of five dataset groups, and the results in the table represent the average retrieval performance across these five groups.<br>
-Three groups are subsets extracted from
+Three groups are subsets extracted from AI 허브 datasets.<br>
 One group is based on a specific sports regulation PDF, for which synthetic query and **markdown-style passage** pairs were generated using GPT-4o-mini.<br>
 The final group is a concatenation of all four aforementioned groups, providing a comprehensive mixed set.<br>
 The following table presents the average retrieval performance across five dataset groups.<br>