Update README.md
README.md
CHANGED
@@ -13,7 +13,9 @@ Data-Juicer is A one-stop data processing system to make data higher-quality, ju
 Data-Juicer is a one-stop data processing system that makes data higher-quality, richer, and more "digestible" for large language models!

 ## News
-
-
-
-
+- [2024-02-20] We actively maintain an awesome list of LLM-Data; you are welcome to [visit](docs/awesome_llm_data.md) and contribute!
+- [2024-02-05] Our paper has been accepted by the SIGMOD'24 industrial track!
+- [2024-01-10] Discover new horizons in "Data Mixture": our second data-centric LLM competition has kicked off! Please visit the competition's [official website](https://tianchi.aliyun.com/competition/entrance/532174) for more information.
+- [2024-01-05] We have released **Data-Juicer v0.1.3**!
+  This version supports **more Python versions** (3.7-3.10) and **multimodal** dataset [converting](tools/multimodal/README.md)/[processing](docs/Operators.md) (including text, images, and audio; more modalities will be supported in the future).
+  In addition, our paper has been updated to [v3](https://arxiv.org/abs/2309.02033).