Update README.md
README.md
CHANGED
@@ -1,10 +1,118 @@
 ---
-title:
-emoji:
+title: DATUMO
+emoji: β
 colorFrom: blue
-colorTo:
-sdk: static
-pinned: false
+colorTo: indigo
 ---
 
-
+<div align="center">
+
+<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63aa4990769a10efc403771c/-hPclrsYl0IW6kqD2DWBL.png" width="140" alt="DATUMO logo"/>
+
+# DATUMO
+### *The Data-centric AI Company*
+
+**Built by [Selectstar](https://selectstar.ai/): data infrastructure for trustworthy AI**
+
+[Website](https://selectstar.ai/)
+[Blog](https://selectstar.ai/blog/)
+[LinkedIn](https://kr.linkedin.com/company/datumo-usa)
+[Contact](https://selectstar.ai/contact/)
+
+</div>
+
+---
+
+## About Us
+
+We're **Selectstar**, a Korean AI company building the **data foundation for trustworthy AI**.
+Since 2018, we've partnered with AI teams across the entire data value chain, from dataset design and construction to **LLM reliability evaluation and red-teaming**.
+
+Our flagship **Datumo Platform** is Korea's first end-to-end AI trust evaluation solution, unifying dataset preparation, automated evaluation, red-teaming, and improvement analytics in a single pipeline.
+
+> 🇰🇷 **Hello, we are Selectstar.**
+> We are a **Data-centric AI company** that supports every stage of AI development, from data design and construction to LLM reliability verification.
+> On this page, we open-source the datasets and models we use in our research and production work.
+
+---
+
+## 🎯 What We Do
+
+<table>
+<tr>
+<td width="33%" valign="top">
+
+### Data Construction
+Training data design & build
+Pre-training data licensing
+RAG knowledge pipelines
+Crowdsourced at scale
+
+</td>
+<td width="33%" valign="top">
+
+### 🛡️ AI Trust & Safety
+LLM red-teaming
+Reliability benchmarks
+Safety evaluation datasets
+Guardrail testing
+
+</td>
+<td width="33%" valign="top">
+
+### Datumo Platform
+Automated LLM evaluation
+Dashboard analytics
+**45 days → 45 minutes**
+End-to-end eval pipeline
+
+</td>
+</tr>
+</table>
+
+---
+
+## Featured Collections
+
+### 🛡️ [Safety-Data](https://huggingface.co/collections/datumo/safety-data)
+Curated by our **AI Safety team**: Korean-language safety and reliability benchmarks for LLM evaluation.
+
+- 🔸 [**KorSET**](https://huggingface.co/datasets/datumo/KorSET): Korean Safety Evaluation Toolkit
+- 🔸 [**KorNAT**](https://huggingface.co/datasets/datumo/KorNAT): Korea's first LLM reliability / national-alignment benchmark
+
+### 📦 [Data-Data](https://huggingface.co/collections/datumo/data-data)
+Research outputs from our **Data team**.
+
+- 🔸 [**CAC-CoT**](https://huggingface.co/datumo/CAC-CoT): 7B Chain-of-Thought feature extraction model
+- 🔸 [**CAC-CoT dataset**](https://huggingface.co/datasets/datumo/CAC-CoT): accompanying training data
+
+> 💡 Follow us to get notified when new datasets and models are released.
+
+---
+
+## Milestones
+
+- 🇰🇷 **K-AI Company**: Selected for Korea's Sovereign AI Foundation Model Project *(SKT Consortium, data lead)*
+- **Forbes Korea "2025 AI 50"**
+- **Forbes "30 Under 30 Asia"**: Enterprise Technology
+- **Datumo Eval**: Korea's first automated LLM reliability evaluation platform (2025)
+- **200M+ annotations** · **287+ enterprise clients** · **250K+ crowdworkers**
+- Co-built landmark Korean benchmarks including **KLUE** and **KorQuAD 2.0**
+- 🔬 Publications at **NeurIPS · EMNLP · CVPR**
+
+---
+
+## 🤝 Connect
+
+| Channel | Link |
+|---|---|
+| Website | [selectstar.ai](https://selectstar.ai/) |
+| 📰 Blog | [selectstar.ai/blog](https://selectstar.ai/blog/) |
+| 💼 Enterprise inquiries | [Contact form](https://selectstar.ai/contact/) |
+| 💬 Community | Join the [discussion tab](https://huggingface.co/spaces/datumo/README/discussions) |
+
+---
+
+<div align="center">
+<sub>Building the data foundation for trustworthy AI · Made with care in Seoul 🇰🇷</sub>
+</div>