---
title: KAIdol Thinking Experiment
emoji: 🎀
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: apache-2.0
tags:
  - roleplay
  - korean
  - llm-evaluation
  - a-b-testing
---

# KAIdol A/B Test Arena

An A/B comparison and evaluation platform for K-pop idol roleplay chatbot models.

## Features

- **A/B Arena**: Compare the responses of two models side by side
- **Blind Mode**: Hide model names so responses are judged on quality alone
- **ELO Ranking**: Model ranking based on user votes
- **5 Characters**: 강율, 서이안, 이지후, 차도하, 최민
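
The ELO Ranking feature can be illustrated with a minimal sketch of a standard Elo update applied to a single vote. The function names, the K-factor of 32, and the starting rating of 1000 are assumptions for illustration; this README does not state which parameters the arena actually uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k is an assumed K-factor, not a value taken from the arena.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two models start at an assumed rating of 1000; model A wins one vote.
new_a, new_b = update_elo(1000.0, 1000.0, score_a=1.0)
```

Because the update is zero-sum, the total rating across both models stays constant, which keeps the leaderboard scale stable as votes accumulate.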

## Models (19 small Student models)

### DPO v5 (7-14B)
- qwen2.5-7b/14b-dpo-v5
- exaone-7.8b-dpo-v5
- qwen3-8b-dpo-v5
- solar-10.7b-dpo-v5

### SFT Thinking (7-14B)
- qwen2.5-7b/14b-thinking
- exaone-7.8b-thinking

### Phase 7 Kimi Students
- qwen2.5-7b/14b-kimi
- exaone-7.8b-kimi

### V7 Students
- qwen2.5-7b/14b-v7
- exaone-7.8b-v7
- qwen3-8b-v7
- varco-8b-v7

## Usage

1. Select a character and a scenario
2. Enter a message or use a random scenario
3. Compare the two models' responses
4. Vote for the better response
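
The comparison and voting steps above could be wired together roughly as follows when model names are hidden. This is a hypothetical sketch, not the arena's actual code: `draw_blind_pair` and `resolve_vote` are illustrative names, and the model list is a small subset of the names from this README.

```python
import random

# A few model names from the list above, used only as sample data.
MODELS = ["qwen3-8b-dpo-v5", "exaone-7.8b-dpo-v5", "solar-10.7b-dpo-v5"]


def draw_blind_pair(models, rng):
    """Pick two distinct models and assign them to anonymous slots A and B."""
    left, right = rng.sample(models, 2)
    return {"A": left, "B": right}


def resolve_vote(pair, choice):
    """Map a vote for slot 'A' or 'B' back to (winner, loser) model names."""
    if choice not in pair:
        raise ValueError("choice must be 'A' or 'B'")
    other = "B" if choice == "A" else "A"
    return pair[choice], pair[other]


# The voter only ever sees the slot labels; names are revealed after voting.
pair = draw_blind_pair(MODELS, random.Random(0))
winner, loser = resolve_vote(pair, "A")
```

Sampling the pair in random order means the same model does not always appear on the same side, which helps avoid position bias in the votes.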

## Tech Stack

- Gradio 4.x
- Python 3.11