---
title: KAIdol Thinking Experiment
emoji: 🎀
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: apache-2.0
tags:
- roleplay
- korean
- llm-evaluation
- a-b-testing
---
# KAIdol A/B Test Arena
An A/B comparison and evaluation platform for K-pop idol roleplay chatbot models.
## Features
- **A/B Arena**: Compare the responses of two models side by side
- **Blind Mode**: Hide model names so responses are judged on quality alone
- **ELO Ranking**: Model ranking based on votes
- **5 Characters**: 강율 (Kang Yul), 서이안 (Seo Ian), 이지후 (Lee Jihu), 차도하 (Cha Doha), 최민 (Choi Min)
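
The ELO Ranking feature turns pairwise votes into a leaderboard. This README does not state the rating formula or constants the arena uses, so the following is only a minimal sketch of a standard Elo update. The K-factor of 32, the starting rating of 1000, and the function names `expected_score` and `update_elo` are all assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; the arena's actual value is not documented
) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start at an assumed rating of 1000; model A wins one vote.
new_a, new_b = update_elo(1000.0, 1000.0, score_a=1.0)
```

Because the two adjustments are equal and opposite, the sum of both ratings is unchanged by a vote, which keeps the leaderboard's average stable as votes accumulate.
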
## Models (19 small Student models)
### DPO v5 (7-14B)
- qwen2.5-7b/14b-dpo-v5
- exaone-7.8b-dpo-v5
- qwen3-8b-dpo-v5
- solar-10.7b-dpo-v5
### SFT Thinking (7-14B)
- qwen2.5-7b/14b-thinking
- exaone-7.8b-thinking
### Phase 7 Kimi Students
- qwen2.5-7b/14b-kimi
- exaone-7.8b-kimi
### V7 Students
- qwen2.5-7b/14b-v7
- exaone-7.8b-v7
- qwen3-8b-v7
- varco-8b-v7
## Usage
1. Select a character and a scenario
2. Enter a message or use a random scenario
3. Compare the two models' responses
4. Vote for the better response
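
The pairing and voting steps above, combined with Blind Mode, can be sketched as follows. This is a hypothetical illustration rather than the Space's actual code: `make_blind_pair` and `resolve_vote` are invented names, the model identifiers are taken from the list in this README, and no model is actually called.

```python
import random


def make_blind_pair(models: list[str], rng: random.Random) -> dict[str, str]:
    """Pick two distinct models and assign them to anonymous slots A and B."""
    model_a, model_b = rng.sample(models, 2)  # sample() never repeats an item
    return {"A": model_a, "B": model_b}


def resolve_vote(pair: dict[str, str], choice: str) -> tuple[str, str]:
    """Map a blind vote ("A" or "B") back to (winner, loser) model names."""
    if choice not in pair:
        raise ValueError(f"choice must be 'A' or 'B', got {choice!r}")
    loser_slot = "B" if choice == "A" else "A"
    return pair[choice], pair[loser_slot]


# Model names from the DPO v5 group listed in this README.
models = ["exaone-7.8b-dpo-v5", "qwen3-8b-dpo-v5", "solar-10.7b-dpo-v5"]

pair = make_blind_pair(models, random.Random())
# The UI would show only the slot labels "A" and "B" until the vote is cast.
winner, loser = resolve_vote(pair, "A")
```

Randomizing which model lands in which slot matters because a fixed ordering would let position bias leak into the votes.
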
## Tech Stack
- Gradio 4.x
- Python 3.11