---
title: KAIdol Thinking Experiment
emoji: 🎀
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: apache-2.0
tags:
  - roleplay
  - korean
  - llm-evaluation
  - a-b-testing
---

# KAIdol A/B Test Arena

An A/B comparison and evaluation platform for K-pop idol roleplay chatbot models.

## Features

- **A/B Arena**: Compare the responses of two models side by side
- **Blind Mode**: Hide model names so that votes reflect response quality only
- **ELO Ranking**: Model ranking based on votes
- **5 Characters**: 강율, 서이안, 이지후, 차도하, 최민

## Models (19 small Student models)

### DPO v5 (7-14B)

- qwen2.5-7b/14b-dpo-v5
- exaone-7.8b-dpo-v5
- qwen3-8b-dpo-v5
- solar-10.7b-dpo-v5

### SFT Thinking (7-14B)

- qwen2.5-7b/14b-thinking
- exaone-7.8b-thinking

### Phase 7 Kimi Students

- qwen2.5-7b/14b-kimi
- exaone-7.8b-kimi

### V7 Students

- qwen2.5-7b/14b-v7
- exaone-7.8b-v7
- qwen3-8b-v7
- varco-8b-v7

## Usage

1. Select a character and a scenario
2. Enter a message or use a random scenario
3. Compare the responses of the two models
4. Vote for the better response

## Tech Stack

- Gradio 4.x
- Python 3.11
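
## ELO Update Sketch

The vote-based ELO ranking listed under Features could work roughly as follows. This is a minimal sketch of the standard Elo update, not the arena's actual code: the function names, the K-factor of 32, and the initial rating of 1000 are all assumptions chosen for illustration.

```python
from typing import Dict


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    winner: str,
    loser: str,
    k: float = 32.0,        # assumed K-factor
    initial: float = 1000.0,  # assumed starting rating for unseen models
) -> Dict[str, float]:
    """Apply one vote: the winner gains what the loser gives up."""
    r_win = ratings.get(winner, initial)
    r_lose = ratings.get(loser, initial)
    delta = k * (1.0 - expected_score(r_win, r_lose))
    ratings[winner] = r_win + delta
    ratings[loser] = r_lose - delta
    return ratings


if __name__ == "__main__":
    # One vote between two models that both start at the initial rating.
    table = update_elo({}, "qwen2.5-7b-v7", "exaone-7.8b-v7")
    for name, rating in sorted(table.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.1f}")
```

Because the update is zero-sum, the total rating across models stays constant. An upset (a lower-rated model winning) moves both ratings more than an expected result does.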