KO-TTS-Arena

Ko-TTS-Arena Contributors Claude Sonnet 4.5 committed on Jan 13

Commit

540021e

1 Parent(s): 57d9f94

Improve model pairing: prioritize new and low-vote models

- Remove similarity-based weighting for model pairs
- New models now match against both new and established models
- All models weighted by vote count (fewer votes = higher probability)
- Better exposure for newly added models in voting

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
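
The vote-count weighting described in the commit message can be sketched as follows. This is a minimal illustration, not the code in app.py: the function name `pick_weighted_models`, the inverse-vote formula `1 / (votes + smoothing)`, and the `smoothing` default are all assumptions, since the diff does not show how the weights are actually computed.

```python
import random


def pick_weighted_models(vote_counts, num_to_select=2, smoothing=10.0, rng=None):
    """Pick models without replacement, favoring those with fewer votes.

    vote_counts: dict mapping a model id to its vote count (hypothetical shape).
    smoothing: assumed constant that keeps zero-vote models from dominating.
    """
    rng = rng or random.Random()
    if len(vote_counts) < num_to_select:
        raise ValueError("not enough models to select from")

    candidates = list(vote_counts)
    # Fewer votes -> larger weight; the smoothing term flattens the preference.
    weights = [1.0 / (vote_counts[m] + smoothing) for m in candidates]

    selected = []
    for _ in range(num_to_select):
        # Draw one index in proportion to the remaining weights.
        idx = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        # Remove the chosen model so it cannot be drawn again.
        selected.append(candidates.pop(idx))
        weights.pop(idx)
    return selected


if __name__ == "__main__":
    counts = {"new-model": 0, "mid-model": 50, "old-model": 500}
    print(pick_weighted_models(counts))
```

Under these assumptions, a larger `smoothing` value makes the preference for low-vote models slighter, which is the behavior the docstring describes. No similarity adjustment is applied to the second pick, so a new model can be paired with an established one.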

Files changed (2)

app.py +4 -27
ko_prompts.json +5 -5

app.py CHANGED

@@ -1217,9 +1217,10 @@ def get_weighted_random_models(
     weighting models with fewer votes higher. A smoothing factor is used to ensure
     the preference is slight and to prevent models with zero votes from being
     overwhelmingly favored. Models are selected without replacement.
-    For pairs (num_to_select=2), ensures the two models have similar vote counts
-    to avoid unfair matchups between new and established models.
+    This ensures new models and models with fewer votes get more exposure, while
+    still allowing matchups between models with different vote counts for better
+    evaluation of new models against established ones.
     Assumes len(applicable_models) >= num_to_select, which should be checked by the caller.
     """
@@ -1261,30 +1262,6 @@ def get_weighted_random_models(
             # This should ideally not happen if chosen_model came from current_candidates.
             app.logger.error(f"Error removing model {chosen_model.id} from weighted selection candidates.")
             break # Avoid potential issues
-        # For the second model in a pair, adjust weights based on vote count similarity
-        if i == 0 and num_to_select == 2 and current_candidates:
-            first_model_votes = model_votes_counts[chosen_model.id]
-            # Calculate similarity-based weight adjustments
-            similarity_weights = []
-            for j, candidate in enumerate(current_candidates):
-                candidate_votes = model_votes_counts[candidate.id]
-                # Calculate vote count difference ratio
-                # Models with similar vote counts get higher weights
-                vote_diff = abs(first_model_votes - candidate_votes)
-                max_votes = max(first_model_votes, candidate_votes, 1)
-                # Similarity factor: higher when vote counts are similar
-                # Range: 0.1 to 1.0
-                similarity_factor = 1.0 / (1.0 + (vote_diff / max_votes))
-                # Combine original weight with similarity factor
-                adjusted_weight = current_weights[j] * (0.3 + 0.7 * similarity_factor)
-                similarity_weights.append(adjusted_weight)
-            current_weights = similarity_weights
     return selected_models_list
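
For reference, the multiplier applied by the removed block can be reproduced in isolation. The sketch below copies the arithmetic from the deleted lines; the function name `similarity_multiplier` is hypothetical. For non-negative vote counts, `vote_diff / max_votes` lies between 0 and 1, so the similarity factor falls between 0.5 and 1.0 (not 0.1 to 1.0 as the removed comment states), and the multiplier on the original weight ranges from roughly 0.65 to 1.0.

```python
def similarity_multiplier(first_model_votes: int, candidate_votes: int) -> float:
    """Return the factor the removed code multiplied into a candidate's weight."""
    vote_diff = abs(first_model_votes - candidate_votes)
    # The floor of 1 avoids division by zero when both models have no votes.
    max_votes = max(first_model_votes, candidate_votes, 1)
    # 1.0 when vote counts match, 0.5 when one side has zero votes.
    similarity_factor = 1.0 / (1.0 + (vote_diff / max_votes))
    return 0.3 + 0.7 * similarity_factor


if __name__ == "__main__":
    for pair in [(0, 0), (100, 100), (100, 200), (0, 500)]:
        print(pair, round(similarity_multiplier(*pair), 3))
```

Because the multiplier never dropped below about 0.65, the removed logic only mildly discouraged pairing a new model with an established one; the commit removes that bias entirely.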

ko_prompts.json CHANGED

@@ -7,16 +7,16 @@
     "채널코퍼레이션은 대한민국 서울 강남구 논현로 508 에 한국오피스를 두고있습니다.",
     "채널톡은 채팅 상담, AI 챗봇, CRM 마케팅, 사내 메신저가 결합된 올인원 AI 비즈니스 메신저입니다.",
     "한국어 TTS 아레나에 참여해주셔서 너무 감사드립니다! 다들 새해 복 많이 받으세요!",
-    "앞으로도 고객님께서 24시간 언제든지 편리하게 서비스를 이용하실 수 있도록 AI 상담 시스템을  지속적으로 개선하고 response time을 평균 150밀리초 이하로 유지하며 상담 품질 향상에 최선을 다하겠습니다.",
+    "앞으로도 고객님께서 24시간 언제든지 편리하게 서비스를 이용하실 수 있도록 AI 상담 시스템을 지속적으로 개선하고 response time을 평균 150밀리초 이하로 유지하며 상담 품질 향상에 최선을 다하겠습니다.",
     "둥둥에어를 이용해 주셔서 감사드리며 고객님께서 예약하신 DD201편은 기상 악화로 인해 출발 시 간이 약 40분 정도 지연될 예정입니다.",
     "둥둥레스토랑의 영업시간은 평일 오전 11시부터 오후 9시까지이며 브레이크 타임은 오후 3시부터 5시까지입니다.",
-    "둥둥호텔의 체크인은 오후 3시부터 가능하며 예약 정보 확인 후 빠르게 도 와드리겠습니다.",
+    "둥둥호텔의 체크인은 오후 3시부터 가능하며 예약 정보 확인 후 빠르게 도와드리겠습니다.",
     "객실과 공용 공간에서는 free wifi를 이용하실 수 있으며 네트워크 이름과 비밀번호는 객실 안내 문에서 확인하실 수 있습니다.",
-    "둥둥호텔의 체크인은 오후 3시부터 가능하며 예약 정보 확인 후 빠르게 도 와드리겠습니다.",
+    "둥둥호텔의 체크인은 오후 3시부터 가능하며 예약 정보 확인 후 빠르게 도와드리겠습니다.",
     "둥둥호텔의 조식은 매일 오전 7시부터 10시까지 1층 레스토랑에서 제공되며 뷔페 형식으로 다양한 메뉴를 이용하실 수 있습니다.",
-    "수영장 이용 시 수영모 착용은 필수이며 안전을 위해 만 12세 이하 고객님께서는 보호자 동반이  필요합니다.",
+    "수영장 이용 시 수영모 착용은 필수이며 안전을 위해 만 12세 이하 고객님께서는 보호자 동반이 필요합니다.",
     "피트니스 센터는 투숙객 전용 시설로 24시간 이용 가능하며 심야 시간대에는 안전을 위해 직원 호출이 필요할 수 있습니다.",
     "현재 선로 사정으로 인해 열차 운행이 5분 정도 지연되고 있으니 이용에 참고해 주시기 바랍니다.",
-    "승강장에서는 안전선 안쪽으로 물러서 주시고 열차가 완전히 정차한 후 차례대로 승차해 주시기  바랍니다."
+    "승강장에서는 안전선 안쪽으로 물러서 주시고 열차가 완전히 정차한 후 차례대로 승차해 주시기 바랍니다."
   ]
 }