Collections

Discover the best community collections!

Collections trending this week
Customer Service Human Evaluation Data (Evaluator 2)
Per-model human evaluation datasets (evaluator_2) for customer service client-agent conversations.
Pairwise Comparison Datasets (Virtuoso-Large vs SLMs)
Pairwise comparison datasets used to evaluate SLM responses against Virtuoso-Large on customer service client-agent conversations.
Customer Service SLM/LLM Inference Outputs
Per-model inference outputs from SLMs and LLMs evaluated on customer service client-agent conversations.
Customer Service Human Evaluation Data (Evaluator 1)
Per-model human evaluation datasets (evaluator_1) for customer service client-agent conversations.
Pairwise Comparison (Gemini-2.5-Flash vs SLMs)
Pairwise comparison datasets used to evaluate SLM responses against Gemini-2.5-Flash on customer service client-agent conversations.
Customer Service LLM-as-a-Judge Evaluation Data
Per-model LLM-as-a-Judge evaluation datasets (~6k rows each) generated for customer service client-agent conversations.