Lakshan Cooray

Lakshan2003

Lakshan2023
lakshan-cooray

AI & ML interests

Natural Language Generation, LLMs, SLMs, LLM based Evaluation, Graph RAG, Industrial NLP

Recent Activity

updated a dataset about 1 month ago

Lakshan2003/customer-support-context-summary-50k

updated a dataset about 2 months ago

Lakshan2003/context-summarization-llm-judge-results

published a dataset about 2 months ago

Lakshan2003/context-summarization-llm-judge-results

Lakshan2003's collections (14)

Context Summarization Model Inference Outputs

Lakshan2003/Llama3.2-instruct-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 14
Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 7
Lakshan2003/Qwen3-8B-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 12
Lakshan2003/Qwen3-4B-customerservice-context-summarization-evaldata

Viewer • Updated Mar 4 • 10k • 9

Customer Service Context Summarization Fine-tuned Models

Fine-tuned models for context summarization in multi-turn customer service conversations.

Lakshan2003/Qwen3-8B-Instruct-customerservice-context-summary

Summarization • Updated Mar 22
Lakshan2003/Phi-4-mini-instruct-customerservice-context-summary

Summarization • Updated Mar 22
Lakshan2003/Qwen3-4B-instruct-customerservice-context-summary

Summarization • Updated Mar 22
Lakshan2003/Llama3.2-3B-instruct-customerservice-context-summary

Text Generation • Updated Mar 22

Customer Service Context Summarization Evaluation Data

Per-model evaluation datasets (~10k rows each) for context summarization experiments in customer service conversations.

Lakshan2003/Llama3.2-instruct-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 14
Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 7
Lakshan2003/Qwen3-8B-customerservice-context-summarization-evaldata

Viewer • Updated Mar 3 • 10k • 12
Lakshan2003/Qwen3-4B-customerservice-context-summarization-evaldata

Viewer • Updated Mar 4 • 10k • 9

Customer Service Human Evaluation Data (Evaluator 3)

Per-model human evaluation datasets (evaluator_3) for customer service client-agent conversations.

Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_3_data

Viewer • Updated Jan 15 • 500 • 12
Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_3_data

Viewer • Updated Jan 15 • 500 • 13
Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_3_data

Viewer • Updated Jan 15 • 500 • 16
Lakshan2003/GPT-4.1-customerservice-Human-evaluator_3_data

Viewer • Updated Jan 15 • 500 • 12

Customer Service Human Evaluation Data (Evaluator 1)

Per-model human evaluation datasets (evaluator_1) for customer service client-agent conversations.

Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_1_data

Viewer • Updated Jan 15 • 500 • 7
Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_1_data

Viewer • Updated Jan 15 • 500 • 9
Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_1_data

Viewer • Updated Jan 15 • 500 • 7
Lakshan2003/GPT-4.1-customerservice-Human-evaluator_1_data

Viewer • Updated Jan 15 • 500 • 6

Pairwise Comparison (Gemini-2.5-Flash vs SLMs)

Pairwise comparison datasets used to evaluate SLM responses against Gemini-2.5-Flash on customer service client-agent conversations.

Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-1.7b

Viewer • Updated Jan 22 • 1k • 2
Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-4b

Viewer • Updated Jan 22 • 1k • 3
Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-8b

Viewer • Updated Jan 22 • 1k • 2
Lakshan2003/pairwise-gemini-2.5-flash-vs-phi-4-mini

Viewer • Updated Jan 22 • 1k • 3

Customer Service LLM-as-a-Judge Evaluation Data

Per-model LLM-as-a-Judge evaluation datasets (~6k rows each) generated for customer service client-agent conversations.

Lakshan2003/Qwen3-1.7B-customerservice-LLM-as-a-judge-data

Viewer • Updated Jan 21 • 6k • 8
Lakshan2003/Qwen3-4B-customerservice-LLM-as-a-judge-data

Viewer • Updated Jan 21 • 6k • 6
Lakshan2003/Qwen3-8B-customerservice-LLM-as-a-judge-data

Viewer • Updated Jan 21 • 6k • 8
Lakshan2003/Llama3.1-8b-instruct-customerservice-LLM-as-a-judge-data

Viewer • Updated Jan 20 • 6k • 6

Customer Service Context Summarization LLM-as-a-judge

Lakshan2003/gemini-2.5-flash-customerservice-context-summarization-llm-judge-data

Viewer • Updated Mar 22 • 1k • 10
Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-llm-judge-data

Viewer • Updated Mar 22 • 1k • 8
Lakshan2003/Llama3.2-3B-instruct-customerservice-context-summarization-llm-judge-data

Viewer • Updated Mar 22 • 1k • 9
Lakshan2003/Phi-4-mini-customerservice-context-summarization-llm-judge-data

Viewer • Updated Mar 22 • 1k • 8

Customer Service QA Fine-tuned SLMs

Fine-tuned SLMs for context-summarized multi-turn customer service response generation.

Lakshan2003/SmolLM3-3B-instruct-customerservice

Text Generation • Updated Feb 7 • 1
Lakshan2003/Qwen3-4B-instruct-customerservice

Text Generation • Updated Feb 7
Lakshan2003/Phi-4-mini-instruct-customerservice

Text Generation • Updated Feb 7 • 1
Lakshan2003/Qwen3-1.7B-instruct-customerservice

Text Generation • Updated Feb 7

SLM Cost Benchmarking Datasets

Datasets used for benchmarking computational cost and inference efficiency of SLMs in customer service QA experiments.

Lakshan2003/slm-cost-benchmark-testset-1000

Viewer • Updated Feb 1 • 1k • 6
Lakshan2003/slm-cost-benchmark-final-summary

Viewer • Updated Feb 7 • 9 • 6
Lakshan2003/Qwen3-1.7B-Instruct-cost-benchmark-results

Viewer • Updated Feb 7 • 1k • 7
Lakshan2003/Qwen3-4B-Instruct-cost-benchmark-results

Viewer • Updated Feb 5 • 1k • 4

Customer Service Human Evaluation Data (Evaluator 2)

Per-model human evaluation datasets (evaluator_2) for customer service client-agent conversations.

Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_2_data

Viewer • Updated Jan 15 • 500 • 8
Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_2_data

Viewer • Updated Jan 15 • 500 • 7
Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_2_data

Viewer • Updated Jan 15 • 500 • 9
Lakshan2003/GPT-4.1-customerservice-Human-evaluator_2_data

Viewer • Updated Jan 15 • 500 • 63

Pairwise Comparison Datasets (Virtuoso-Large vs SLMs)

Pairwise comparison datasets used to evaluate SLM responses against Virtuoso-Large on customer service client-agent conversations.

Lakshan2003/pairwise-virtuoso-large-vs-qwen3-1.7b

Viewer • Updated Jan 22 • 1k • 4
Lakshan2003/pairwise-virtuoso-large-vs-qwen3-4b

Viewer • Updated Jan 22 • 1k • 4
Lakshan2003/pairwise-virtuoso-large-vs-qwen3-8b

Viewer • Updated Jan 22 • 1k • 2
Lakshan2003/pairwise-virtuoso-large-vs-phi-4-mini

Viewer • Updated Jan 22 • 1k • 2

Pairwise Comparison (GPT-4.1 vs SLMs)

Pairwise comparison datasets used to evaluate SLM responses against GPT-4.1.

Lakshan2003/pairwise-gpt4.1-vs-qwen3-1.7b

Viewer • Updated Jan 22 • 1k • 1
Lakshan2003/pairwise-gpt4.1-vs-qwen3-4b

Viewer • Updated Jan 22 • 1k • 4
Lakshan2003/pairwise-gpt4.1-vs-qwen3-8b

Viewer • Updated Jan 22 • 1k • 5
Lakshan2003/pairwise-gpt4.1-vs-phi-4-mini

Viewer • Updated Jan 22 • 1k • 4

Customer Service SLM/LLM Inference Outputs

Per-model inference outputs from SLMs and LLMs evaluated on customer service client-agent conversations.

Lakshan2003/Qwen3-1.7B-customerservice-evaldata

Viewer • Updated Jan 20 • 36.7k • 6
Lakshan2003/Qwen3-4B-customerservice-evaldata

Viewer • Updated Oct 13, 2025 • 36.7k • 4 • 1
Lakshan2003/Qwen3-8B-customerservice-evaldata

Viewer • Updated Jan 19 • 36.7k • 7
Lakshan2003/Llama3.1-8b-instruct-customerservice-evaldata

Viewer • Updated Jan 20 • 36.7k • 12