HUFS-DILAB

university

https://dilab.hufs.ac.kr/

Recent Activity

jsjang0104 updated a dataset about 1 month ago

HUFS-DILAB/QE-wmt21-1k-prometheus

jsjang0104 updated a dataset about 1 month ago

HUFS-DILAB/QE-wmt21-1k-cometkiwi

jsjang0104 published a dataset about 2 months ago

HUFS-DILAB/QE-wmt21-1k-prometheus

updated 2 datasets about 1 month ago

HUFS-DILAB/QE-wmt21-1k-prometheus

Viewer • Updated May 21 • 100 • 6

HUFS-DILAB/QE-wmt21-1k-cometkiwi

Viewer • Updated May 20 • 1k • 17 • 1

published 2 datasets about 2 months ago

HUFS-DILAB/QE-wmt21-1k-prometheus

Viewer • Updated May 21 • 100 • 6

HUFS-DILAB/QE-wmt21-1k-cometkiwi

Viewer • Updated May 20 • 1k • 17 • 1

updated a dataset about 2 months ago

HUFS-DILAB/MT-wmt14-500k-opus-mt-en-de

Viewer • Updated May 13 • 500k • 32

published a dataset about 2 months ago

HUFS-DILAB/MT-wmt14-500k-opus-mt-en-de

Viewer • Updated May 13 • 500k • 32

updated a dataset 3 months ago

HUFS-DILAB/PREPAIR-reproduction-results

Viewer • Updated Mar 25 • 319 • 7

updated a dataset 3 months ago

HUFS-DILAB/reproduce-MT-crowd-xcomet-xxl-llms

Viewer • Updated Mar 24 • 7.03k • 3

published a dataset 3 months ago

HUFS-DILAB/reproduce-MT-crowd-xcomet-xxl-llms

Viewer • Updated Mar 24 • 7.03k • 3

updated a dataset 3 months ago

HUFS-DILAB/reproduce-MT-crowd-metricx-xxl-llms

Viewer • Updated Mar 24 • 7.03k • 3

published a dataset 3 months ago

HUFS-DILAB/reproduce-MT-crowd-metricx-xxl-llms

Viewer • Updated Mar 24 • 7.03k • 3

updated a dataset 3 months ago

HUFS-DILAB/reproduce-MT-llama-target

Viewer • Updated Mar 19 • 5.07k • 25

published a dataset 3 months ago

HUFS-DILAB/reproduce-MT-llama-target

Viewer • Updated Mar 19 • 5.07k • 25

updated a dataset 3 months ago

HUFS-DILAB/reproduce-MT-llama-source

Viewer • Updated Mar 19 • 5.07k • 4

published a dataset 3 months ago

HUFS-DILAB/reproduce-MT-llama-source

Viewer • Updated Mar 19 • 5.07k • 4

published a dataset 4 months ago

HUFS-DILAB/PREPAIR-reproduction-results

Viewer • Updated Mar 25 • 319 • 7

authored 4 papers 4 months ago

FLEX: Expert-level False-Less EXecution Metric for Reliable Text-to-SQL Benchmark

Paper • 2409.19014 • Published Sep 24, 2024

Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge

Paper • 2502.16457 • Published Feb 23, 2025 • 12

TextME: Bridging Unseen Modalities Through Text Descriptions

Paper • 2602.03098 • Published Feb 3

Overthinking Loops in Agents: A Structural Risk via MCP Tools

Paper • 2602.14798 • Published Feb 16 • 1