videoeval_humaneval / feedback.csv
Youngsun Lim
minor correction
81cfb2c
ts_iso,participant_id,final_comment
2025-10-24T16:52:19.384807,TNEMERHPEG,none
2025-10-25T02:52:49.906506,9XYSRFY79G,no comments
2025-10-26T07:23:38.418145,KVKZUBPANQ,There are a lot of awkward parts
2025-10-26T07:23:41.638515,KVKZUBPANQ,There are a lot of awkward parts
2025-10-26T07:25:09.890904,KVKZUBPANQ,There are a lot of awkward parts
2025-10-26T07:25:13.404255,747S2K7GYA,"good squat, but bad hula hoop."
2025-10-26T07:25:15.705973,747S2K7GYA,"good squat, but bad hula hoop."
2025-10-26T10:17:04.960984,8EYFCMSVRG,Good
2025-10-26T10:17:08.782690,8EYFCMSVRG,Good
2025-10-26T14:37:26.925095,RVL788LS3Y,Thank you for your hard work.
2025-10-26T14:37:31.243598,RVL788LS3Y,Thank you for your hard work.
2025-10-27T01:05:22.406903,CW4WRWR4HL,I really enjoyed participating. Thank you so much!
2025-10-27T04:04:56.805589,RRNHDKQJ43,.
2025-10-27T06:06:04.671776,AFQFKJ9VEH,bodyweight squat bb
2025-10-27T08:30:47.484983,QSBDYJZE84,Thank you for the great survey.
2025-10-27T11:38:42.838557,YYZJNAEN53,"There are about 5 videos that look the same. Is that okay?
Some bodyweight workout videos are surprisingly sophisticated."
2025-10-27T15:58:25.133424,BER9QE5MZF,.
2025-10-28T00:15:21.902338,5ES65DQUWN,I wasn't sure if the interaction between the human and the sports equipment should be a huge factor in determining physical plausibility.
2025-10-28T03:00:54.138305,EUSJK4LNTY,None
2025-10-28T03:01:02.692128,EUSJK4LNTY,None
2025-10-28T03:01:16.838713,EUSJK4LNTY,None
2025-10-28T03:01:33.370855,EUSJK4LNTY,None
2025-10-28T03:01:36.331908,EUSJK4LNTY,None
2025-10-28T11:19:34.233849,ZUQLVUCDCD,It was a wonderful experience. Thank you.
2025-10-28T11:32:01.851906,FJLUPC9X4F,"It was a great experience, thank you.^^"
2025-10-28T11:53:47.053237,8FCMYACVMQ,:-D
2025-10-28T15:50:45.885093,4YGKM8PCAD,"It is difficult to rate both body weight squats and discus throws. The former often scores well in both categories, while the latter often fails in both categories. I am unsure whether to benchmark the average performance for each of these actions separately at a rating of 5 or whether to benchmark the quality over all actions at 5. I feel as if I have chosen to do a mix. Sorry if this description is confusing but I think this concept is difficult to describe."