Multiple Randomization Designs: Estimation and Inference with Interference
Paper • 2112.13495 • Published
AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward