Would you still call this Dax? Novel Visual References in VLMs and Humans Paper • 2606.05409 • Published 24 days ago • 8
Forecasting Downstream Performance of LLMs With Proxy Metrics Paper • 2605.18607 • Published May 18 • 14
LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published Mar 11 • 44
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13
Airavata Evaluation Suite Collection A collection of benchmarks used to evaluate Airavata, a Hindi instruction-tuned model built on top of Sarvam's OpenHathi base model. • 20 items • Updated Mar 2 • 10