Caco Collection Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning • 6 items • Updated Oct 9, 2025 • 1
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG Jan 15 • 65
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published Jun 24, 2024 • 63
RAE Collection Collection for Diffusion Transformers with Representation Autoencoders • 1 item • Updated Oct 14, 2025 • 11
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions Paper • 2503.20290 • Published Mar 26, 2025 • 1
ConECT Dataset: Overcoming Data Scarcity in Context-Aware E-Commerce MT Paper • 2506.04929 • Published Jun 5, 2025 • 2
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published Mar 6, 2025 • 72
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement Paper • 2406.11546 • Published Jun 17, 2024 • 1
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder Paper • 2409.14074 • Published Sep 21, 2024 • 3
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval Paper • 2506.18902 • Published Jun 23, 2025 • 12
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning Paper • 2506.00338 • Published May 31, 2025 • 10