FTibSuite: A Comprehensive Resource Suite for Tibetan Vision-Language Modeling Paper • 2605.26601 • Published 7 days ago
Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation Paper • 2605.29502 • Published 5 days ago
Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax Paper • 2605.14366 • Published 19 days ago
ML-Embed: Inclusive and Efficient Embeddings for a Multilingual World Paper • 2605.15081 • Published 19 days ago • 11
Beyond Retrieval: A Multitask Benchmark and Model for Code Search Paper • 2605.04615 • Published 27 days ago • 23
TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale Paper • 2604.21889 • Published Apr 23 • 12
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published Mar 19 • 33
C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling Paper • 2512.21332 • Published Dec 24, 2025 • 17
SHRP: Specialized Head Routing and Pruning for Efficient Encoder Compression Paper • 2512.20635 • Published Dec 3, 2025
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data Paper • 2510.10159 • Published Oct 11, 2025 • 3
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data Paper • 2510.02294 • Published Oct 2, 2025 • 48
CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects Paper • 2509.14856 • Published Sep 18, 2025 • 2
CMHG: A Dataset and Benchmark for Headline Generation of Minority Languages in China Paper • 2509.09990 • Published Sep 12, 2025 • 3
ArguGPT: evaluating, understanding and identifying argumentative essays generated by GPT models Paper • 2304.07666 • Published Apr 16, 2023
From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms Paper • 2508.10860 • Published Aug 14, 2025 • 3
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning Paper • 2505.23754 • Published May 29, 2025 • 15