Parallel Complex Diffusion for Scalable Time Series Generation Paper • 2602.17706 • Published Feb 10 • 8
ML-Embed: Inclusive and Efficient Embeddings for a Multilingual World Paper • 2605.15081 • Published 16 days ago • 11
Beyond Retrieval: A Multitask Benchmark and Model for Code Search Paper • 2605.04615 • Published 24 days ago • 23
TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale Paper • 2604.21889 • Published Apr 23 • 12
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark Paper • 2603.26017 • Published Mar 27 • 31
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published Mar 19 • 33
C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling Paper • 2512.21332 • Published Dec 24, 2025 • 17
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning Paper • 2311.02303 • Published Nov 4, 2023 • 12
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning Paper • 2409.06679 • Published Sep 10, 2024 • 4
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models Paper • 2410.06741 • Published Oct 9, 2024 • 3
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search Paper • 2406.17262 • Published Jun 25, 2024 • 5
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data Paper • 2510.02294 • Published Oct 2, 2025 • 48
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions Paper • 2410.06577 • Published Oct 9, 2024 • 14
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks Paper • 2505.16901 • Published May 22, 2025 • 48
Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM Paper • 2503.17793 • Published Mar 22, 2025 • 24