Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Article • Published Mar 12, 2025 • 486
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Paper • 2411.04997 • Published Nov 7, 2024 • 39
ReNoise: Real Image Inversion Through Iterative Noising Paper • 2403.14602 • Published Mar 21, 2024 • 21
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Paper • 2403.14520 • Published Mar 21, 2024 • 35
FlashTex: Fast Relightable Mesh Texturing with LightControlNet Paper • 2402.13251 • Published Feb 20, 2024 • 14
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20, 2024 • 50