Salma Mayorquin PRO

salma-remyx

https://remyx.ai

smellslikeml

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

remyxai/dockergen-0.5b

updated a model 2 days ago

remyxai/dockergen-0.5b

published a model 2 days ago

remyxai/dockergen-0.5b

upvoted 2 collections 12 months ago

WorldVLA

https://github.com/alibaba-damo-academy/WorldVLA • 8 items • Updated Jun 25, 2025 • 1

JinaVDR (Visual Document Retrieval)

max. ~1000 images and OCR text included • 42 items • Updated Jul 20, 2025 • 9

upvoted 3 papers about 1 year ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding

Paper • 2505.17012 • Published May 22, 2025 • 12

Training-Free Reasoning and Reflection in MLLMs

Paper • 2505.16151 • Published May 22, 2025 • 9

upvoted 4 collections about 1 year ago

Perception Encoder

16 items • Updated Mar 2 • 82

SpaceThinker

Test Time Compute for Quantitative Spatial Reasoning using synthetic reasoning traces from 3D scene graphs • 7 items • Updated Oct 23, 2025 • 2

SpatialRGPT: Grounded Spatial Reasoning in VLMs

3 items • Updated Oct 11, 2024 • 5

Cosmos-Transfer1

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 5 items • Updated about 2 hours ago • 31

upvoted 8 collections over 1 year ago

Cosmos-Predict1

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated about 2 hours ago • 304

Qwen2-VL

Vision-language model series based on Qwen2 • 15 items • Updated Mar 2 • 233

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated Mar 2 • 90

LLM-Neo

Model hub for LLM-Neo, including Llama3.1-Neo-1B-100w and Minitron-4B-Depth-Neo-10w. • 3 items • Updated Nov 20, 2024 • 6

VLM Judge Distillation

Distilling the 13B SpaceLLaVA VLM-as-a-Judge into a Florence-2 model to efficiently quality filter spatialVQA datasets like OpenSpaces • 4 items • Updated Nov 14, 2024 • 1

DepthPro Models

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second • 4 items • Updated Aug 25, 2025 • 13

OpenSpaces VLMs

VLMs fine-tuned for spatial VQA using the OpenSpaces dataset. • 5 items • Updated Mar 30, 2025 • 2

Molmo

Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 310

upvoted a collection almost 2 years ago

SpaceVLMs

Features VLMs fine-tuned for enhanced spatial reasoning using a synthetic data pipeline similar to Spatial VLM. • 11 items • Updated Feb 13, 2025 • 7

upvoted a paper about 2 years ago

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 30

upvoted a paper over 2 years ago

LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning

Paper • 2309.06440 • Published Sep 12, 2023 • 10