Salma Mayorquin

salma-remyx

AI & ML interests

None yet

Recent Activity

posted an update 1 day ago
VQASynth is the open-source implementation of the https://huggingface.co/papers/2401.12168 paper, putting together the data synthesis pipeline behind https://huggingface.co/remyxai/SpaceQwen2.5-VL-3B-Instruct, https://huggingface.co/remyxai/SpaceThinker-Qwen2.5VL-3B, and several other spatial reasoning models we've shared here on HF.

From early development through production, different categories of evidence become available to guide what to try next. The strongest decisions combine evidence across categories rather than relying on any one.

Stage 1: Development history
Commit history holds the moments where things changed. For VQASynth, that's how scenes get parsed, how captions get generated, and how spatial relations get encoded. Even before a model is in production, those milestones are a strong signal for which methods are semantically relevant to where the system is now.

Stage 2: Observational outcomes
Once a model is serving, the same commit history lines those changes up against real-world results. That opens up quasi-experiments (a minimal sketch follows below this post): you get causal evidence about which changes drove which outcomes, and inference on questions you haven't directly tested.

Stage 3: Controlled experiments
When teams start running interventions, those outcomes tighten the estimates further. This is the regime most people associate with rigor, but it's expensive and gated by traffic.

Stage 4: Counterfactual perturbations
When A/B testing becomes the operational bottleneck, instrumenting decision points in the production system lets you probe what would have happened under alternative choices. Shadow mode first, live traffic once audits pass.

Experimentation maturity is a journey, and every stage offers something to learn from. More on these ideas: https://docs.remyx.ai/concepts/maturity-progression
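As a rough illustration of the Stage 2 idea, here is a minimal sketch of a quasi-experiment keyed to a commit's ship date. The data and names are invented for illustration and this is not a Remyx API; a real analysis would pull the metric series from your telemetry and control for trend and seasonality before attributing the shift to the change.

```python
# Hedged sketch of a quasi-experiment around a deploy commit.
# All numbers are synthetic; real usage would load a daily metric
# series and split it at the commit timestamp.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
before = rng.normal(loc=0.72, scale=0.03, size=30)  # 30 days pre-deploy
after = rng.normal(loc=0.75, scale=0.03, size=30)   # 30 days post-deploy

# Welch's t-test: did the post-deploy window shift relative to baseline?
t_stat, p_value = stats.ttest_ind(after, before, equal_var=False)
print(f"effect: {after.mean() - before.mean():+.3f}, "
      f"t={t_stat:.2f}, p={p_value:.3f}")

# A production version would use an interrupted time-series regression
# to control for trend/seasonality, but the commit timestamp is what
# turns observational logs into a before/after comparison at all.
```

The split point comes from version control rather than from a planned experiment, which is exactly what makes it a quasi-experiment.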
posted an update 3 days ago
SciCrafter measured something AI practitioners have intuited: frontier agents are improving at executing inside well-framed problems, but lag at framing the problem in the first place. GPT-5.2, Gemini-3-Pro, and Claude Opus 4.5 all plateaued near 26% on a new Minecraft benchmark probing AI capabilities in the discovery-to-application loop.

So the authors ran targeted interventions:
* Hints about what to investigate doubled performance.
* A structured experimentation template added 7-14 more points.
* Structured consolidation beat free-form summaries by 6 points.
* Curriculum context beat independent task-solving.

These interventions helped the agent frame what's worth investigating and structure what gets learned so it compounds. The bottleneck for AI in scientific workflows is upstream of execution.

Their findings are congruent with the design patterns we've adopted at Remyx AI to help AI teams close the development loop scientifically. Agents work well inside structured loops, but they perform poorly when tasked with creating the structure. Instrumenting your scientific workflows offers greater leverage than scaling compute on a less informed search (a sketch of such a template follows below this post).

In the work of building production AI systems, teams are flying through execution. The bigger challenge is identifying which experiments moved which production outcome, or what to try next.

One of the more interesting results I found this week by tracking work in AI for scientific workflows using Remyx: https://engine.remyx.ai/papers/d8f23b9b-b14b-4ada-b44e-ccfc221c06b4
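For concreteness, here is a minimal sketch of what a structured experimentation template could look like. The schema is hypothetical: neither the SciCrafter paper nor the Remyx product is confirmed to use these fields; it just illustrates recording one intervention, one metric, and consolidated takeaways instead of free-form notes.

```python
# Hypothetical experiment-record template; field names are illustrative.
from dataclasses import dataclass, field


@dataclass
class ExperimentRecord:
    hypothesis: str               # what you expect to happen, and why
    intervention: str             # the single change under test
    metric: str                   # how success is measured
    baseline: float               # metric value before the change
    result: float | None = None  # metric value after, filled in later
    takeaways: list[str] = field(default_factory=list)  # consolidated notes

    def consolidate(self) -> str:
        """Structured consolidation: the delta plus takeaways, not prose."""
        delta = ("pending" if self.result is None
                 else f"{self.result - self.baseline:+.3f}")
        return (f"{self.intervention} -> d({self.metric}): {delta}; "
                + "; ".join(self.takeaways))


# Usage with invented values:
rec = ExperimentRecord(
    hypothesis="Longer context improves spatial QA accuracy",
    intervention="raise max context 4k -> 8k",
    metric="val_accuracy",
    baseline=0.71,
)
rec.result = 0.74
rec.takeaways.append("gain concentrated on multi-object scenes")
print(rec.consolidate())
```

The point of the structure is that each record stays comparable across experiments, so what gets learned can compound the way the post describes.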

Organizations

Remyx AI