allenai/MolmoPoint-GUI-8B

Image-Text-to-Text

chrisc36 committed on Mar 17

Commit d8985a8 · verified · 1 Parent(s): df1e3b6

Update README.md

Files changed (1)

README.md +3 -4

@@ -3,8 +3,7 @@ license: apache-2.0
 language:
 - en
 base_model:
-- Qwen/Qwen3-8B
-- google/siglip-so400m-patch14-384
+- allenai/MolmoPoint-8B
 pipeline_tag: image-text-to-text
 tags:
 - multimodal
@@ -14,8 +13,8 @@ tags:
 - molmo_point
 ---
-# MolmoPoint-Img-8B
-MolmoPoint-Img-8B is a fully-open VLM developed by the Allen Institute for AI (Ai2) that is specialized for GUI pointing.
+# MolmoPoint-GUI-8B
+MolmoPoint-GUI-8B is a fully-open VLM developed by the Allen Institute for AI (Ai2) that is specialized for GUI pointing.
 As specialized model, it only supports single image input with instruction-like queries, and will output a single point.
 See MolmoPoint-8B for a generalist model.
 MolmoPoint-Img-8B points using grounding-tokens instead of text coordinates, see our paper for details.