Any idea how to make it describe what's physically there and omit the artistic critique?

by nawal2 - opened Dec 21, 2023

Using this image:

and this llama.cpp server command:

./server -m models/obsidian-f16.gguf --mmproj models/mmproj-obsidian-f16.gguf --host 0.0.0.0 -ngl 42

and this prompt:

<|im_start|>user
What does this image contain? Describe each item, including the color in the description. Only describe physical objects present in the image. Do not make any other comments.\n[img-1]
###
<|im_start|>assistant''',

it replies:

The image features a wooden table with three Legos blocks on it. Two of the blocks are red, and one is blue. They are placed in a way that makes them look like they are standing up against a white wall. This arrangement creates an interesting visual effect that adds depth to the scene. The Legos are positioned in such a way that they appear to be looking at the camera, capturing attention with their vibrant colors and unique design.

This is convincing and all, but I don't really want it to offer a judgement of the blocks. Has anyone had any success making it obey?
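
For anyone who wants to reproduce this, here is a rough sketch of how the request above could be sent to the llama.cpp server's /completion endpoint. The helper names (`build_payload`, `describe`), the default URL, and the sampling values are my own assumptions for illustration. The low temperature, the `n_predict` cap, and the "###" stop string are guesses at settings that might cut down the extra commentary. I have not confirmed that they do.

```python
import base64
import json
import urllib.request

# Prompt from the post. Obsidian marks the end of a turn with "###".
PROMPT = (
    "<|im_start|>user\n"
    "What does this image contain? Describe each item, including the color "
    "in the description. Only describe physical objects present in the image. "
    "Do not make any other comments.\n[img-1]\n"
    "###\n"
    "<|im_start|>assistant"
)


def build_payload(image_bytes, temperature=0.1, n_predict=128):
    """Build the JSON body for /completion (hypothetical helper).

    temperature and n_predict are assumed values, not tested settings.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "prompt": PROMPT,
        # The id must match the [img-1] placeholder in the prompt.
        "image_data": [{"data": encoded, "id": 1}],
        "temperature": temperature,
        "n_predict": n_predict,
        # Stop at the turn delimiter so the model cannot keep going.
        "stop": ["###"],
    }


def describe(image_path, url="http://localhost:8080/completion"):
    """Send the image to a locally running server and return its text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["content"]
```

Calling `describe("blocks.jpg")` (a placeholder file name) with the server from the command above running should return the description as a string.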
