vikhyatk committed on
Commit
05d8a39
·
verified ·
1 Parent(s): 638927a

Update README.md

Files changed (1)
  1. README.md +30 -2
README.md CHANGED
@@ -32,8 +32,36 @@ moondream = AutoModelForCausalLM.from_pretrained(
32
  moondream.compile()
33
  ```
34
 
35
  * TODO: Add usage examples
36
- * Query
37
- * Caption
38
  * Detect
39
  * Point
 
32
  moondream.compile()
33
  ```
34
 
35
+ The model comes with four skills, each tailored to a different visual understanding task.
36
+
37
+ ### Query
38
+
39
+ The `query` skill can be used to ask open-ended questions about images.
40
+
41
+ ||TK -- code example for simple VQA||
42
+
43
+ By default, `query` runs in reasoning mode, allowing the model to "think" about the question before generating an answer. This is helpful for more complicated tasks, but sometimes the task you're running is simple and doesn't benefit from reasoning. To save on inference cost when this is the case, you can disable reasoning:
44
+
45
+ ||TK -- example without reasoning||
46
+
47
+ If you want to stream outputs, pass in `stream=True`. You can control the temperature, top-p, and maximum number of tokens generated by passing in optional settings.
48
+
49
+ ||TK -- stream + settings example||
50
+
51
+ Note that `query` isn't just for images; Moondream is also a strong general-purpose text model.
52
+
53
+ ||TK -- text only example||
54
+
55
+ ### Caption
56
+
57
+ Whether you want short, normal-sized or long descriptions of images, the `caption` skill has you covered.
58
+
59
+ ||TK -- captioning example||
60
+
61
+ It accepts the same streaming and sampling settings (temperature, top-p, maximum tokens) as the `query` skill.
62
+
63
+ ---
64
+
65
  * TODO: Add usage examples
66
  * Detect
67
  * Point
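
The `||TK||` placeholders in the added README text are still unfilled in this commit. As a rough sketch of the call shapes the prose describes, the snippet below uses a hypothetical `StubMoondream` class in place of the loaded model so it can run without downloading weights. Only `stream=True` and the existence of temperature, top-p, and max-token settings come from the README text. The keyword names `reasoning`, `settings`, and `length`, the settings keys, and the `"answer"` / `"caption"` result keys are assumptions for illustration, not confirmed API.

```python
from typing import Any, Dict, Iterator, Optional, Union


def _chunks(text: str, size: int = 8) -> Iterator[str]:
    """Yield the text in small pieces to imitate streamed output."""
    for i in range(0, len(text), size):
        yield text[i:i + size]


class StubMoondream:
    """Hypothetical stand-in for the loaded model; NOT the real Moondream API."""

    def query(
        self,
        image: Any = None,
        question: str = "",
        reasoning: bool = True,          # assumed flag name for reasoning mode
        stream: bool = False,            # `stream=True` is named in the README text
        settings: Optional[Dict[str, Any]] = None,  # assumed container for sampling options
    ) -> Dict[str, Union[str, Iterator[str]]]:
        text = f"(stub answer to: {question})"
        return {"answer": _chunks(text) if stream else text}

    def caption(
        self,
        image: Any,
        length: str = "normal",          # assumed values: "short", "normal", "long"
        stream: bool = False,
        settings: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Union[str, Iterator[str]]]:
        text = f"(stub {length} caption)"
        return {"caption": _chunks(text) if stream else text}


moondream = StubMoondream()  # in the README, this is the model from from_pretrained
image = None                 # placeholder; a real call would pass an actual image

# Simple question with reasoning disabled to save inference cost.
fast = moondream.query(image=image, question="What color is the car?", reasoning=False)
print(fast["answer"])

# Streaming with sampling settings (assumed key names).
settings = {"temperature": 0.5, "top_p": 0.9, "max_tokens": 64}
streamed = moondream.query(
    image=image, question="Describe the scene.", stream=True, settings=settings
)
streamed_text = "".join(streamed["answer"])  # consume the stream chunk by chunk
print(streamed_text)

# Text-only query: no image is passed.
text_only = moondream.query(question="What is 2 + 2?")
print(text_only["answer"])

# Short caption; the same stream/settings arguments are assumed to apply.
short = moondream.caption(image, length="short")
print(short["caption"])
```

Swapping `StubMoondream()` for the real model object would be the only change needed if the actual signatures match these assumptions; if they differ, the keyword names and result keys are the parts to adjust.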