BAJUKA commited on
Commit
dcc653f
·
verified ·
1 Parent(s): 1d331fe

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -5
README.md CHANGED
@@ -160,9 +160,8 @@ To achieve optimal grounding performance, we recommend:
160
  - Maintain original aspect ratios when resizing
161
 
162
  2. **Prompt Engineering**:
163
- - Be specific about the target element (e.g., "Click on the blue 'Submit' button in the top-right corner")
164
  - Include element attributes when available (color, position, text)
165
- - Use consistent terminology matching the UI
166
 
167
  3. **Generation Parameters**:
168
  - Use `temperature=0.0` for deterministic grounding
@@ -177,7 +176,6 @@ To achieve optimal grounding performance, we recommend:
177
  5. **Post-processing**:
178
  - Parse `<tool_call>` tags to extract JSON
179
  - Validate coordinates are within screen bounds
180
- - Handle cases where model may describe element instead of providing coordinates
181
 
182
  ## Training
183
 
@@ -191,8 +189,8 @@ For detailed training instructions, dataset preparation, and reproduction steps,
191
  ## Limitations and Future Work
192
 
193
  - **Desktop-focused**: Primarily trained on desktop environments (though shows strong cross-platform generalization)
194
- - **Action space**: Currently supports mouse and keyboard actions; additional modalities under exploration
195
- - **Languages**: Optimized for English UI elements; multilingual support in development
196
  - **Resolution**: Performance may vary with extremely high or low resolution images
197
 
198
  ## Citation
 
160
  - Maintain original aspect ratios when resizing
161
 
162
  2. **Prompt Engineering**:
163
+ - Be specific about the target element (e.g., "Click on the blue 'Submit' button in the top-right corner" or "Click on the following element: Save")
164
  - Include element attributes when available (color, position, text)
 
165
 
166
  3. **Generation Parameters**:
167
  - Use `temperature=0.0` for deterministic grounding
 
176
  5. **Post-processing**:
177
  - Parse `<tool_call>` tags to extract JSON
178
  - Validate coordinates are within screen bounds
 
179
 
180
  ## Training
181
 
 
189
  ## Limitations and Future Work
190
 
191
  - **Desktop-focused**: Primarily trained on desktop environments (though shows strong cross-platform generalization)
192
+ - **Action space**: Currently supports mouse click action only
193
+ - **Languages**: Optimized for English UI elements
194
  - **Resolution**: Performance may vary with extremely high or low resolution images
195
 
196
  ## Citation