Update README.md
README.md CHANGED

@@ -160,9 +160,8 @@ To achieve optimal grounding performance, we recommend:
  - Maintain original aspect ratios when resizing

  2. **Prompt Engineering**:
- - Be specific about the target element (e.g., "Click on the blue 'Submit' button in the top-right corner")
+ - Be specific about the target element (e.g., "Click on the blue 'Submit' button in the top-right corner" or "Click on the following element: Save")
  - Include element attributes when available (color, position, text)
- - Use consistent terminology matching the UI

  3. **Generation Parameters**:
  - Use `temperature=0.0` for deterministic grounding
@@ -177,7 +176,6 @@ To achieve optimal grounding performance, we recommend:
  5. **Post-processing**:
  - Parse `<tool_call>` tags to extract JSON
  - Validate coordinates are within screen bounds
- - Handle cases where model may describe element instead of providing coordinates

  ## Training

@@ -191,8 +189,8 @@ For detailed training instructions, dataset preparation, and reproduction steps,
  ## Limitations and Future Work

  - **Desktop-focused**: Primarily trained on desktop environments (though shows strong cross-platform generalization)
- - **Action space**: Currently supports mouse
- - **Languages**: Optimized for English UI elements
+ - **Action space**: Currently supports mouse click action only
+ - **Languages**: Optimized for English UI elements
  - **Resolution**: Performance may vary with extremely high or low resolution images

  ## Citation
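
Reviewer note: the two post-processing bullets kept in this revision (parse `<tool_call>` tags to extract JSON, validate coordinates are within screen bounds) could be sketched roughly as below. This is only an illustration under stated assumptions: the diff does not show the JSON schema, so the payload shape (`arguments.coordinate` as `[x, y]`) and the helper name `extract_click` are hypothetical and are not taken from the README.

```python
import json
import re

# Captures whatever sits between <tool_call> and </tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_click(output, width, height):
    """Return (x, y) from the first usable tool call, or None.

    Assumption (not from the README): the payload looks like
    {"name": ..., "arguments": {"coordinate": [x, y]}}.
    """
    for payload in TOOL_CALL_RE.findall(output):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # malformed JSON: try the next tag, if any
        if not isinstance(call, dict):
            continue
        args = call.get("arguments")
        coord = args.get("coordinate") if isinstance(args, dict) else None
        if not (isinstance(coord, list) and len(coord) == 2):
            continue
        x, y = coord
        # Reject non-numeric values (bool is excluded explicitly).
        if not all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (x, y)
        ):
            continue
        # Validate that the point lies within the screen bounds.
        if 0 <= x < width and 0 <= y < height:
            return (x, y)
    return None
```

The function returns `None` whenever no usable coordinates are found, for example when the output contains no `<tool_call>` tag at all, so the caller can decide how to handle that case.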