---
title: UI Screen Description Generator With Pix2Struct
emoji: 🐨
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.28.0
app_file: app.py
pinned: false
license: mit
short_description: Built a vision-language application
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# UI Screen Describer with Pix2Struct

This demo uses Google's `pix2struct-screen2words-large` model to turn UI screenshots into natural-language descriptions.

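### Use Cases
- Accessibility: text descriptions of screens for users who rely on screen readers
- UI testing: checking that a rendered screen shows what it is expected to show
- Automated documentation: drafting captions for screenshots
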
### How it works
Upload a screenshot (e.g., of an app, webpage, or dashboard), and the model generates a text description of it.

Built with Hugging Face Transformers and Gradio.
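
The flow described above can be sketched roughly as follows. This is a minimal illustration of what an `app.py` for this demo could look like, not the actual code of this Space. It assumes the model is loaded from the Hub id `google/pix2struct-screen2words-large` through the Transformers classes `Pix2StructProcessor` and `Pix2StructForConditionalGeneration`. The function names (`load_model`, `describe_screenshot`, `build_demo`), the `max_new_tokens` default, and the empty-input message are hypothetical choices made for the example.

```python
"""Hypothetical app.py sketch: UI screenshot in, short text description out."""

# Assumed Hub id for the model named in this README.
MODEL_ID = "google/pix2struct-screen2words-large"


def load_model(model_id=MODEL_ID):
    """Load the processor and model (downloads weights on first use)."""
    # Imported lazily so describe_screenshot can be exercised without the weights.
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    processor = Pix2StructProcessor.from_pretrained(model_id)
    model = Pix2StructForConditionalGeneration.from_pretrained(model_id)
    return processor, model


def describe_screenshot(image, processor, model, max_new_tokens=64):
    """Return a text description of a PIL screenshot, or a hint if none was given."""
    if image is None:
        return "Please upload a screenshot."
    # The processor turns the image into the patch tensors the model expects.
    inputs = processor(images=image, return_tensors="pt")
    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) generated sequence back into text.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()


def build_demo():
    """Wire the model into a simple Gradio interface (illustrative layout)."""
    import gradio as gr

    processor, model = load_model()
    return gr.Interface(
        fn=lambda image: describe_screenshot(image, processor, model),
        inputs=gr.Image(type="pil", label="UI screenshot"),
        outputs=gr.Textbox(label="Description"),
        title="UI Screen Describer with Pix2Struct",
    )

# To serve the app locally, call: build_demo().launch()
```

Passing the processor and model into `describe_screenshot` as arguments keeps the inference step separate from model loading, so the weights are loaded once in `build_demo` and the function can be tested with stand-in objects.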