Add pipeline tag, library name, and paper link to model card

Hi! I'm Niels, part of the community science team at Hugging Face.

This pull request improves the model card for this repository:
- Added the `image-text-to-text` pipeline tag for better discoverability on the Hub.
- Added `library_name: transformers` to enable automated code snippets.
- Added a link to the research paper on Hugging Face Papers.
- Included the citation information and relevant links to the GitHub repository and project page.

The model architecture (`Qwen3VLForConditionalGeneration`) indicates compatibility with the Transformers library.
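
For reviewers who want to sanity-check the metadata fields listed above, here is a minimal sketch using only the standard library. The helper names (`parse_front_matter`, `missing_fields`) and the sample card text are hypothetical and only for illustration. The parser handles just the flat `key: value` and `- item` lines that appear in this card; it is not a general YAML parser.

```python
# Hypothetical helpers for illustration only; not part of this repository.
# Assumption: the front matter is flat YAML (scalar values or simple "- item" lists).

REQUIRED_FIELDS = ("pipeline_tag", "library_name", "license")

SAMPLE_CARD = """---
base_model:
- Qwen/Qwen3-VL-4B-Instruct
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- VLM
- GUI
---
# Introduction
"""


def parse_front_matter(readme_text: str) -> dict:
    """Parse the leading '---' block of a model card into a dict."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at all
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the front matter block
        if not stripped:
            continue
        is_item = stripped.startswith("- ")
        if is_item and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent "key:" line.
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    return meta


def missing_fields(meta: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required keys that are absent or empty, in order."""
    return [name for name in required if not meta.get(name)]


meta = parse_front_matter(SAMPLE_CARD)
print(meta["pipeline_tag"], meta["library_name"])
print(missing_fields(meta))
```

For real cards, a proper YAML parser is the safer choice, since this sketch ignores nesting, quoting, and comments.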

README.md +30 -11

@@ -1,12 +1,14 @@
 ---
-license: apache-2.0
 datasets:
 - GUI-Libra/GUI-Libra-81K-RL
 - GUI-Libra/GUI-Libra-81K-SFT
 language:
 - en
-base_model:
-- Qwen/Qwen3-VL-4B-Instruct
 tags:
 - VLM
 - GUI
@@ -15,12 +17,12 @@ tags:
 # Introduction
-The models from paper "GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL".
-**GitHub:** https://github.com/GUI-Libra/GUI-Libra
 **Website:** https://GUI-Libra.github.io
 # Usage
 ## 1) Start an OpenAI-compatible vLLM server
@@ -28,7 +30,7 @@ The models from paper "GUI-Libra: Training Native GUI Agents to Reason and Act w
 ```bash
 pip install -U vllm
 vllm serve GUI-Libra/GUI-Libra-4B --port 8000 --api-key token-abc123
-````
 * Endpoint: `http://localhost:8000/v1`
 * The `api_key` here must match `--api-key`.
@@ -79,11 +81,17 @@ action_type: Scroll, action_target: None, value: "up" | "down" | "left" | "right
 task_desc = 'Go to Amazon.com and buy a math book'
 prev_txt = ''
-question_description = '''Please generate the next move according to the UI screenshot {}, instruction and previous actions.\n\nInstruction: {}\n\nInteraction History: {}\n'''
 img_size_string = '(original image size {}x{})'.format(img_size[0], img_size[1])
 query = question_description.format(img_size_string, task_desc, prev_txt)
-query = query + '\n' + '''The response should be structured in the following format:
 <thinking>Your step-by-step thought process here...</thinking>
 <answer>
 {
@@ -125,5 +133,16 @@ python minimal_infer.py
 * If you hit OOM or slowdowns, reduce image size or run fewer concurrent requests.
 * The example assumes your vLLM server is running locally on port `8000`.

 ---
+base_model:
+- Qwen/Qwen3-VL-4B-Instruct
 datasets:
 - GUI-Libra/GUI-Libra-81K-RL
 - GUI-Libra/GUI-Libra-81K-SFT
 language:
 - en
+license: apache-2.0
+pipeline_tag: image-text-to-text
+library_name: transformers
 tags:
 - VLM
 - GUI
 # Introduction
+This repository contains the weights for **GUI-Libra-4B**, a native GUI agent model presented in the paper [GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL](https://huggingface.co/papers/2602.22190).
+**GitHub:** https://github.com/GUI-Libra/GUI-Libra
 **Website:** https://GUI-Libra.github.io
+GUI-Libra is a post-training framework that turns open-source VLMs into strong native GUI agents—models that see a screenshot, think step-by-step, and output an executable action, all within a single forward pass.
 # Usage
 ## 1) Start an OpenAI-compatible vLLM server
 ```bash
 pip install -U vllm
 vllm serve GUI-Libra/GUI-Libra-4B --port 8000 --api-key token-abc123
+```
 * Endpoint: `http://localhost:8000/v1`
 * The `api_key` here must match `--api-key`.
 task_desc = 'Go to Amazon.com and buy a math book'
 prev_txt = ''
+question_description = '''Please generate the next move according to the UI screenshot {}, instruction and previous actions.\n\nInstruction: {}\n\nInteraction History: {}\n'''
 img_size_string = '(original image size {}x{})'.format(img_size[0], img_size[1])
 query = question_description.format(img_size_string, task_desc, prev_txt)
+query = query + '\n' + '''The response should be structured in the following format:
 <thinking>Your step-by-step thought process here...</thinking>
 <answer>
 {
 * If you hit OOM or slowdowns, reduce image size or run fewer concurrent requests.
 * The example assumes your vLLM server is running locally on port `8000`.
+## Citation
+```bibtex
+@misc{yang2026guilibratrainingnativegui,
+      title={GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL},
+      author={Rui Yang and Qianhui Wu and Zhaoyang Wang and Hanyang Chen and Ke Yang and Hao Cheng and Huaxiu Yao and Baoling Peng and Huan Zhang and Jianfeng Gao and Tong Zhang},
+      year={2026},
+      eprint={2602.22190},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2602.22190},
+}
+```