AFRAgent : An Adaptive Feature Renormalization Based High Resolution Aware GUI agent
Paper: arXiv:2512.00846
This model is a checkpoint from AFRAgent (paper, WACV 2026): an Adaptive Feature Renormalization–based GUI agent for smartphone automation, built on InstructBLIP.
Run: all_data_any_res_adain_finetuning (bs128, ip512, op256, ep12). This checkpoint was saved at step 56266 (≈ epoch 7).
Loading the model requires the AFRAgent codebase, which provides the custom AnyResAdaIn class.
# Clone AFRAgent and add to path, then:
from models.any_res_adain_queries_fusion import AnyResAdaIn
from transformers import InstructBlipProcessor, AutoTokenizer
model = AnyResAdaIn.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
processor = InstructBlipProcessor.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
tokenizer = AutoTokenizer.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
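The "add to path" step mentioned in the comment above can be done from Python before the import. A minimal sketch, assuming the repository was cloned into a local directory named `AFRAgent` (a hypothetical location; adjust it to wherever your clone lives). `add_repo_to_path` is an illustrative helper, not part of the AFRAgent codebase:

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir: str) -> str:
    """Put a local clone at the front of sys.path so `models.*` imports resolve."""
    resolved = str(Path(repo_dir).expanduser().resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Hypothetical clone location; replace with your own path.
add_repo_to_path("AFRAgent")
```

Run this before the `from models.any_res_adain_queries_fusion import AnyResAdaIn` line. Setting the `PYTHONPATH` environment variable to the clone directory should have the same effect.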
For evaluation with the AFRAgent script:
python instructblip_main.py \
--evaluate_dir neeraj321/AFRAgent_pure_multimodel \
--train_any_res_adain True \
--use_high_res True \
--data_root dataset/aitw/general/general \
--input_len 512 --output_len 256 --eval_bs 64
License: MIT
@article{anand2025afragent,
title={AFRAgent: An Adaptive Feature Renormalization Based High Resolution Aware GUI agent},
author={Anand, Neeraj and others},
journal={WACV},
year={2026}
}