AFRAgent — Epoch 7 checkpoint

This model is a checkpoint from AFRAgent (WACV 2026 paper), an Adaptive Feature Renormalization-based GUI agent for smartphone automation, built on InstructBLIP.

  • Architecture: AnyResAdaIn (any-resolution adaptive feature renormalization)
  • Base model: Salesforce/instructblip-flan-t5-xl
  • Training: Fine-tuned on Android-in-the-Wild (AITW). The run, all_data_any_res_adain_finetuning (bs128, ip512, op256, ep12), was configured for 12 epochs; this checkpoint was saved at step 56266, which is approximately epoch 7.
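
The relationship between the step count and the epoch number can be checked with a little arithmetic. The sketch below uses only the figures quoted in the training line. It assumes that "bs128" is the global batch size, that "ep12" means a 12-epoch schedule, and that step 56266 falls at the end of epoch 7, so the derived values are rough estimates rather than reported numbers.

```python
# Figures quoted on the model card; how they are interpreted is an assumption.
checkpoint_step = 56266   # training step at which this checkpoint was saved
checkpoint_epoch = 7      # epoch the card associates with that step
total_epochs = 12         # "ep12" in the run name, read as a 12-epoch schedule
batch_size = 128          # "bs128", assumed to be the global batch size

# Estimated steps per epoch, assuming the checkpoint sits at an epoch boundary.
steps_per_epoch = checkpoint_step / checkpoint_epoch

# Estimated training samples seen per epoch under the batch-size assumption.
samples_per_epoch = steps_per_epoch * batch_size

# Estimated length of the full run if all 12 epochs were completed.
full_run_steps = steps_per_epoch * total_epochs

print(f"steps per epoch   ~ {steps_per_epoch:.0f}")
print(f"samples per epoch ~ {samples_per_epoch:.0f}")
print(f"full run steps    ~ {full_run_steps:.0f}")
```

If the run used gradient accumulation or a per-device batch size, the samples-per-epoch estimate would need to be scaled accordingly.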

How to load

Requires the AFRAgent codebase for the custom AnyResAdaIn class.

# Clone AFRAgent and add to path, then:
from models.any_res_adain_queries_fusion import AnyResAdaIn
from transformers import InstructBlipProcessor, AutoTokenizer

model = AnyResAdaIn.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
processor = InstructBlipProcessor.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
tokenizer = AutoTokenizer.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
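
The "add to path" step in the comment above can be done in several ways. Here is a minimal sketch of one option using only the standard library. The helper name add_repo_to_path and the example directory are hypothetical and are not part of the AFRAgent codebase.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Put a local clone of a repository at the front of sys.path.

    Hypothetical helper: `repo_dir` is wherever the AFRAgent repository
    was cloned, so that `models.any_res_adain_queries_fusion` becomes importable.
    """
    repo = Path(repo_dir).expanduser().resolve()
    if not repo.is_dir():
        raise FileNotFoundError(f"No such directory: {repo}")
    # Avoid adding the same entry twice when the helper is called repeatedly.
    if str(repo) not in sys.path:
        sys.path.insert(0, str(repo))
    return repo
```

Calling add_repo_to_path("path/to/AFRAgent") (with the real clone location substituted) before the import lines above should make the custom class importable. Setting the PYTHONPATH environment variable is an equivalent alternative.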

For evaluation with the AFRAgent script:

python instructblip_main.py \
  --evaluate_dir neeraj321/AFRAgent_pure_multimodel \
  --train_any_res_adain True \
  --use_high_res True \
  --data_root dataset/aitw/general/general \
  --input_len 512 --output_len 256 --eval_bs 64

License

MIT

Citation

@inproceedings{anand2025afragent,
  title={AFRAgent: An Adaptive Feature Renormalization Based High Resolution Aware GUI agent},
  author={Anand, Neeraj and others},
  booktitle={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2026}
}

Model size: 4B params (Safetensors, BF16)