litvit5 committed on
Commit
00e5ca7
·
1 Parent(s): 63b8b5e
Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -6,9 +6,9 @@ base_model:
  - google/siglip2-base-patch16-512
  ---
 
- # LiteVit5 - Image-to-HTML Model
+ # No Code, No Cloud: On-Device Mockup-to-Code with Lightweight Vision-Language AI
 
- A lightweight transformer model combining SigLIP vision encoder with T5 seq2seq decoder for image-to-text generation tasks.
+ Bridging the gap between visual design and functional code remains a persistent challenge in modern UI workflows, especially for small teams and non-programmers. Existing solutions, such as Figma-to-code tools and recent vision-language models (VLMs), often depend on proprietary cloud APIs or large-scale architectures, which limits offline operation, privacy, and control. We present LiteViT5, a lightweight, on-device vision-language model that generates HTML directly from images of design mockups, enabling private, no-code prototyping without cloud infrastructure. Built on a compact ViT–T5 encoder–decoder framework with 235M parameters, LiteViT5 achieves competitive results on both in-distribution (WebSight) and out-of-distribution (Design2Code) benchmarks. We evaluate it on structure, position, color, and CLIP-based similarity metrics and find its performance comparable to that of models 10–30× larger, such as PaliGemma-3B, LLaVA-7B, and DeepSeek-VL-7B. We further assess LiteViT5 in a user study in which 24 participants rated perceived accuracy, code quality, and editability. Our findings show that LiteViT5 supports rapid design iteration and reduces reliance on developer handoff, making it a practical, assistive tool for democratizing web interface creation. This work highlights the potential of efficient, human-centered generative AI to empower interface design beyond expert-only workflows. To support transparency and reproducibility, we release LiteViT5 as an open-source model on Hugging Face: https://huggingface.co/LiteVit5/model.
 
  ## Model Architecture
 