Instructions to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("QuantFactory/UIGEN-T3-8B-Preview-GGUF", dtype="auto")

llama-cpp-python

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="QuantFactory/UIGEN-T3-8B-Preview-GGUF",
	filename="UIGEN-T3-8B-Preview.Q2_K.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Use Docker

docker model run hf.co/QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

LM Studio
Jan
Ollama
How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Ollama:
```
ollama run hf.co/QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
```

Unsloth Studio

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for QuantFactory/UIGEN-T3-8B-Preview-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for QuantFactory/UIGEN-T3-8B-Preview-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for QuantFactory/UIGEN-T3-8B-Preview-GGUF to start chatting

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Docker Model Runner:
```
docker model run hf.co/QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M
```

Lemonade

How to use QuantFactory/UIGEN-T3-8B-Preview-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull QuantFactory/UIGEN-T3-8B-Preview-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.UIGEN-T3-8B-Preview-GGUF-Q4_K_M

List all available models

lemonade list

QuantFactory/UIGEN-T3-8B-Preview-GGUF

This is quantized version of Tesslate/UIGEN-T3-8B-Preview created using llama.cpp

Original Model Card

UIGEN-T3 — Advanced UI Generation with Hybrid Reasoning

Tesslate’s next-gen UI model, built for thoughtful design.

Demos

Explore New UI generations: 📂 https://uigenoutput.tesslate.com

Join our Discord: https://discord.gg/GNbWAeJ4 Our Website: https://tesslate.com

Quick Information

UI generation model built on Qwen3 architecture
Supports both components and full web pages
Hybrid reasoning system: Use /think or /no_think modes
Powered by UIGenEval, a first-of-its-kind benchmark for UI generation
Released for research, non-commercial use. If you want to use it commercially, please contact us for a pilot program.

Model Details

Base Model: Qwen/Qwen3-8B
Reasoning Style: Hybrid (/think and /no_think)
Tokenizer: Qwen default, with design token headers
Output: Components + Full pages (with <html>, <head>)
Images: User-supplied or placehold.co – no images in the dataset due to licensing concerns.
License: Research only (non-commercial). Contact us for enterprise use cases.

Reasoning System

UIGEN-T3 was trained using a pre/post reasoning model architecture.

You can explicitly control the reasoning mode:

/think → Enables guided reasoning with layout analysis and heuristics.
/no_think → Faster, raw code generation.

Outputs also include design tokens at the top of each generation for easier site-wide customization.

Inference Parameters

Please use 20k context length to get the best results if using reasoning.

Parameter	Value
Temperature	0.6
Top P	0.95
Top K	20
Max Tokens	40k+

Evaluation: UIGenEval Framework

UIGenEval is our internal evaluation suite, designed to bridge the gap between creative output and quality assurance. (Learn more in our upcoming paper: "UIGenEval: Bridging the Evaluation Gap in AI-Driven UI Generation" - August, 2025)

UIGenEval evaluates models across four pillars:

Technical Quality — Clean HTML, CSS structure, semantic accuracy.
Prompt Adherence — Feature completeness and fidelity to instructions.
Interaction Behavior — Dynamic logic hooks and functional interactivity.
Responsive Design — Multi-viewport performance via Lighthouse, Axe-core, and custom scripts.

This comprehensive framework directly informs our GRPO reward functions for the next release.

Example Prompts to Try

make a google drive clone
build a figma-style canvas with toolbar
create a modern pricing page with three plans
generate a mobile-first recipe sharing app layout

Use Cases

Use Case	Description
Startup MVPs	Quickly scaffold UIs from scratch with clean code.
Design-to-Code Transfer	Figma (coming soon) → Code generation.
Component Libraries	Build buttons, cards, navbars, and export at scale.
Internal Tool Builders	Create admin panels, dashboards, and layout templates.
Rapid Client Prototypes	Save time on mockups with production-ready HTML+Tailwind outputs.

Limitations

No Bootstrap support (planned).
Not suited for production use — research-only license.
Responsive tuning varies across output complexity.

Roadmap

Milestone	Status
Launch Tesslate Designer	2 days
Figma convert
Bootstrap & JS logic
GRPO fine-tuning
4B draft model release	Now

Technical Requirements

GPU: ≥16GB VRAM for 8B inference on GGUF.
Libraries: transformers, torch, peft.
Compatible with Hugging Face inference APIs and local generation pipelines.

Community & Contribution

Join our Discord: https://discord.gg/GNbWAeJ4
Chat about AI, design, or model training.
Want to contribute UIs or feedback? Let’s talk!

Citation

@misc{tesslate_UIGEN-T3,
  title={UIGEN-T3: Hybrid Reasoning for Robust UI Generation on Qwen3},
  author={Tesslate Team},
  year={2025},
  publisher={Tesslate},
  note={Non-commercial Research License},
  url={https://huggingface.co/tesslate/UIGEN-T3}
}

Downloads last month: 129

GGUF

Model size

8B params

Architecture

qwen3

Hardware compatibility

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for QuantFactory/UIGEN-T3-8B-Preview-GGUF

Base model

Qwen/Qwen3-8B-Base

Finetuned

Qwen/Qwen3-8B

Quantized

(291)

this model