prithivMLmods committed on Sep 24, 2025

Commit e5eab22 (verified) · 1 parent: 1a211c8

Update README.md

Files changed (1)

README.md +25 -26

@@ -18,35 +18,34 @@ pipeline_tag: text-generation
 ![2](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/P-LIpWdt5ypMGIfE5kJiK.png)
 # **Muscae-Qwen3-UI-Code-4B**
-> **Muscae-Qwen3-UI-Code-4B** is a reasoning-enhanced model fine-tuned on **Qwen** using the **GPT-OSS Web UI Coding dataset traces**, specializing in **web interface coding**, **structured generation**, and **polished token probabilities**.
-> It excels at generating **production-grade UI components**, **frontend layouts**, and **logic-driven interface code** with high precision and consistency.
 > \[!note]
 > GGUF: [https://huggingface.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B-GGUF](https://huggingface.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B-GGUF)
 ## **Key Features**
-1. **UI-Focused Reasoning Engine**
-   Fine-tuned for precise **frontend development workflows**, generating optimized HTML, CSS, React, and Tailwind-based code with minimal refactoring needs.
-2. **Web Interface Generation Mastery**
-   Excels in building responsive layouts, interactive components, and dashboard UIs directly from natural language prompts or wireframe descriptions.
-3. **Polished Token Probabilities**
-   Trained for smoother generation curves and deterministic structure in code, minimizing syntax errors and enhancing readability.
-4. **Hybrid Logic-Coding Synthesis**
-   Combines structural reasoning with frontend logic understanding to generate UI code that’s both **functional** and **aesthetically consistent**.
-5. **Structured Output Formats**
-   Outputs code and structured data in **HTML**, **React (JSX/TSX)**, **Tailwind**, **JSON**, and **YAML**, supporting full-stack workflows and CI/CD pipelines.
 6. **Optimized Lightweight Footprint**
-   Compact **4B parameter size**, deployable on **mid-range GPUs**, **developer workstations**, and **edge build servers** while maintaining high-quality UI generation.
 ## **Quickstart with Transformers**
@@ -62,10 +61,10 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
-prompt = "Generate a responsive React dashboard with a sidebar and top navigation bar using Tailwind CSS."
 messages = [
-    {"role": "system", "content": "You are a frontend coding assistant skilled in web UI generation and responsive design."},
     {"role": "user", "content": prompt}
 ]
@@ -91,15 +90,15 @@ print(response)
 ## **Intended Use**
-* Web UI component generation and layout scaffolding
-* Responsive dashboard, landing page, and frontend application coding
-* Educational and research tasks related to frontend development
-* Lightweight deployment in developer environments and CI/CD pipelines
-* Structured code generation and UI prototyping from natural language prompts
 ## **Limitations**
-* Focused on **UI and frontend code generation** — not suited for deep backend logic or non-UI tasks
-* Might require minor manual adjustments for large-scale production apps
-* Prioritizes structured and readable code over creative design experimentation
-* Performance may vary with **extremely long code contexts** or multi-file full-stack generation tasks

 ![2](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/P-LIpWdt5ypMGIfE5kJiK.png)
 # **Muscae-Qwen3-UI-Code-4B**
+> **Muscae-Qwen3-UI-Code-4B** is a web-UI-focused model fine-tuned on UIGEN-T3-4B-Preview (built upon **Qwen3-4B**) for **controlled Abliterated Reasoning** and **polished token probabilities**, designed **exclusively for experimental use**.
+> It excels at **modern web UI coding tasks**, **structured component generation**, and **layout-aware reasoning**, making it ideal for frontend developers, UI engineers, and research prototypes exploring structured code generation.
 > \[!note]
 > GGUF: [https://huggingface.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B-GGUF](https://huggingface.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B-GGUF)
 ## **Key Features**
+1. **UI-Oriented Abliterated Reasoning**
+   Controlled reasoning precision tailored for frontend development and code generation, with polished token distributions ensuring structured, maintainable output.
+2. **Web UI Component Generation**
+   Excels at generating **responsive components**, **semantic HTML**, and **Tailwind-based layouts** with reasoning-aware structure and minimal boilerplate.
+3. **Layout-Aware Structured Logic**
+   Understands **UI state flows**, **component hierarchies**, and **responsive design patterns**, producing logically consistent, production-ready UI code.
+4. **Hybrid Reasoning for Code**
+   Combines symbolic reasoning with probabilistic inference to deliver optimized component logic, conditional rendering, and event-driven UI behavior.
+5. **Structured Output Mastery**
+   Natively outputs in **HTML**, **React**, **Markdown**, **JSON**, and **YAML**, making it ideal for UI prototyping, design systems, and documentation generation.
 6. **Optimized Lightweight Footprint**
+   With a **4B parameter size**, it’s deployable on **mid-range GPUs**, **offline workstations**, or **edge devices** while retaining strong UI coding capabilities.
 ## **Quickstart with Transformers**
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
+prompt = "Generate a responsive landing page hero section with Tailwind and semantic HTML."
 messages = [
+    {"role": "system", "content": "You are a frontend coding assistant skilled in UI generation, semantic HTML, and component structuring."},
     {"role": "user", "content": prompt}
 ]
 ## **Intended Use**
+* Web UI coding and component generation
+* Responsive layout and frontend architecture prototyping
+* Semantic HTML, Tailwind, and React code generation
+* Research and experimental projects on structured code synthesis
+* Design-system-driven development workflows
 ## **Limitations**
+* Experimental model – not optimized for production-critical deployments
+* Focused on **UI coding** – not suitable for general reasoning or creative writing
+* May produce inconsistent results with **very long prompts** or **cross-framework tasks**
+* Prioritizes structure and correctness over stylistic creativity or verbosity
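
The updated Quickstart prompt could be wired up and its reply post-processed roughly as follows. This is a minimal sketch that does not load the model. `build_messages` and `extract_code_blocks` are hypothetical helper names that do not appear in the README. The system and user strings are copied from the added lines above. That replies wrap generated code in Markdown fences is an assumption, not something the README states.

```python
import re

# System prompt copied from the added Quickstart line in this commit.
SYSTEM_PROMPT = (
    "You are a frontend coding assistant skilled in UI generation, "
    "semantic HTML, and component structuring."
)

# Built at runtime so this example does not contain a literal fence.
FENCE = "`" * 3

# Assumption: the reply wraps code in Markdown fences with an optional
# language tag, e.g. html or jsx.
FENCE_RE = re.compile(FENCE + r"([\w+-]*)\n(.*?)" + FENCE, re.DOTALL)


def build_messages(prompt: str) -> list[dict[str, str]]:
    """Return chat messages in the system/user shape used by the Quickstart."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]


def extract_code_blocks(response: str) -> list[tuple[str, str]]:
    """Hypothetical helper: return (language, code) pairs for each fenced block."""
    return [
        (lang or "text", body.strip())
        for lang, body in FENCE_RE.findall(response)
    ]


if __name__ == "__main__":
    messages = build_messages(
        "Generate a responsive landing page hero section with Tailwind and semantic HTML."
    )
    print([m["role"] for m in messages])
```

The list returned by `build_messages` has the same shape as the `messages` variable in the Quickstart hunk, so it can be passed to the tokenizer's chat template in its place. `extract_code_blocks` would then be applied to the decoded `response` string.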