Instructions for using Open4bits/nexora-vector-v0.1-mlx-4Bit with libraries, inference servers, and local apps.

Libraries

Transformers

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Open4bits/nexora-vector-v0.1-mlx-4Bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Open4bits/nexora-vector-v0.1-mlx-4Bit")
model = AutoModelForCausalLM.from_pretrained("Open4bits/nexora-vector-v0.1-mlx-4Bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

MLX

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Open4bits/nexora-vector-v0.1-mlx-4Bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

vLLM

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Open4bits/nexora-vector-v0.1-mlx-4Bit"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Open4bits/nexora-vector-v0.1-mlx-4Bit

SGLang

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Open4bits/nexora-vector-v0.1-mlx-4Bit" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Open4bits/nexora-vector-v0.1-mlx-4Bit" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Pi

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Open4bits/nexora-vector-v0.1-mlx-4Bit"
        }
      ]
    }
  }
}
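
When `models.json` already lists other providers, the entry above has to be merged in rather than pasted over the file. A minimal Python sketch of that merge follows; the function name `add_mlx_provider` is hypothetical, and the provider fields are copied from the JSON above.

```python
import json
from pathlib import Path

MODEL_ID = "Open4bits/nexora-vector-v0.1-mlx-4Bit"


def add_mlx_provider(config_path: Path,
                     base_url: str = "http://localhost:8080/v1") -> dict:
    """Merge the mlx-lm provider entry into a Pi models.json file.

    Existing providers are kept; only the "mlx-lm" key is (re)written.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    providers = config.setdefault("providers", {})
    providers["mlx-lm"] = {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": MODEL_ID}],
    }

    # Create ~/.pi/agent/ (or whichever parent directory) if it is missing.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_mlx_provider(Path.home() / ".pi" / "agent" / "models.json")` would update the file at the path named above.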

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Open4bits/nexora-vector-v0.1-mlx-4Bit

Run Hermes

hermes

OpenClaw

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with OpenClaw:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Configure OpenClaw

# Install OpenClaw:
npm install -g openclaw@latest
# Register the local server and set it as the default model:
openclaw onboard --non-interactive --mode local \
  --auth-choice custom-api-key \
  --custom-base-url http://127.0.0.1:8080/v1 \
  --custom-model-id "Open4bits/nexora-vector-v0.1-mlx-4Bit" \
  --custom-provider-id mlx-lm \
  --custom-compatibility openai \
  --custom-text-input \
  --accept-risk \
  --skip-health

Run OpenClaw

openclaw agent --local --agent main --message "Hello from Hugging Face"

MLX LM

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Open4bits/nexora-vector-v0.1-mlx-4Bit"
# Call the OpenAI-compatible server with curl (mlx_lm.server listens on port 8080 by default):
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Open4bits/nexora-vector-v0.1-mlx-4Bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
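
The same request can be sent from Python using only the standard library. This is a sketch rather than an official client. It assumes the server listens on port 8080 (the base URL used in the Pi and Hermes configurations above) and that the response follows the OpenAI chat-completions shape (`choices[0].message.content`). The names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "Open4bits/nexora-vector-v0.1-mlx-4Bit"


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:8080/v1") -> urllib.request.Request:
    """Build a POST request for the /chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str,
         base_url: str = "http://localhost:8080/v1",
         timeout: float = 120.0) -> str:
    """Send the prompt to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Draw a red circle as SVG."))` would print the model's reply. Pass a different `base_url` if the server listens on another port.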

Docker Model Runner

How to use Open4bits/nexora-vector-v0.1-mlx-4Bit with Docker Model Runner:
```
docker model run hf.co/Open4bits/nexora-vector-v0.1-mlx-4Bit
```

nexora-vector-v0.1-mlx-4Bit / README.md

fmasterpro27

Update README.md

fc6884b verified 3 months ago

	---
	license: apache-2.0
	base_model: ArkAiLab-Adl/nexora-vector-v0.1
	tags:
	- nexora
	- chat
	- qwen3
	- conversational
	- mlx
	- mlx-my-repo
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	---
	<p align="center">
	<img src="https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1/resolve/main/assets/nexora-vector.png" alt="Nexora-Vector"/>
	</p>

	# Nexora-Vector-v0.1 · MLX 4-Bit

	<p align="center">
	<img src="https://img.shields.io/badge/status-beta-orange" alt="Status: Beta"/>
	<img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License: Apache 2.0"/>
	<img src="https://img.shields.io/badge/base_model-Qwen3--4B-blueviolet" alt="Base Model"/>
	<img src="https://img.shields.io/badge/output-SVG-green" alt="Output: SVG"/>
	<img src="https://img.shields.io/badge/format-MLX-lightgrey" alt="Format: MLX"/>
	<img src="https://img.shields.io/badge/quantization-4--Bit-yellow" alt="Quantization: 4-Bit"/>
	</p>

	> Nexora-Vector-v0.1 MLX 4-Bit is the official Apple MLX 4-bit quantized release of [Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1), published by [Open4bits](https://huggingface.co/Open4bits) — an official quantization project under ArkAiLabs. Nexora-Vector is an experimental text-to-vector model that generates structured SVG graphics from natural language prompts. This variant is optimized for efficient local inference on Apple Silicon hardware via the MLX framework.

	---

	## Table of Contents

	- [Overview](#overview)
	- [Model Details](#model-details)
	- [Capabilities](#capabilities)
	- [Limitations](#limitations)
	- [Intended Use](#intended-use)
	- [Architecture & Quantization](#architecture--quantization)
	- [Usage Recommendations](#usage-recommendations)
	- [Original Model](#original-model)
	- [Evaluation](#evaluation)
	- [Risks & Considerations](#risks--considerations)
	- [Future Work](#future-work)
	- [Community & Support](#community--support)
	- [License](#license)
	- [Acknowledgements](#acknowledgements)
	- [About Nexora & Open4bits](#about-nexora--open4bits)

	---

	## Overview

	This is the official MLX 4-bit quantized release of Nexora-Vector-v0.1, published by [Open4bits](https://huggingface.co/Open4bits) — the official quantization project under ArkAiLabs — and converted for use with Apple's [MLX](https://github.com/ml-explore/mlx) framework. The base model is a supervised fine-tuned variant of Qwen3-4B, adapted specifically to generate structured vector graphics in SVG format from natural language instructions.

	This release is in beta and is intended for research, experimentation, and early-stage design tooling on Apple Silicon machines. All outputs should be validated before use in any downstream pipeline.

	---

	## Model Details

	| Property | Details |
	|---|---|
	| Model Type | MLX 4-Bit Quantized |
	| Base Model | [Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
	| Original Base | Qwen3-4B |
	| Fine-tuning Method | Supervised Fine-Tuning (SFT) |
	| Quantization | 4-Bit (MLX) |
	| Target Hardware | Apple Silicon (M1/M2/M3/M4 series) |
	| Framework | [MLX](https://github.com/ml-explore/mlx) |
	| Output Format | SVG |
	| License | Apache 2.0 |

	---

	## Capabilities

	Nexora-Vector-v0.1 is designed to translate textual instructions into structured SVG code. This MLX version targets the same tasks as the original model, with the minor quality trade-off that 4-bit quantization brings, while enabling fast, memory-efficient inference on Apple Silicon. The model is best suited for:

	- Generating SVG markup for simple vector graphics
	- Producing geometric shapes and basic illustrations
	- Creating lightweight icons and minimal design assets
	- Supporting rapid prototyping in vector-based design workflows on macOS

	> Tip: The model performs best with concise, clearly scoped prompts focused on simple visual compositions.

	---

	## Limitations

	This is an early-stage beta release. Users should be aware of the following constraints:

	- High hallucination rate — outputs may be invalid or non-renderable SVG
	- Limited generalization — the small training dataset (~1,500 samples) affects output consistency
	- Weak complex scene handling — highly detailed or multi-element prompts may produce poor results
	- Manual correction required — outputs should be validated and post-processed before use
	- Not production-ready — not suitable for safety-critical or automated pipelines
	- 4-bit quality trade-off — minor quality degradation is expected compared to the full-precision original model

	---

	## Intended Use

	### ✅ Supported Use Cases

	- Academic and applied research in text-to-vector generation on Apple Silicon
	- Experimental AI-assisted design systems running locally on macOS
	- Educational exploration of structured output generation
	- Lightweight SVG prototyping and ideation with low memory overhead

	### ❌ Out-of-Scope Use Cases

	- Production-grade or commercial vector asset pipelines
	- High-precision design deliverables without human validation
	- Automated systems where SVG correctness is required without manual review
	- Non-Apple-Silicon hardware (use the [GGUF version](https://huggingface.co/Open4bits/nexora-vector-v0.1-GGUF) instead)

	---

	## Architecture & Quantization

	This model is a 4-bit MLX quantization of the original Nexora-Vector-v0.1 weights, which are themselves a supervised fine-tune of Qwen3-4B.

	### Quantization Details

	| Parameter | Details |
	|---|---|
	| Quantization Method | MLX 4-Bit |
	| Source Model | [ArkAiLab-Adl/nexora-vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
	| Framework | Apple MLX |
	| Memory Reduction | ~75% vs. full-precision (fp16) |
	| Target Platform | macOS with Apple Silicon |
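
The ~75% figure follows from storing 4 bits per weight instead of 16. The sketch below works through that arithmetic. Two inputs are assumptions and are not stated in this card: a nominal parameter count of 4 billion, and group-wise quantization with a group size of 64 that stores one fp16 scale and one fp16 bias per group.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 4e9                      # assumption: nominal size of a "4B" model
GROUP_SIZE = 64                       # assumption: weights per quantization group
OVERHEAD_BITS = 2 * 16 / GROUP_SIZE   # assumption: fp16 scale + fp16 bias per group

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
ideal_gb = weight_memory_gb(NUM_PARAMS, 4)
grouped_gb = weight_memory_gb(NUM_PARAMS, 4 + OVERHEAD_BITS)

ideal_reduction = 1 - ideal_gb / fp16_gb
grouped_reduction = 1 - grouped_gb / fp16_gb

print(f"fp16 weights:        {fp16_gb:.2f} GB")
print(f"ideal 4-bit weights: {ideal_gb:.2f} GB ({ideal_reduction:.1%} smaller)")
print(f"4-bit + group data:  {grouped_gb:.2f} GB ({grouped_reduction:.1%} smaller)")
```

Under these assumptions the estimate is 8 GB at fp16, 2 GB for an ideal 4-bit layout (75% smaller), and 2.25 GB once per-group metadata is counted (about 72% smaller). The files on disk can differ further if some tensors are left unquantized.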

	### Original Training Configuration

	| Parameter | Details |
	|---|---|
	| Fine-tuning Method | Supervised Fine-Tuning (SFT) |
	| Dataset Composition | Curated prompt–SVG pairs |
	| Dataset Size | ~1,500 samples |
	| Training Objective | Structured output generation for SVG formats |

	> Note: The relatively small dataset size may result in instability and limited generalization across diverse prompts. Improved dataset coverage is planned for future versions.

	---

	## Usage Recommendations

	To get the best results from this model:

	1. Keep prompts simple and specific — avoid multi-scene or highly complex compositions
	2. Validate all SVG outputs before rendering or integrating into any pipeline
	3. Post-process outputs to correct syntax or structural issues
	4. Use iterative prompting — refining prompts across multiple turns often yields better results
	5. Expect imperfections — this is a beta model; treat outputs as drafts, not finals
	6. Run on Apple Silicon — this MLX build is optimized for M1/M2/M3/M4 series chips
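
Steps 2 and 3 can be sketched in Python with the standard library. This is one possible approach, not the project's own tooling. It assumes the model may wrap the SVG in extra text and may omit the `xmlns` attribute. The function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

SVG_NS = "http://www.w3.org/2000/svg"
# First <svg ...> ... </svg> span; surrounding chatter is ignored.
SVG_BLOCK = re.compile(r"<svg\b.*?</svg>", re.DOTALL)


def extract_svg(raw_output: str) -> Optional[str]:
    """Return the first <svg>...</svg> fragment in the model output, if any."""
    match = SVG_BLOCK.search(raw_output)
    return match.group(0) if match else None


def ensure_namespace(svg: str) -> str:
    """Add the default SVG namespace when the root tag does not declare it."""
    root_tag_end = svg.index(">")
    if "xmlns=" in svg[:root_tag_end]:
        return svg
    # svg[:4] is the literal "<svg" matched by SVG_BLOCK.
    return svg[:4] + f' xmlns="{SVG_NS}"' + svg[4:]


def clean_output(raw_output: str) -> Optional[str]:
    """Extract, patch, and parse-check a draft; None means the draft is unusable."""
    svg = extract_svg(raw_output)
    if svg is None:
        return None
    svg = ensure_namespace(svg)
    try:
        ET.fromstring(svg)
    except ET.ParseError:
        return None
    return svg
```

A `None` result is a natural trigger for step 4: rephrase or simplify the prompt and generate again.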



	---

	## Original Model

	| Version | Link |
	|---|---|
	| Original (full precision) | [ArkAiLab-Adl/nexora-vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
	| GGUF Quantized | [Open4bits/nexora-vector-v0.1-GGUF](https://huggingface.co/Open4bits/nexora-vector-v0.1-GGUF) |
	| MLX 4-Bit (this model) | [Open4bits/nexora-vector-v0.1-mlx-4Bit](https://huggingface.co/Open4bits/nexora-vector-v0.1-mlx-4Bit) |

	---

	## Evaluation

	Nexora-Vector-v0.1 has not yet undergone formal benchmark evaluation. Current assessment is qualitative, based on manual testing of SVG generation tasks.

	Planned evaluation metrics for future releases include:

	| Metric | Description |
	|---|---|
	| SVG Validity Rate | Percentage of outputs that are parseable, valid SVG |
	| Structural Correctness | Adherence to SVG schema and element hierarchy |
	| Prompt Adherence | Alignment between user intent and generated output |
	| Visual Consistency | Stability of outputs across similar prompts |
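
As an illustration of the first metric, the sketch below computes a validity rate over a batch of outputs. It is not the planned evaluation suite. "Valid" is narrowed here, as an assumption, to "well-formed XML whose root element is `svg`", and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Root tag with or without the SVG namespace declared.
SVG_ROOT_TAGS = {"svg", "{http://www.w3.org/2000/svg}svg"}


def is_parseable_svg(text: str) -> bool:
    """True when the text is well-formed XML with an <svg> root."""
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError:
        return False
    return root.tag in SVG_ROOT_TAGS


def svg_validity_rate(outputs: Iterable[str]) -> float:
    """Percentage of outputs that pass is_parseable_svg."""
    results = [is_parseable_svg(output) for output in outputs]
    if not results:
        raise ValueError("need at least one output to compute a rate")
    return 100.0 * sum(results) / len(results)
```

A stricter definition (schema validation, or a successful render) would lower the number, so the definition should be reported together with the rate.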

	---

	## Risks & Considerations

	Developers integrating this model should account for the following risks:

	- Generation of malformed or non-functional SVG code
	- Inconsistent instruction following across prompt variations
	- Unpredictable outputs due to limited training data coverage
	- Minor quality reduction inherent to 4-bit quantization

	Recommendation: Implement downstream validation layers and SVG syntax checking before any rendering or integration.
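
One way such a validation layer might look is sketched below. It is an illustrative sketch only. The allowlist of elements and the rule against `on*` event-handler attributes are assumptions chosen for simple icons and shapes, not requirements stated by this model card, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: a small allowlist that covers simple shapes, text, and gradients.
ALLOWED_TAGS = {
    "svg", "g", "defs", "title", "desc",
    "path", "rect", "circle", "ellipse", "line", "polyline", "polygon",
    "text", "tspan", "linearGradient", "radialGradient", "stop",
}


def local_name(qualified: str) -> str:
    """Strip a '{namespace}' prefix from an element or attribute name."""
    return qualified.rsplit("}", 1)[-1]


def validate_svg(text: str) -> List[str]:
    """Return a list of problems; an empty list means the draft passed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "svg":
        problems.append(f"root element is <{local_name(root.tag)}>, expected <svg>")

    for element in root.iter():
        name = local_name(element.tag)
        if name not in ALLOWED_TAGS:
            problems.append(f"unexpected element <{name}>")
        for attribute in element.attrib:
            if local_name(attribute).lower().startswith("on"):
                problems.append(f"event-handler attribute {attribute!r} on <{name}>")
    return problems
```

Returning a list of problems rather than a boolean lets the caller log why a draft was rejected, or feed the message back into a follow-up prompt.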

	---

	## Future Work

	The following improvements are planned for upcoming versions of the Nexora Vector series:

	- [ ] Expanded and more diverse training dataset
	- [ ] Improved SVG syntax correctness and validity rates
	- [ ] Reduced hallucination rates
	- [ ] Enhanced natural language understanding for complex prompts
	- [ ] Support for richer vector compositions and multi-element scenes
	- [ ] Formal benchmark evaluation suite
	- [ ] Updated MLX quantized releases aligned with future model versions

	---

	## Community & Support

	Join the community for updates and discussion:

	💬 [Join our Discord Server](https://discord.gg/mwdrgYbzuG)

	---

	## License

	This model is released under the Apache License 2.0.

	You may use, modify, and distribute this model in accordance with the terms of the Apache 2.0 license. See the [LICENSE](./LICENSE) file for full details, or refer to the [official Apache 2.0 license text](https://www.apache.org/licenses/LICENSE-2.0).

	---

	## Acknowledgements

	This is an official ArkAiLabs release, published under the [Open4bits](https://huggingface.co/Open4bits) project — ArkAiLabs' dedicated initiative for quantized model releases. The MLX 4-bit weights are derived from [Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1), which is itself built upon [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) by the Qwen team. We thank the MLX team at Apple and the open-source AI community for their continued contributions that make projects like this possible.

	---

	## About Nexora & Open4bits

	Nexora is an experimental AI initiative under ArkAiLabs, focused on building lightweight, practical, and creative AI systems for real-world applications. The Nexora Vector series represents our exploration into AI-assisted vector graphics generation.

	Open4bits is ArkAiLabs' official project for quantized model releases, providing optimized variants of Nexora models for efficient local inference across different hardware platforms.