---
license: apache-2.0
base_model: ArkAiLab-Adl/nexora-vector-v0.1
tags:
- nexora
- chat
- qwen3
- conversational
- mlx
- mlx-my-repo
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

<p align="center">
<img src="https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1/resolve/main/assets/nexora-vector.png" alt="Nexora-Vector"/>
</p>

# Nexora-Vector-v0.1 · MLX 4-Bit

<p align="center">
<img src="https://img.shields.io/badge/status-beta-orange" alt="Status: Beta"/>
<img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License: Apache 2.0"/>
<img src="https://img.shields.io/badge/base_model-Qwen3--4B-blueviolet" alt="Base Model"/>
<img src="https://img.shields.io/badge/output-SVG-green" alt="Output: SVG"/>
<img src="https://img.shields.io/badge/format-MLX-lightgrey" alt="Format: MLX"/>
<img src="https://img.shields.io/badge/quantization-4--Bit-yellow" alt="Quantization: 4-Bit"/>
</p>

> **Nexora-Vector-v0.1 MLX 4-Bit** is the official Apple MLX 4-bit quantized release of [Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1), published by **[Open4bits](https://huggingface.co/Open4bits)**, an official quantization project under **ArkAiLabs**. Nexora-Vector is an experimental text-to-vector model that generates structured SVG graphics from natural language prompts. This variant is optimized for efficient local inference on Apple Silicon hardware via the MLX framework.


---

## Table of Contents

- [Overview](#overview)
- [Model Details](#model-details)
- [Capabilities](#capabilities)
- [Limitations](#limitations)
- [Intended Use](#intended-use)
- [Architecture & Quantization](#architecture--quantization)
- [Usage Recommendations](#usage-recommendations)
- [Original Model](#original-model)
- [Evaluation](#evaluation)
- [Risks & Considerations](#risks--considerations)
- [Future Work](#future-work)
- [Community & Support](#community--support)
- [License](#license)
- [Acknowledgements](#acknowledgements)
- [About Nexora & Open4bits](#about-nexora--open4bits)


---

## Overview

This is the **official MLX 4-bit quantized** release of Nexora-Vector-v0.1, published by **[Open4bits](https://huggingface.co/Open4bits)** (the official quantization project under **ArkAiLabs**) and converted for use with Apple's [MLX](https://github.com/ml-explore/mlx) framework. The base model is a supervised fine-tuned variant of **Qwen3-4B**, adapted specifically to generate structured vector graphics in SVG format from natural language instructions.

This release is in **beta** and is intended for research, experimentation, and early-stage design tooling on Apple Silicon machines. All outputs should be validated before use in any downstream pipeline.


---

## Model Details

| Property | Details |
|---|---|
| **Model Type** | MLX 4-Bit Quantized |
| **Base Model** | [Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
| **Original Base** | Qwen3-4B |
| **Fine-tuning Method** | Supervised Fine-Tuning (SFT) |
| **Quantization** | 4-Bit (MLX) |
| **Target Hardware** | Apple Silicon (M1/M2/M3/M4 series) |
| **Framework** | [MLX](https://github.com/ml-explore/mlx) |
| **Output Format** | SVG |
| **License** | Apache 2.0 |


---

## Capabilities

Nexora-Vector-v0.1 is designed to translate textual instructions into structured SVG code. This MLX version targets the same tasks as the original model, subject to the minor 4-bit quality trade-off noted under Limitations, while enabling fast, memory-efficient inference on Apple Silicon. The model is best suited for:

- Generating SVG markup for simple vector graphics
- Producing geometric shapes and basic illustrations
- Creating lightweight icons and minimal design assets
- Supporting rapid prototyping in vector-based design workflows on macOS

> **Tip:** The model performs best with concise, clearly scoped prompts focused on simple visual compositions.


---

## Limitations

This is an early-stage beta release. Users should be aware of the following constraints:

- **High hallucination rate:** outputs may be invalid or non-renderable SVG
- **Limited generalization:** the small training dataset (~1,500 samples) affects output consistency
- **Weak complex scene handling:** highly detailed or multi-element prompts may produce poor results
- **Manual correction required:** outputs should be validated and post-processed before use
- **Not production-ready:** not suitable for safety-critical or automated pipelines
- **4-bit quality trade-off:** minor quality degradation is expected compared to the full-precision original model


---

## Intended Use

### ✅ Supported Use Cases

- Academic and applied research in text-to-vector generation on Apple Silicon
- Experimental AI-assisted design systems running locally on macOS
- Educational exploration of structured output generation
- Lightweight SVG prototyping and ideation with low memory overhead

### ❌ Out-of-Scope Use Cases

- Production-grade or commercial vector asset pipelines
- High-precision design deliverables without human validation
- Automated systems where SVG correctness is required without manual review
- Non-Apple-Silicon hardware (use the [GGUF version](https://huggingface.co/Open4bits/nexora-vector-v0.1-GGUF) instead)


---

## Architecture & Quantization

This model is a 4-bit MLX quantization of the original Nexora-Vector-v0.1 weights, which are themselves a supervised fine-tune of **Qwen3-4B**.

### Quantization Details

| Parameter | Details |
|---|---|
| **Quantization Method** | MLX 4-Bit |
| **Source Model** | [ArkAiLab-Adl/nexora-vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
| **Framework** | Apple MLX |
| **Memory Reduction** | ~75% vs. fp16 weights (nominal: 4 bits vs. 16 bits per weight) |
| **Target Platform** | macOS with Apple Silicon |

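
The ~75% memory figure follows from the bit widths alone. The sketch below works through that arithmetic. It assumes roughly 4 billion parameters (inferred from the Qwen3-4B name, not a measured count) and, purely as an illustration, a group-wise quantization overhead of one 16-bit scale and one 16-bit bias per 64 weights. The group size and overhead are assumptions; this card does not state the quantization settings actually used.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def reduction(baseline_bits: float, quant_bits: float) -> float:
    """Fraction of weight memory saved relative to the baseline bit width."""
    return 1.0 - quant_bits / baseline_bits


N_PARAMS = 4e9       # assumption: approximate parameter count
GROUP_SIZE = 64      # assumption: weights per quantization group
OVERHEAD_BITS = (16 + 16) / GROUP_SIZE  # assumed scale + bias cost per weight

nominal = reduction(16, 4)                    # bit widths only
effective = reduction(16, 4 + OVERHEAD_BITS)  # including assumed overhead

print(f"fp16 weights:      {weight_memory_gb(N_PARAMS, 16):.2f} GB")
print(f"4-bit (nominal):   {weight_memory_gb(N_PARAMS, 4):.2f} GB, saves {nominal:.1%}")
print(f"4-bit (+overhead): {weight_memory_gb(N_PARAMS, 4 + OVERHEAD_BITS):.2f} GB, saves {effective:.1%}")
```

Under these assumptions the saving lands slightly below the nominal 75%, which is consistent with the approximate figure in the table. Runtime memory (KV cache, activations) is not covered by this estimate.
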

### Original Training Configuration

| Parameter | Details |
|---|---|
| **Fine-tuning Method** | Supervised Fine-Tuning (SFT) |
| **Dataset Composition** | Curated prompt–SVG pairs |
| **Dataset Size** | ~1,500 samples |
| **Training Objective** | Structured output generation for SVG formats |

> **Note:** The relatively small dataset size may result in instability and limited generalization across diverse prompts. Improved dataset coverage is planned for future versions.


---

## Usage Recommendations

To get the best results from this model:

1. **Keep prompts simple and specific:** avoid multi-scene or highly complex compositions
2. **Validate all SVG outputs** before rendering or integrating into any pipeline
3. **Post-process outputs** to correct syntax or structural issues
4. **Use iterative prompting:** refining prompts across multiple turns often yields better results
5. **Expect imperfections:** this is a beta model; treat outputs as drafts, not finals
6. **Run on Apple Silicon:** this MLX build is optimized for M1/M2/M3/M4 series chips

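
As a rough illustration of recommendations 1 to 3, the sketch below generates a response with `mlx-lm` and keeps only a well-formed `<svg>` element from it. It assumes `mlx-lm` is installed (`pip install mlx-lm`) on an Apple Silicon machine. The helper names `extract_svg` and `generate_svg`, the regular expression, and the `max_tokens` value are illustrative choices, not part of the model or of `mlx-lm`.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

MODEL_ID = "Open4bits/nexora-vector-v0.1-mlx-4Bit"

# Matches the first <svg ...>...</svg> span in free-form model output.
SVG_PATTERN = re.compile(r"<svg\b.*?</svg>", re.DOTALL)


def extract_svg(text: str) -> Optional[str]:
    """Return the first well-formed <svg> element in text, or None."""
    match = SVG_PATTERN.search(text)
    if match is None:
        return None
    candidate = match.group(0)
    try:
        ET.fromstring(candidate)  # well-formedness check only
    except ET.ParseError:
        return None
    return candidate


def generate_svg(prompt: str, max_tokens: int = 1024) -> Optional[str]:
    """Hypothetical helper: generate with mlx-lm, then extract and check the SVG."""
    # Imported lazily so the validation helper works without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load(MODEL_ID)
    messages = [{"role": "user", "content": prompt}]
    chat_prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )
    text = generate(model, tokenizer, prompt=chat_prompt, max_tokens=max_tokens)
    return extract_svg(text)


if __name__ == "__main__":
    svg = generate_svg("A red circle centered on a white square background")
    print(svg if svg is not None else "No well-formed SVG found; refine the prompt and retry.")
```

A `None` result is a natural trigger for the iterative prompting suggested in recommendation 4. Passing this check only means the markup parses as XML; it does not guarantee that the drawing matches the prompt.
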

---

## Original Model

| Version | Link |
|---|---|
| **Original (full precision)** | [ArkAiLab-Adl/nexora-vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1) |
| **GGUF Quantized** | [Open4bits/nexora-vector-v0.1-GGUF](https://huggingface.co/Open4bits/nexora-vector-v0.1-GGUF) |
| **MLX 4-Bit (this model)** | [Open4bits/nexora-vector-v0.1-mlx-4Bit](https://huggingface.co/Open4bits/nexora-vector-v0.1-mlx-4Bit) |


---

## Evaluation

Nexora-Vector-v0.1 has not yet undergone formal benchmark evaluation. Current assessment is qualitative, based on manual testing of SVG generation tasks.

Planned evaluation metrics for future releases include:

| Metric | Description |
|---|---|
| **SVG Validity Rate** | Percentage of outputs that are parseable, valid SVG |
| **Structural Correctness** | Adherence to SVG schema and element hierarchy |
| **Prompt Adherence** | Alignment between user intent and generated output |
| **Visual Consistency** | Stability of outputs across similar prompts |

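
Until a formal evaluation suite exists, the first metric can be approximated locally. The sketch below is one possible reading of "SVG Validity Rate", not the project's official definition: an output counts as valid if it parses as XML and its root element is `<svg>`. The function names are hypothetical, and schema-level checks (the "Structural Correctness" metric) are deliberately out of scope.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SVG_NS = "{http://www.w3.org/2000/svg}"


def is_valid_svg(markup: str) -> bool:
    """True if markup is well-formed XML whose root element is <svg>."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False
    # Accept both namespaced and un-namespaced roots.
    return root.tag in ("svg", SVG_NS + "svg")


def svg_validity_rate(outputs: Iterable[str]) -> float:
    """Percentage of outputs that pass is_valid_svg (0.0 for an empty input)."""
    results = [is_valid_svg(output) for output in outputs]
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


samples = [
    '<svg xmlns="http://www.w3.org/2000/svg"><rect width="4" height="4"/></svg>',
    "<svg><rect></svg>",   # mismatched tags: not well-formed
    "<html/>",             # well-formed, but the root is not <svg>
    '<svg width="10"/>',   # well-formed, un-namespaced root
]
print(f"SVG validity rate: {svg_validity_rate(samples):.1f}%")
```
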

---

## Risks & Considerations

Developers integrating this model should account for the following risks:

- Generation of malformed or non-functional SVG code
- Inconsistent instruction following across prompt variations
- Unpredictable outputs due to limited training data coverage
- Minor quality reduction inherent to 4-bit quantization

**Recommendation:** Implement downstream validation layers and SVG syntax checking before any rendering or integration.

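
One way such a validation layer might look is sketched below: it parses the output and reports any element outside a small allowlist. The allowlist and the `check_svg` name are hypothetical and chosen for illustration; a real pipeline would tune the list to the graphics it expects and may prefer a dedicated SVG sanitizer.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical allowlist for simple icons and shapes; adjust to your use case.
ALLOWED_TAGS = {
    "svg", "g", "path", "rect", "circle", "ellipse", "line",
    "polyline", "polygon", "text", "defs", "linearGradient",
    "radialGradient", "stop", "title",
}


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def check_svg(markup: str) -> List[str]:
    """Return a list of problems found in markup; an empty list means it passed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "svg":
        problems.append(f"root element is <{local_name(root.tag)}>, expected <svg>")
    for element in root.iter():
        name = local_name(element.tag)
        if name not in ALLOWED_TAGS:
            problems.append(f"unexpected element <{name}>")
    return problems


if __name__ == "__main__":
    output = '<svg xmlns="http://www.w3.org/2000/svg"><script>alert(1)</script></svg>'
    for problem in check_svg(output):
        print("rejected:", problem)
```

Because the check is an allowlist, elements such as `<script>` are rejected without being named explicitly, which keeps the gate conservative when model output is unpredictable.
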

---

## Future Work

The following improvements are planned for upcoming versions of the Nexora Vector series:

- [ ] Expanded and more diverse training dataset
- [ ] Improved SVG syntax correctness and validity rates
- [ ] Reduced hallucination rates
- [ ] Enhanced natural language understanding for complex prompts
- [ ] Support for richer vector compositions and multi-element scenes
- [ ] Formal benchmark evaluation suite
- [ ] Updated MLX quantized releases aligned with future model versions


---

## Community & Support

Join the community for updates and discussion:

💬 **[Join our Discord Server](https://discord.gg/mwdrgYbzuG)**


---

## License

This model is released under the **Apache License 2.0**.

You may use, modify, and distribute this model in accordance with the terms of the Apache 2.0 license. See the [LICENSE](./LICENSE) file for full details, or refer to the [official Apache 2.0 license text](https://www.apache.org/licenses/LICENSE-2.0).


---

## Acknowledgements

This is an official ArkAiLabs release, published under the **[Open4bits](https://huggingface.co/Open4bits)** project, ArkAiLabs' dedicated initiative for quantized model releases. The MLX 4-bit weights are derived from **[Nexora-Vector-v0.1](https://huggingface.co/ArkAiLab-Adl/nexora-vector-v0.1)**, which is itself built upon **[Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)** by the Qwen team. We thank the MLX team at Apple and the open-source AI community for their continued contributions that make projects like this possible.


---

## About Nexora & Open4bits

**Nexora** is an experimental AI initiative under **ArkAiLabs**, focused on building lightweight, practical, and creative AI systems for real-world applications. The Nexora Vector series represents our exploration into AI-assisted vector graphics generation.

**Open4bits** is ArkAiLabs' official project for quantized model releases, providing optimized variants of Nexora models for efficient local inference across different hardware platforms.