# Antiplex-Instruct-3B

A compact, instruction-tuned language model optimized for text generation, intent classification, and SQL reasoning.
## Overview
Antiplex-Instruct-3B is a high-performance instruction-tuned language model developed by QuantaSparkLabs. Released in 2026, this model is engineered for dual-task capability, delivering accurate identity alignment, reliable SQL generation, and strong general reasoning, while remaining lightweight and efficient.
The model is fine-tuned using LoRA (PEFT) on curated datasets emphasizing identity consistency and structured reasoning, making it ideal for edge deployment and specialized assistant roles.
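To give a rough sense of why LoRA keeps fine-tuning cheap: a LoRA adapter for a weight matrix adds only r·(d_in + d_out) parameters. The sketch below is a back-of-envelope count assuming adapters on the four attention projections of a Phi-3-mini-sized model (hidden size 3072, 32 layers); the actual target modules of this fine-tune are not documented in this card.

```python
# Back-of-envelope LoRA parameter count. Dimensions are assumptions for
# a Phi-3-mini-sized model; the real adapter targets may differ.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA factors the weight update as B @ A with A: (r, d_in), B: (d_out, r)
    return r * d_in + d_out * r

hidden, layers, r = 3072, 32, 16
per_layer = 4 * lora_params(hidden, hidden, r)  # assume q/k/v/o projections
total = layers * per_layer
print(f"~{total / 1e6:.1f}M trainable LoRA parameters")  # ~12.6M
```

That is roughly 0.3% of the ~3.8B base parameters, which is why LoRA fine-tuning fits in modest VRAM.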
## Core Features
| Task Versatility | Performance Optimized |
|---|---|
| Text Generation: SQL/NLP, creative writing, technical explanations. | LoRA Fine-tuning: Efficient parameter adaptation. |
| Classification: Intent detection, task routing, safety filtering. | Identity Alignment: Consistent persona across interactions. |
| Dual-Mode: Single model handling generation + classification. | Lightweight: ~3.8B parameters, edge-friendly VRAM footprint. |
## Performance Benchmarks

### Accuracy Metrics
| Task | Accuracy | Confidence |
|---|---|---|
| Identity Verification | 100% | ★★★★★ |
| SQL Generation | 100% | ★★★★★ |
| General Reasoning | 90% | ★★★★ |
### Reliability Assessment

21-Test Internal Validation Suite
- Passed: 18 tests (85.7%)
- Failed: 3 tests (14.3%)
- Overall Grade: B (Good)

#### Detailed Test Categories
| Category | Tests | Passed | Rate |
|---|---|---|---|
| Identity Tasks | 7 | 7 | 100% |
| SQL Generation | 6 | 6 | 100% |
| Reasoning | 5 | 3 | 60% |
| Classification | 3 | 2 | 66.7% |
Test Dataset: QuantaSparkLabs/antiplex-test-suite
## Model Architecture

### Training Pipeline
```mermaid
graph TD
    A[Base Model Phi-3-mini] --> B[LoRA Fine-tuning]
    B --> C[Task-Specific Heads]
    C --> D[Text Generation Head]
    C --> E[Classification Head]
    D --> F[Generation Output]
    E --> G[Classification Output]
    H[Instruction Dataset] --> B
    I[SQL Dataset] --> B
    J[Identity Dataset] --> B
```
### Inference Flow
```text
User Prompt → Tokenization → Antiplex Core → Task Router
                                                  ↓
                 [Generation/Classification] → Post-processing → Output
```
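The routing step above can be sketched as a simple prefix check. The `[CLASSIFY]` tag convention is borrowed from the Classification Mode example later in this card; how the model actually routes internally is not documented, so treat this as an illustration only.

```python
# Hypothetical task router: dispatch on the [CLASSIFY] prompt tag.
def route(prompt: str) -> str:
    if prompt.lstrip().startswith("[CLASSIFY]"):
        return "classification"
    return "generation"

print(route("[CLASSIFY]\nUser Query: reset my password"))  # classification
print(route("Write an SQL query to fetch users."))         # generation
```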
## Technical Specifications
| Parameter | Value |
|---|---|
| Base Model | unsloth/Phi-3-mini-4k-instruct-bnb-4bit |
| Fine-tuning | LoRA (PEFT) |
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Optimizer | AdamW (β₁=0.9, β₂=0.999) |
| Learning Rate | 2e-4 |
| Batch Size | 8 |
| Epochs | 3 |
| Total Parameters | ~3.8B |
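For reference, the hyperparameters above can be gathered into a single config object; the field names below are illustrative and do not follow any particular trainer's schema. One derived quantity worth noting is the LoRA scaling factor α/r.

```python
# The specification table as a config dict (illustrative field names).
train_config = {
    "base_model": "unsloth/Phi-3-mini-4k-instruct-bnb-4bit",
    "lora_r": 16,
    "lora_alpha": 32,
    "optimizer": "adamw",
    "betas": (0.9, 0.999),
    "learning_rate": 2e-4,
    "batch_size": 8,
    "epochs": 3,
}

# LoRA multiplies the adapter output by alpha / r.
scaling = train_config["lora_alpha"] / train_config["lora_r"]
print(f"LoRA scaling factor: {scaling}")  # 2.0
```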
### Dataset Composition
| Dataset Type | Samples | Purpose |
|---|---|---|
| Identity Alignment | 30 | Consistent persona training |
| SQL Generation | 300 | Structured query training |
| Instruction Tuning | 2,500 | General capability enhancement |
| Classification | 1,000 | Intent detection training |
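The mixture above is heavily skewed toward instruction tuning, with the identity set making up under 1% of samples; a quick calculation makes the proportions explicit:

```python
# Dataset mixture from the table above.
samples = {
    "identity": 30,
    "sql": 300,
    "instruction": 2500,
    "classification": 1000,
}
total = sum(samples.values())
for name, n in samples.items():
    print(f"{name:>14}: {n / total:6.1%}")
print(f"{'total':>14}: {total}")
```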
## Quick Start

### Installation

```bash
pip install transformers torch accelerate
```
### Basic Usage (Text Generation)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "QuantaSparkLabs/Antiplex-instruct-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Write an SQL query to fetch users created in the last 30 days."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Classification Mode

```python
# Intent classification example (reuses `tokenizer` and `model` from above)
classification_prompt = """[CLASSIFY]
User Query: "I need to reset my account password"
Categories: account_issue, technical_support, billing, general_inquiry
"""

inputs = tokenizer(classification_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,  # greedy decoding; temperature has no effect when sampling is off
)

# Naive parsing: take the contents of the last [...] tag in the decoded output.
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
detected_intent = decoded.split('[')[-1].split(']')[0]
print(f"Detected Intent: {detected_intent}")
```
### Chat Interface

```python
from transformers import pipeline

chatbot = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,
)

messages = [
    {"role": "system", "content": "You are Antiplex, a helpful AI assistant specialized in SQL and classification tasks."},
    {"role": "user", "content": "Classify this intent: 'Can you help me with invoice generation?' Then write a SQL query to find recent invoices."},
]

response = chatbot(messages, max_new_tokens=512, temperature=0.7)
print(response[0]["generated_text"][-1]["content"])
```
## Deployment Options

### Hardware Requirements
| Environment | VRAM | Quantization | Speed |
|---|---|---|---|
| GPU (Optimal) | 8-12 GB | FP16 | Fast |
| GPU (Efficient) | 4-6 GB | INT8 | Fast |
| CPU | N/A | FP32 | Slow |
| Edge Device | 2-4 GB | INT4 | Fast |
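The VRAM figures in the table follow roughly from weight size alone plus some headroom. A rough estimator (the 20% overhead factor is an assumption; real usage also depends on context length and KV cache):

```python
def vram_gb(params_b: float, bits: int, overhead: float = 1.2) -> float:
    # params_b: parameter count in billions; bits: bits per weight.
    # The 1.2 factor is assumed headroom for activations and KV cache.
    return params_b * (bits / 8) * overhead

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{vram_gb(3.8, bits):.1f} GB")
```

For the ~3.8B parameters here, this gives roughly 9 GB at FP16, 4.6 GB at INT8, and 2.3 GB at INT4, consistent with the table above.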
### Cloud Deployment (Docker)

```dockerfile
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "app.py"]
```
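The Dockerfile expects an `app.py`, which is not shown in the repository listing below. As an illustration only, here is a minimal, dependency-free sketch of what such a file could look like; the `generate_reply` placeholder stands in for the Quick Start generation code.

```python
# Hypothetical app.py matching the Dockerfile above (illustrative; not
# part of the published repository).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate_reply(prompt: str) -> dict:
    # Placeholder: wire in the tokenizer/model from the Quick Start here.
    return {"prompt": prompt, "completion": "<model output>"}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(generate_reply(payload.get("prompt", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To serve on the port the Dockerfile EXPOSEs (blocks forever):
# HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()
```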
## Repository Structure

```text
Antiplex-Instruct-3B/
├── README.md
├── model.safetensors
├── config.json
├── tokenizer.json
├── tokenizer_config.json
├── generation_config.json
├── special_tokens_map.json
├── quantasparklogo.png
├── examples/
│   ├── classification_demo.py
│   ├── sql_generation_demo.py
│   └── chat_interface.py
└── evaluation/
    └── test_results.json
```
## Limitations & Safety

### Known Limitations
- Domain Specificity: Not trained for medical/legal/safety-critical domains
- Bias Inheritance: May reflect biases in training data
- Context Window: Limited to 4K tokens
- Multilingual: Primarily English-focused
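Because of the 4K-token context limit, long inputs need to be truncated or chunked before generation. A minimal sliding-window chunker over token IDs (the 128-token overlap is an arbitrary choice to preserve some continuity between chunks):

```python
def chunk_tokens(token_ids, max_len=4096, overlap=128):
    # Split a long token sequence into overlapping windows that each
    # fit the model's 4K context.
    step = max_len - overlap
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), step)]

chunks = chunk_tokens(list(range(10_000)))
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 3 4096 2064
```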
### Safety Guidelines

```python
# Recommended safety wrapper: a naive substring blocklist. This is a
# placeholder pattern only; it misses paraphrases and can flag benign
# text (e.g. "exploit" in a security tutorial), so pair it with a real
# moderation layer in production.
def safety_check(text):
    blocked_terms = ["harmful", "dangerous", "illegal", "exploit"]
    if any(term in text.lower() for term in blocked_terms):
        return "Content filtered for safety reasons."
    return text
```
## Version History

| Version | Date | Changes |
|---|---|---|
| v1.0.0 | 2026-01-01 | Initial release |
| v1.1.0 | 2026-01-10 | Enhanced classification head |
| v1.2.0 | 2026-01-25 | SQL generation improvements |
## License & Citation

License: Apache 2.0

Citation:

```bibtex
@misc{antiplex2026,
  title={Antiplex-Instruct-3B: A Dual-Task Instruction-Tuned Language Model},
  author={QuantaSparkLabs},
  year={2026},
  url={https://huggingface.co/QuantaSparkLabs/Antiplex-instruct-3B}
}
```
## Credits & Acknowledgments
- Base Model: Microsoft Phi-3 Mini team
- Fine-tuning Framework: Unsloth for efficient LoRA training
- Evaluation: Internal QuantaSparkLabs team
- Testing: Community contributors
## Contributing & Support

### Reporting Issues
Please open an issue on our repository with:
- Model version
- Reproduction steps
- Expected vs actual behavior
Built with ❤️ by QuantaSparkLabs

Model ID: Antiplex-Instruct-3B • Parameters: ~3.8B • Release: 2026

Special thanks to Microsoft!