Instructions for using thaddickson/Delphi-7B-v1 with libraries and local apps.

Libraries

How to use thaddickson/Delphi-7B-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="thaddickson/Delphi-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; a bare expression only shows output in a notebook or REPL
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("thaddickson/Delphi-7B-v1")
model = AutoModelForCausalLM.from_pretrained("thaddickson/Delphi-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
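inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, dropping the prompt and special tokens
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))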

Local Apps

vLLM

How to use thaddickson/Delphi-7B-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "thaddickson/Delphi-7B-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thaddickson/Delphi-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
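
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` are made up for illustration and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://localhost:8000",  # assumed host and port, matching the curl call above
    "thaddickson/Delphi-7B-v1",
    "What is the capital of France?",
)
print(request.full_url)
# With the server running, uncomment to get the reply:
# print(send_chat_request(request))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.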

Use Docker

docker model run hf.co/thaddickson/Delphi-7B-v1

SGLang

How to use thaddickson/Delphi-7B-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "thaddickson/Delphi-7B-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thaddickson/Delphi-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "thaddickson/Delphi-7B-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thaddickson/Delphi-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use thaddickson/Delphi-7B-v1 with Docker Model Runner:
```
docker model run hf.co/thaddickson/Delphi-7B-v1
```

Delphi-7B-v1 / README.md

thaddickson

Update model card for voice-v3

45ef17c verified 3 months ago


	---
	language:
	- en
	license: apache-2.0
	library_name: transformers
	tags:
	- mergekit
	- model_stock
	- slerp
	- lora
	- qwen2
	- healthcare
	- cybersecurity
	- reasoning
	base_model:
	- newsbang/Homer-v1.0-Qwen2.5-7B
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	- fblgit/cybertron-v4-qw7B-MGS
	- bespokelabs/Bespoke-Stratos-7B
	- Qwen/Qwen2.5-Math-7B-Instruct
	- Orion-zhen/Qwen2.5-7B-Instruct-Uncensored
	---

	# Delphi-7B-v1

	Delphi Cyber Pro | Truth wrapped in complexity

	## The Model

	Delphi-7B is a 7.6B-parameter reasoning model built for healthcare cybersecurity, clinical operations, and cross-domain problem solving. It combines a six-model merge of Qwen 2.5 7B specialists with multi-stage training: LoRA refinement, SLERP blending, and voice SFT on expert reasoning pairs.

	Built by Thaddeus Dickson, CEO of Xpio Health. 20 years of healthcare cybersecurity and compliance expertise baked into the training data.

	This is Chapter 1 of a three-part build. The general model proves the pipeline. What comes next is the point.

	## Source Models

	| Model | Role |
	|---|---|
	| Homer-v1.0-Qwen2.5-7B | Base. Strongest instruction follower in the 7B bracket. |
	| DeepSeek-R1-Distill-Qwen-7B | Reasoning. Chain-of-thought distilled from DeepSeek-R1. |
	| cybertron-v4-qw7B-MGS | Math and multi-task. Shows up in every competitive 7B merge. |
	| Bespoke-Stratos-7B | Reasoning distillation. 17K clean examples, proven results. |
	| Qwen2.5-Math-7B-Instruct | Pure math specialist. |
	| Qwen2.5-7B-Instruct-Uncensored | Breadth. Says what it means. |

	## Training Pipeline

	Stage 1 — Merge. model_stock merge of 6 specialists using mergekit. Homer base provides instruction following, DeepSeek-R1-Distill brings chain-of-thought, cybertron and Math-7B cover quantitative tasks.
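
The exact merge configuration is not published in this card. As a rough sketch, a `model_stock` config for mergekit could be assembled as below, assuming Homer is the `base_model` and the other five source models are listed under `models`. The `dtype` value and the helper name `build_model_stock_config` are assumptions for illustration. The config is printed as JSON, which YAML parsers generally accept.

```python
import json

# Source models taken from the card's base_model list; Homer is assumed to be the merge base.
BASE_MODEL = "newsbang/Homer-v1.0-Qwen2.5-7B"
SPECIALISTS = [
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    "fblgit/cybertron-v4-qw7B-MGS",
    "bespokelabs/Bespoke-Stratos-7B",
    "Qwen/Qwen2.5-Math-7B-Instruct",
    "Orion-zhen/Qwen2.5-7B-Instruct-Uncensored",
]


def build_model_stock_config(base_model, specialists, dtype="bfloat16"):
    """Return a dict using mergekit's config keys for a model_stock merge."""
    return {
        "merge_method": "model_stock",
        "base_model": base_model,
        "models": [{"model": name} for name in specialists],
        "dtype": dtype,  # assumed; the card does not state the merge dtype
    }


config = build_model_stock_config(BASE_MODEL, SPECIALISTS)
print(json.dumps(config, indent=2))
```

The printed output could be saved to a file and passed to mergekit's command-line tool. Whether the original build used these exact settings is not known.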

	Stage 2 — LoRA. Two rounds of LoRA refinement (rank 32, alpha 64) on 8x NVIDIA H100 80GB. Round 1: 5K math samples. Round 2: 5K math + 10K MMLU-Pro + 27 expert reasoning pairs. Preserved IFEval while improving MATH and MMLU-Pro.
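
To put rank 32 and alpha 64 in perspective: in the standard LoRA formulation the low-rank update is scaled by alpha / rank, and each adapted weight matrix gains rank × (d_in + d_out) trainable parameters. The sketch below works through that arithmetic. The 3584 × 3584 layer shape is an assumed example, not a detail from this card.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Scale factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in A (rank x d_in) plus B (d_out x rank) for one adapted matrix."""
    return rank * (d_in + d_out)


RANK, ALPHA = 32, 64      # values stated in Stage 2
D_IN = D_OUT = 3584       # assumed layer shape, for illustration only

scaling = lora_scaling(RANK, ALPHA)
adapter_params = lora_trainable_params(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params:,} of {full_params:,} "
      f"({adapter_params / full_params:.2%})")
```

With alpha set to twice the rank, the adapter update is applied at a scale of 2.0.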

	Stage 3 — SLERP. Blended full-SFT knowledge model (142K mixed samples, 3000 steps on H100s) with LoRA-refined model. Weight sweep across t=0.25, 0.35, 0.45, 0.55. Winner: t=0.55 — best IFEval + MATH + MMLU-Pro balance.
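
For readers unfamiliar with SLERP, here is a minimal pure-Python sketch of spherical linear interpolation between two weight vectors, swept over the same t values. It is not the code used for this build, and which model sits at t=0 versus t=1 is an assumption here. The toy vectors stand in for flattened weight tensors.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (t=0 gives v0, t=1 gives v1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two directions, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or antiparallel) vectors: use linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy stand-ins for the two models' weights (hypothetical values).
lora_refined = [1.0, 0.0]
full_sft = [0.0, 1.0]

for t in (0.25, 0.35, 0.45, 0.55):
    blended = slerp(lora_refined, full_sft, t)
    print(t, [round(x, 4) for x in blended])
```

Unlike plain averaging, SLERP follows the arc between the two directions, so the blend of two unit vectors keeps unit length at every t.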

	Stage 4 — Voice SFT. QLoRA on RTX 5090. 308 hand-crafted domain examples teaching direct, specific, no-hedging responses that name exact standards (45 CFR citations, NIST SP references, CARC codes). Combined with 530 Claude-generated IFEval constraint-following examples. This stage transformed the model from generic Qwen output to domain-expert voice.

	Expert reasoning pairs carved from a literary background and a poetic mind, infused with 20 years of cyber and software experience. Diagnostic frameworks. Root cause tracing. Cross-domain problem solving.

	## Scores

	Open LLM Leaderboard v2 benchmarks (lm-eval-harness, leaderboard_* tasks, chat template applied):

	| Benchmark | Score |
	|---|---|
	| IFEval (prompt strict) | 0.500 |
	| IFEval (inst strict) | 0.605 |
	| MATH Hard | 0.187 |
	| MMLU-Pro | 0.420 |
	| BBH | ~0.48 |
	| GPQA Diamond | ~0.31 |
	| MuSR | ~0.37 |

	IFEval, MATH Hard, and MMLU-Pro come from the full eval on the SLERP t=0.55 base. Voice SFT raised IFEval prompt strict from 0.39 to 0.50. BBH, GPQA Diamond, and MuSR are approximate and come from the LoRA Round 1 eval.
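
The two IFEval rows differ in what counts as a hit. As the metrics are usually defined, prompt-level accuracy credits a prompt only if every instruction in it was followed, while instruction-level accuracy credits each instruction separately. The sketch below shows the difference on made-up results. The data and the function name `ifeval_accuracies` are illustrative and are not taken from lm-eval-harness.

```python
def ifeval_accuracies(results):
    """Return (prompt-level, instruction-level) accuracy.

    `results` holds one list per prompt, with one boolean per instruction
    saying whether that instruction was followed.
    """
    prompt_hits = sum(all(flags) for flags in results)
    inst_hits = sum(sum(flags) for flags in results)
    inst_total = sum(len(flags) for flags in results)
    return prompt_hits / len(results), inst_hits / inst_total


# Made-up outcomes for four prompts carrying one or two instructions each.
toy_results = [[True], [True, False], [True, True], [False]]

prompt_acc, inst_acc = ifeval_accuracies(toy_results)
print(f"prompt-level: {prompt_acc:.3f}, instruction-level: {inst_acc:.3f}")
```

A prompt with one missed instruction still earns partial credit at the instruction level, which is why that score is never lower than the prompt-level score.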

	## What Makes Delphi Different

	Ask ChatGPT about a HIPAA breach and you get a Wikipedia article. Ask Delphi and you get the specific CFR citations, the exact steps for breach notification, the realistic timeline, and the business impact.

	Delphi names specific standards (45 CFR 164.312, NIST SP 800-66), specific tools (Mirth Connect, Prowler, Burp Suite), and specific codes (CARC CO-4, ICD-10). It connects technical findings to business impact. It does not hedge when it knows the answer. It says "I don't know" when it doesn't.

	## The Oracle Philosophy

	The ancient Oracle at Delphi did not give people answers. She gave them frames through which to understand their questions. That is the design philosophy: teach people how to think about the problem, not just what the answer is.

	## The Roadmap

	Chapter 1: Delphi-7B — General reasoning model. You are looking at it.

	Chapter 2: Delphi-72B-Cyber — Healthcare cybersecurity specialist. HIPAA, NIST RMF, pen test analysis, FDA submissions.

	Chapter 3: Delphi-Health — Trained on de-identified clinical data for targeted analysis.

	## Who Built This

	Thaddeus Dickson. CEO of Xpio Health, CISO, 20 years in healthcare cybersecurity and compliance.

	## License

	Apache 2.0.