
thaddickson committed on Mar 17

Commit 45ef17c (verified)

1 Parent(s): 9775331

Update model card for voice-v3

Files changed (1)

README.md +32 -27

README.md CHANGED

@@ -7,6 +7,7 @@ tags:
   - mergekit
   - model_stock
   - slerp
   - qwen2
   - healthcare
   - cybersecurity
@@ -26,7 +27,9 @@ base_model:
 ## The Model
-Delphi-7B is a 7.6B parameter reasoning model built on a 6-model merge of Qwen 2.5 7B specialists, trained with mixed-domain SFT on 5090s and H100 GPUs in a weekend. It combines math reasoning, chain-of-thought logic, and instruction following into a single general-purpose model — then seasons it with hand-written expert reasoning pairs from two decades of healthcare cybersecurity work.
 This is Chapter 1 of a three-part build. The general model proves the pipeline. What comes next is the point.
@@ -41,53 +44,55 @@ This is Chapter 1 of a three-part build. The general model proves the pipeline.
 | Qwen2.5-Math-7B-Instruct | Pure math specialist. |
 | Qwen2.5-7B-Instruct-Uncensored | Breadth. Says what it means. |
-Merged with model_stock, normalize: false, int8_mask: true, bfloat16.
 ## Training Pipeline
-**Stage 1 — Merge.** model_stock merge of 6 Qwen 2.5 7B specialists using mergekit. Homer base provides instruction following, DeepSeek-R1-Distill brings chain-of-thought reasoning, cybertron and Math-7B cover quantitative tasks, Stratos adds reasoning distillation, Uncensored adds breadth. Logit lens confirmed clean merge with 0% oscillation.
-**Stage 2 — SFT.** Full bf16 SFT on 8x NVIDIA H100 80GB. Mixed data: 40% math, 25% instruction, 20% reasoning, 15% general knowledge. 142K samples, lr=1e-5, 3000 steps, effective batch 64. Diagnostic callbacks every 500 steps confirmed zero catastrophic forgetting across all checkpoints.
-**Stage 3 — LoRA.** LoRA refinement (rank 32, alpha 64) targeting math reasoning and instruction following recovery. 15K samples including 5K math, 10K MMLU-Pro, and 27 hand-crafted expert reasoning pairs.
-**Stage 4 — SLERP.** SLERP merge combining SFT knowledge depth with LoRA instruction discipline. Weight sweep across four t values (0.25, 0.35, 0.45, 0.55), evaluated on divergent benchmarks to find optimal balance. Winner: t=0.55.
-Includes hand-crafted expert reasoning pairs carved from a literary background and a poetic mind, infused with 20 years of cyber and software experience. Diagnostic frameworks. Root cause tracing. Cross-domain problem solving. Art, science, and philosophy, pointed towards the north star of good, fairness, and reasonable cognition.
 ## Scores
-Open LLM Leaderboard v2 benchmarks. All scores from lm-eval-harness using leaderboard_* tasks, full sample, chat template applied.
-| Benchmark | Delphi-7B | Falcon3-7B-Instruct (#1 on LB) | Delta |
-|---|---|---|---|
-| IFEval (strict) | 0.4573 | 0.7612 | -0.304 |
-| BBH | ~0.52* | 0.3792 | +0.14 |
-| MATH Level 5 | 0.1873 | 0.3187 | -0.131 |
-| GPQA Diamond | ~0.31* | 0.0805 | +0.23 |
-| MuSR | ~0.43* | 0.2117 | +0.22 |
-| MMLU-Pro | 0.4198 | 0.3430 | +0.077 |
-| **Average** | **~0.387** | **0.3491** | **+0.038** |
-*BBH, GPQA, MuSR projected from SFT v1 baseline — full eval pending. IFEval, MATH, MMLU-Pro confirmed on SLERP t=0.55 winner.*
-## Evaluation Methodology
-All scores from [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness) using `leaderboard_*` tasks, full sample count, chat template applied via `--apply_chat_template`. IFEval score is the average of `prompt_level_strict_acc` and `inst_level_strict_acc` per leaderboard methodology.
 ## The Roadmap
-**Chapter 1: Delphi-7B** — General reasoning model. Open LLM Leaderboard v2. You're looking at it.
-**Chapter 2: Delphi-72B-Cyber** — Healthcare cybersecurity specialist. HIPAA risk assessment. NIST RMF mapping. Pen test analysis. FDA 510(k) submissions. Trained on domain expertise.
-**Chapter 3: Delphi-Health** — Trained on de-identified data from blended sources for targeted analysis.
 ## Who Built This
-Thaddeus Dickson. CEO of Xpio Health, Co-Founder of Pryzmatech, and CTO and CISO across healthcare cybersecurity, compliance, and healthcare domains.
-The Oracle of Delphi is the philosophy: don't give people answers. Teach them how to think about the problem.
 ## License

   - mergekit
   - model_stock
   - slerp
+  - lora
   - qwen2
   - healthcare
   - cybersecurity
 ## The Model
+Delphi-7B is a 7.6B parameter reasoning model built for healthcare cybersecurity, clinical operations, and cross-domain problem solving. It combines a 6-model merge of Qwen 2.5 7B specialists with multi-stage training: LoRA refinement, SLERP blending, and voice SFT from expert reasoning pairs.
+Built by Thaddeus Dickson, CEO of Xpio Health. 20 years of healthcare cybersecurity and compliance expertise baked into the training data.
 This is Chapter 1 of a three-part build. The general model proves the pipeline. What comes next is the point.
 | Qwen2.5-Math-7B-Instruct | Pure math specialist. |
 | Qwen2.5-7B-Instruct-Uncensored | Breadth. Says what it means. |
 ## Training Pipeline
+**Stage 1 — Merge.** model_stock merge of 6 specialists using mergekit. Homer base provides instruction following, DeepSeek-R1-Distill brings chain-of-thought, cybertron and Math-7B cover quantitative tasks.
+**Stage 2 — LoRA.** Two rounds of LoRA refinement (rank 32, alpha 64) on 8x NVIDIA H100 80GB. Round 1: 5K math samples. Round 2: 5K math + 10K MMLU-Pro + 27 expert reasoning pairs. Preserved IFEval while improving MATH and MMLU-Pro.
+**Stage 3 — SLERP.** Blended full-SFT knowledge model (142K mixed samples, 3000 steps on H100s) with LoRA-refined model. Weight sweep across t=0.25, 0.35, 0.45, 0.55. Winner: t=0.55 — best IFEval + MATH + MMLU-Pro balance.
+**Stage 4 — Voice SFT.** QLoRA on RTX 5090. 308 hand-crafted domain examples teaching direct, specific, no-hedging responses that name exact standards (45 CFR citations, NIST SP references, CARC codes). Combined with 530 Claude-generated IFEval constraint-following examples. This stage transformed the model from generic Qwen output to domain-expert voice.
+Expert reasoning pairs carved from a literary background and a poetic mind, infused with 20 years of cyber and software experience. Diagnostic frameworks. Root cause tracing. Cross-domain problem solving.
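For readers unfamiliar with the SLERP stage, here is a minimal sketch of spherical linear interpolation applied per tensor, swept over the four t values the card lists. It is not the author's merge script and not mergekit's implementation. The `slerp` helper, the toy state dicts, and the choice of which checkpoint sits at t=0 are all assumptions made for illustration; the card does not say which model is the t=0 endpoint.

```python
import numpy as np


def slerp(t, w0, w1, eps=1e-8):
    """Spherically interpolate between two same-shaped weight arrays.

    t=0 returns w0 and t=1 returns w1 (an assumed convention).
    """
    a = w0.ravel().astype(np.float64)
    b = w1.ravel().astype(np.float64)

    # Angle between the two flattened weight vectors.
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)

    if sin_omega < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        out = (1.0 - t) * a + t * b
    else:
        out = (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / sin_omega
    return out.reshape(w0.shape)


# Toy stand-ins for the two checkpoints (hypothetical names and shapes).
rng = np.random.default_rng(0)
sft_weights = {"layer.weight": rng.normal(size=(4, 4))}
lora_weights = {"layer.weight": rng.normal(size=(4, 4))}

# Sweep the four t values named in the card; each candidate would then be
# evaluated separately to pick a winner.
merged_by_t = {
    t: {name: slerp(t, sft_weights[name], lora_weights[name]) for name in sft_weights}
    for t in (0.25, 0.35, 0.45, 0.55)
}
```

In a real merge the dictionaries would hold the full model state and each candidate would be scored on the benchmarks before choosing t; the sketch only shows the interpolation and the sweep structure.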
 ## Scores
+Open LLM Leaderboard v2 benchmarks (lm-eval-harness, leaderboard_* tasks, chat template applied):
+| Benchmark | Score |
+|---|---|
+| IFEval (prompt strict) | 0.500 |
+| IFEval (inst strict) | 0.605 |
+| MATH Hard | 0.187 |
+| MMLU-Pro | 0.420 |
+| BBH | ~0.48 |
+| GPQA Diamond | ~0.31 |
+| MuSR | ~0.37 |
+IFEval, MATH, MMLU-Pro from full eval on SLERP t=0.55 base. Voice SFT improved IFEval from 0.39 to 0.50 prompt strict. BBH, GPQA, MuSR from LoRA R1 eval.
+## What Makes Delphi Different
+Ask ChatGPT about a HIPAA breach and you get a Wikipedia article. Ask Delphi and you get the specific CFR citations, the exact steps for breach notification, the realistic timeline, and the business impact.
+Delphi names specific standards (45 CFR 164.312, NIST SP 800-66), specific tools (Mirth Connect, Prowler, Burp Suite), and specific codes (CARC CO-4, ICD-10). It connects technical findings to business impact. It does not hedge when it knows the answer. It says "I don't know" when it doesn't.
+## The Oracle Philosophy
+The ancient Oracle at Delphi did not give people answers. She gave them frames through which to understand their questions. That is the design philosophy: teach people how to think about the problem, not just what the answer is.
 ## The Roadmap
+**Chapter 1: Delphi-7B** — General reasoning model. You are looking at it.
+**Chapter 2: Delphi-72B-Cyber** — Healthcare cybersecurity specialist. HIPAA, NIST RMF, pen test analysis, FDA submissions.
+**Chapter 3: Delphi-Health** — Trained on de-identified clinical data for targeted analysis.
 ## Who Built This
+Thaddeus Dickson. CEO of Xpio Health, CISO, 20 years in healthcare cybersecurity and compliance.
 ## License