How to use thaddickson/Delphi-7B-v1 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="thaddickson/Delphi-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("thaddickson/Delphi-7B-v1")
model = AutoModelForCausalLM.from_pretrained("thaddickson/Delphi-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]

# Apply the chat template and move the tensors to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
How to use thaddickson/Delphi-7B-v1 with vLLM:
Install from pip and serve model

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "thaddickson/Delphi-7B-v1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "thaddickson/Delphi-7B-v1",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker

```shell
docker model run hf.co/thaddickson/Delphi-7B-v1
```
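The same chat-completions call can be made from Python. The following is a minimal sketch using only the standard library, assuming a vLLM server is already listening on `localhost:8000` as in the curl example. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "thaddickson/Delphi-7B-v1"
# Assumed endpoint: same host, port, and path as the curl example.
CHAT_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(prompt, model=MODEL_ID, url=CHAT_URL):
    """Build (but do not send) a POST request for the OpenAI-compatible chat API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` would print the model's reply. Building the request is kept separate from sending it so the payload can be inspected without a live server.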
How to use thaddickson/Delphi-7B-v1 with SGLang:

Install from pip and serve model

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "thaddickson/Delphi-7B-v1" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "thaddickson/Delphi-7B-v1",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "thaddickson/Delphi-7B-v1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "thaddickson/Delphi-7B-v1",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
How to use thaddickson/Delphi-7B-v1 with Docker Model Runner:
```shell
docker model run hf.co/thaddickson/Delphi-7B-v1
```
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mergekit
- model_stock
- slerp
- lora
- qwen2
- healthcare
- cybersecurity
- reasoning
base_model:
- newsbang/Homer-v1.0-Qwen2.5-7B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- fblgit/cybertron-v4-qw7B-MGS
- bespokelabs/Bespoke-Stratos-7B
- Qwen/Qwen2.5-Math-7B-Instruct
- Orion-zhen/Qwen2.5-7B-Instruct-Uncensored
---
# Delphi-7B-v1

**Delphi Cyber Pro | Truth wrapped in complexity**

## The Model

Delphi-7B is a 7.6B parameter reasoning model built for healthcare cybersecurity, clinical operations, and cross-domain problem solving. It combines a 6-model merge of Qwen 2.5 7B specialists with multi-stage training: LoRA refinement, SLERP blending, and voice SFT from expert reasoning pairs.

Built by Thaddeus Dickson, CEO of Xpio Health. 20 years of healthcare cybersecurity and compliance expertise baked into the training data.

This is Chapter 1 of a three-part build. The general model proves the pipeline. What comes next is the point.
## Source Models

| Model | Role |
|---|---|
| Homer-v1.0-Qwen2.5-7B | Base. Strongest instruction follower in the 7B bracket. |
| DeepSeek-R1-Distill-Qwen-7B | Reasoning. Chain-of-thought distilled from DeepSeek-R1. |
| cybertron-v4-qw7B-MGS | Math and multi-task. Shows up in every competitive 7B merge. |
| Bespoke-Stratos-7B | Reasoning distillation. 17K clean examples, proven results. |
| Qwen2.5-Math-7B-Instruct | Pure math specialist. |
| Qwen2.5-7B-Instruct-Uncensored | Breadth. Says what it means. |
## Training Pipeline

**Stage 1: Merge.** A model_stock merge of 6 specialists using mergekit. The Homer base provides instruction following, DeepSeek-R1-Distill brings chain-of-thought, and cybertron and Math-7B cover quantitative tasks.
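A model_stock merge in mergekit is driven by a short YAML config. The sketch below renders one in Python from the models listed in this card. It is an illustration, not the config actually used for Delphi: the `bfloat16` dtype, the choice to list only the five non-base models under `models`, and the helper name `render_model_stock_config` are all assumptions.

```python
BASE_MODEL = "newsbang/Homer-v1.0-Qwen2.5-7B"

# The five non-base specialists named in the Source Models table.
DONOR_MODELS = [
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    "fblgit/cybertron-v4-qw7B-MGS",
    "bespokelabs/Bespoke-Stratos-7B",
    "Qwen/Qwen2.5-Math-7B-Instruct",
    "Orion-zhen/Qwen2.5-7B-Instruct-Uncensored",
]


def render_model_stock_config(base, donors, dtype="bfloat16"):
    """Render a mergekit-style YAML config for a model_stock merge.

    dtype is an assumed value; the card does not state the merge precision.
    """
    lines = [
        "merge_method: model_stock",
        f"base_model: {base}",
        "models:",
    ]
    lines += [f"  - model: {name}" for name in donors]
    lines.append(f"dtype: {dtype}")
    return "\n".join(lines) + "\n"


config_text = render_model_stock_config(BASE_MODEL, DONOR_MODELS)
print(config_text)
```

Saving the printed text to a file (for example a hypothetical `delphi-stock.yml`) and passing it to `mergekit-yaml` along with an output directory would run a merge of this shape.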

**Stage 2: LoRA.** Two rounds of LoRA refinement (rank 32, alpha 64) on 8x NVIDIA H100 80GB. Round 1: 5K math samples. Round 2: 5K math + 10K MMLU-Pro + 27 expert reasoning pairs. This preserved IFEval while improving MATH and MMLU-Pro.
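To make the rank and alpha values concrete: LoRA replaces a weight update with a low-rank product, W' = W + (alpha / rank) * B A, so rank 32 with alpha 64 gives a scaling factor of 2.0. The pure-Python sketch below works through that arithmetic. The hidden size of 3584 is an assumption based on the Qwen2.5-7B configuration (check the model's `config.json`), and the card does not say which modules were adapted.

```python
def lora_scaling(rank, alpha):
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)


def apply_lora(W, A, B, rank, alpha):
    """Return W + (alpha / rank) * (B @ A) using nested lists.

    W is d_out x d_in, B is d_out x rank, A is rank x d_in.
    """
    scale = lora_scaling(rank, alpha)
    d_out, d_in = len(W), len(W[0])
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(d_in)
        ]
        for i in range(d_out)
    ]


# Values from the card: rank 32, alpha 64.
print(lora_scaling(32, 64))  # 2.0

# Assumed hidden size of 3584 for a square attention projection.
full = 3584 * 3584
added = lora_param_count(3584, 3584, 32)
print(added, added / full)  # 229376 trainable parameters, 1/56 of the full matrix

# Tiny worked example with rank 1 and alpha 2 (scaling factor 2.0).
merged = apply_lora(W=[[1, 0], [0, 1]], A=[[1, 0]], B=[[1], [2]], rank=1, alpha=2)
print(merged)  # [[3.0, 0.0], [4.0, 1.0]]
```

The small trainable fraction is what makes two quick refinement rounds on a merged model practical.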

**Stage 3: SLERP.** Blended a full-SFT knowledge model (142K mixed samples, 3000 steps on H100s) with the LoRA-refined model. Weight sweep across t=0.25, 0.35, 0.45, 0.55. Winner: t=0.55, with the best IFEval + MATH + MMLU-Pro balance.
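SLERP (spherical linear interpolation) blends two weight vectors along the arc between them rather than along a straight line, and `t` sets how far along that arc the result sits. The sketch below implements the standard formula on plain Python lists. It is a toy illustration under assumptions: merging tools apply this per weight tensor, and the card does not say which of the two models corresponds to `t=0`.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # Angle is undefined for a zero vector: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    dot = sum(a * b for a, b in zip(v0, v1))
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Sweep the same t values as the card over two toy "weight" vectors.
model_a = [1.0, 0.0]
model_b = [0.0, 1.0]
for t in (0.25, 0.35, 0.45, 0.55):
    print(t, slerp(t, model_a, model_b))
```

For unit vectors the interpolated result stays on the unit circle, which is the property that distinguishes SLERP from a plain weighted average.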

**Stage 4: Voice SFT.** QLoRA on an RTX 5090. 308 hand-crafted domain examples teaching direct, specific, no-hedging responses that name exact standards (45 CFR citations, NIST SP references, CARC codes). Combined with 530 Claude-generated IFEval constraint-following examples. This stage transformed the model from generic Qwen output to a domain-expert voice.

Expert reasoning pairs carved from a literary background and a poetic mind, infused with 20 years of cyber and software experience. Diagnostic frameworks. Root cause tracing. Cross-domain problem solving.
## Scores

Open LLM Leaderboard v2 benchmarks (lm-eval-harness, `leaderboard_*` tasks, chat template applied):

| Benchmark | Score |
|---|---|
| IFEval (prompt strict) | 0.500 |
| IFEval (inst strict) | 0.605 |
| MATH Hard | 0.187 |
| MMLU-Pro | 0.420 |
| BBH | ~0.48 |
| GPQA Diamond | ~0.31 |
| MuSR | ~0.37 |

IFEval, MATH Hard, and MMLU-Pro scores come from a full eval on the SLERP t=0.55 base. Voice SFT improved IFEval (prompt strict) from 0.39 to 0.50. BBH, GPQA, and MuSR scores come from the LoRA Round 1 eval.
## What Makes Delphi Different

Ask ChatGPT about a HIPAA breach and you get a Wikipedia article. Ask Delphi and you get the specific CFR citations, the exact steps for breach notification, the realistic timeline, and the business impact.

Delphi names specific standards (45 CFR 164.312, NIST SP 800-66), specific tools (Mirth Connect, Prowler, Burp Suite), and specific codes (CARC CO-4, ICD-10). It connects technical findings to business impact. It does not hedge when it knows the answer. It says "I don't know" when it doesn't.

## The Oracle Philosophy

The ancient Oracle at Delphi did not give people answers. She gave them frames through which to understand their questions. That is the design philosophy: teach people how to think about the problem, not just what the answer is.
## The Roadmap

**Chapter 1: Delphi-7B.** General reasoning model. You are looking at it.

**Chapter 2: Delphi-72B-Cyber.** Healthcare cybersecurity specialist. HIPAA, NIST RMF, pen test analysis, FDA submissions.

**Chapter 3: Delphi-Health.** Trained on de-identified clinical data for targeted analysis.

## Who Built This

Thaddeus Dickson. CEO of Xpio Health, CISO, 20 years in healthcare cybersecurity and compliance.

## License

Apache 2.0.