InsureLLM-4B — Insurance Domain Language Model
Created by Bytical AI — AI agents that run insurance operations.
Model Description
InsureLLM-4B is a domain-specific language model fine-tuned for the UK and European insurance industry. Built on Qwen3-4B, it has been trained through a 3-stage pipeline:
- QLoRA Fine-tuning — 10,000 synthetic insurance SFT pairs covering claims, underwriting, regulation, pricing, and market structure
- DPO Alignment — 5,000 preference pairs teaching the model to prefer accurate, regulatory-compliant responses
- Real-World Data Fine-tuning — 3,685 SFT pairs from Wikipedia, UK legislation, HuggingFace insurance datasets, RSS feeds, and educational sources
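The DPO stage above optimizes a preference loss over chosen/rejected response pairs. As a minimal sketch of that objective (the `beta` value is illustrative, not taken from this card):

```python
import math

def dpo_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.1) -> float:
    """DPO loss for a single preference pair.

    chosen_logratio / rejected_logratio are log(pi_theta / pi_ref) for the
    chosen (accurate, compliant) and rejected responses; beta controls how far
    the policy may drift from the reference model.
    """
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))
```

The loss shrinks as the fine-tuned policy prefers the chosen response more strongly than the rejected one, which is how the 5,000 preference pairs steer the model toward regulatory-compliant answers.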
Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-4B |
| Method | QLoRA (4-bit NF4) → DPO → Real-World QLoRA |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 2e-4 (QLoRA), 5e-7 (DPO), 2e-4 (Real-World) |
| Epochs | 2 per stage |
| Sequence Length | 1024 |
| Batch Size | 2 (gradient accumulation 4) |
| Optimizer | AdamW (paged, 8-bit) |
| GPU | NVIDIA Tesla T4 16GB |
| Total Training Time | ~20 hours across 3 stages |
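To make the LoRA settings in the table concrete, the sketch below counts the extra trainable parameters a rank-64 adapter adds to one linear layer, and the `alpha / rank` scaling applied to the low-rank update. The 2560×2560 layer size is a hypothetical example, not a claim about Qwen3-4B's exact architecture:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int = 64) -> int:
    """Trainable parameters a LoRA adapter adds to one frozen linear layer:
    a down-projection A (rank x d_in) plus an up-projection B (d_out x rank)."""
    return rank * d_in + d_out * rank

lora_rank, lora_alpha = 64, 128
scaling = lora_alpha / lora_rank  # update is scaled by alpha / rank = 2.0

# Hypothetical 2560 x 2560 projection: ~6.55M frozen weights,
# only ~0.33M trainable LoRA weights.
frozen = 2560 * 2560
trainable = lora_trainable_params(2560, 2560, rank=lora_rank)
```

This is why QLoRA fits on a single 16 GB T4: the 4-bit NF4 base weights stay frozen and only the small adapter matrices are updated.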
Evaluation Results
Domain Knowledge (8-prompt rubric):
| Topic | Score |
|---|---|
| FCA Consumer Duty | 0.00 |
| GDPR Data Protection | 0.00 |
| Claims Process | 0.60 |
| Fraud Indicators | 0.25 |
| Lloyd's Market | 0.20 |
| Pricing Fairness | 0.25 |
| Subrogation | 0.50 |
| Renewal Transparency | 0.20 |
| Average | 0.25 |
Generation Quality:
| Metric | Score |
|---|---|
| ROUGE-1 | 0.384 |
| ROUGE-2 | 0.109 |
| ROUGE-L | 0.199 |
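For readers unfamiliar with the metric, ROUGE-1 is an F1 score over unigram overlap between the model output and a reference answer. A minimal sketch (whitespace tokenization, no stemming, unlike full ROUGE implementations):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (the core of ROUGE-1), without stemming."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped per-token overlap count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L to the longest common subsequence, which is why the three scores above decrease in that order.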
Intended Use
- Insurance domain question answering
- Claims process guidance
- Underwriting knowledge retrieval
- UK/EU regulatory compliance queries
- Insurance terminology explanation
- Part of a RAG pipeline for insurance operations
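In a RAG setup, retrieved passages are injected into the prompt so the model answers from grounded context rather than parametric memory alone. A hypothetical prompt-assembly helper (the retrieval step itself is not shown; in this suite that role would fall to InsureSearch):

```python
def build_rag_prompt(question: str, passages: list[str], max_passages: int = 3) -> str:
    """Assemble a grounded prompt from retrieved passages (illustrative format)."""
    context = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages[:max_passages], start=1)
    )
    return (
        "Answer the insurance question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string would then be passed through `tokenizer.apply_chat_template` as in the usage example below, with the context grounding the model's answer.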
Limitations
- At 4B parameters, the model may not reliably produce exact regulatory terminology; larger models tend to be more precise here
- Best used with RAG (retrieval-augmented generation) using the companion InsureSearch engine
- Trained primarily on UK insurance context; may be less accurate for other jurisdictions
- Not a substitute for professional insurance or legal advice
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("piyushptiwari/InsureLLM-4B")
tokenizer = AutoTokenizer.from_pretrained("piyushptiwari/InsureLLM-4B")

messages = [
    {"role": "user", "content": "Explain the subrogation process in UK motor insurance."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Inject empty thinking tags to prevent an infinite thinking loop
text += "<think>\n</think>\n"

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
Part of the INSUREOS Model Suite
This model is part of INSUREOS, a complete AI/ML suite for insurance operations built by Bytical AI:
| Model | Task | Metric |
|---|---|---|
| InsureLLM-4B (this model) | Insurance domain LLM | ROUGE-1: 0.384 |
| InsureDocClassifier | 12-class document classification | F1: 1.0 |
| InsureNER | 13-entity Named Entity Recognition | F1: 1.0 |
| InsureFraudNet | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
| InsurePricing | Insurance pricing (GLM + EBM) | MAE: £11,132 |
| InsureSearch | Hybrid search engine (Vector + BM25) | 33K docs indexed |
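Hybrid search engines like the one listed above combine sparse (BM25) and dense (vector) rankings. One common way to merge them is reciprocal rank fusion; this is a generic sketch of that technique, not a claim about InsureSearch's actual fusion method:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of doc IDs by summing 1 / (k + rank).

    k = 60 is the conventional damping constant; documents ranked highly
    by either retriever float to the top of the fused list.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that both the BM25 and vector retrievers rank first will outscore one that appears in only a single list.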
Citation
```bibtex
@misc{bytical2026insurellm,
  title={InsureLLM-4B: A Domain-Specific Language Model for UK Insurance},
  author={Bytical AI},
  year={2026},
  url={https://huggingface.co/piyushptiwari/InsureLLM-4B}
}
```
About Bytical AI
Bytical builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.