Upload folder using huggingface_hub

Files changed (7)

.gitattributes +1 -0
README.md +156 -0
chat_template.jinja +89 -0
config.json +86 -0
model.safetensors +3 -0
tokenizer.json +3 -0
tokenizer_config.json +30 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
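The added rule routes `tokenizer.json` through Git LFS, alongside the patterns already present. A minimal sketch of how such patterns select files, assuming `fnmatch` as a stand-in for Git's own attribute matching (the two are close for bare file names but not identical); `is_lfs_tracked` is a hypothetical helper, and only the patterns visible in this hunk are included:

```python
from fnmatch import fnmatch

# Patterns copied from the visible part of the .gitattributes hunk.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(filename: str) -> bool:
    """Approximate check: does a bare file name match any LFS pattern?"""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


for name in ["tokenizer.json", "tokenizer_config.json", "events.out.tfevents.123"]:
    print(name, is_lfs_tracked(name))
```

The new pattern has no wildcard, so it matches `tokenizer.json` but not `tokenizer_config.json`, which stays a regular Git blob.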

README.md ADDED

	@@ -0,0 +1,156 @@

+---
+language:
+- en
+license: apache-2.0
+tags:
+- insurance
+- uk-insurance
+- llm
+- qwen3
+- qlora
+- dpo
+- fine-tuned
+- text-generation
+- claims
+- underwriting
+- bytical
+library_name: transformers
+pipeline_tag: text-generation
+base_model: Qwen/Qwen3-4B
+datasets:
+- piyushptiwari/insureos-training-data
+model-index:
+- name: InsureLLM-4B
+  results:
+  - task:
+      type: text-generation
+      name: Insurance Domain QA
+    metrics:
+    - type: rouge1
+      value: 0.384
+      name: ROUGE-1
+    - type: rougeL
+      value: 0.199
+      name: ROUGE-L
+    - type: custom
+      value: 0.25
+      name: Domain Score (8-prompt rubric)
+---
+# InsureLLM-4B — Insurance Domain Language Model
+**Created by [Bytical AI](https://bytical.ai)** — AI agents that run insurance operations.
+## Model Description
+InsureLLM-4B is a domain-specific language model fine-tuned for the UK and European insurance industry. Built on Qwen3-4B, it has been trained through a 3-stage pipeline:
+1. **QLoRA Fine-tuning** — 10,000 synthetic insurance SFT pairs covering claims, underwriting, regulation, pricing, and market structure
+2. **DPO Alignment** — 5,000 preference pairs teaching the model to prefer accurate, regulatory-compliant responses
+3. **Real-World Data Fine-tuning** — 3,685 SFT pairs from Wikipedia, UK legislation, HuggingFace insurance datasets, RSS feeds, and educational sources
+### Training Details
+| Parameter | Value |
+|-----------|-------|
+| Base Model | Qwen/Qwen3-4B |
+| Method | QLoRA (4-bit NF4) → DPO → Real-World QLoRA |
+| LoRA Rank | 64 |
+| LoRA Alpha | 128 |
+| Learning Rate | 2e-4 (QLoRA), 5e-7 (DPO), 2e-4 (Real-World) |
+| Epochs | 2 per stage |
+| Sequence Length | 1024 |
+| Batch Size | 2 (gradient accumulation 4) |
+| Optimizer | AdamW (paged, 8-bit) |
+| GPU | NVIDIA Tesla T4 16GB |
+| Total Training Time | ~20 hours across 3 stages |
+### Evaluation Results
+**Domain Knowledge (8-prompt rubric):**
+| Topic | Score |
+|-------|-------|
+| FCA Consumer Duty | 0.00 |
+| GDPR Data Protection | 0.00 |
+| Claims Process | 0.60 |
+| Fraud Indicators | 0.25 |
+| Lloyd's Market | 0.20 |
+| Pricing Fairness | 0.25 |
+| Subrogation | 0.50 |
+| Renewal Transparency | 0.20 |
+| **Average** | **0.25** |
+**Generation Quality:**
+| Metric | Score |
+|--------|-------|
+| ROUGE-1 | 0.384 |
+| ROUGE-2 | 0.109 |
+| ROUGE-L | 0.199 |
+### Intended Use
+- Insurance domain question answering
+- Claims process guidance
+- Underwriting knowledge retrieval
+- UK/EU regulatory compliance queries
+- Insurance terminology explanation
+- Part of a RAG pipeline for insurance operations
+### Limitations
+- At 4B parameters, the model may not reliably produce exact regulatory terminology; it scored 0.00 on the FCA Consumer Duty and GDPR prompts in the rubric above
+- Best used with RAG (retrieval-augmented generation) using the companion [InsureSearch engine](https://huggingface.co/piyushptiwari/insureos-search-engine)
+- Trained primarily on UK insurance context; may be less accurate for other jurisdictions
+- Not a substitute for professional insurance or legal advice
+## How to Use
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
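+# The checkpoint stores bitsandbytes 4-bit (NF4) weights, so the bitsandbytes package must be installed
+model = AutoModelForCausalLM.from_pretrained("piyushptiwari/InsureLLM-4B")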
+tokenizer = AutoTokenizer.from_pretrained("piyushptiwari/InsureLLM-4B")
+messages = [
+    {"role": "user", "content": "Explain the subrogation process in UK motor insurance."}
+]
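+# enable_thinking=False makes the chat template append an empty <think></think> block,
+# which prevents the infinite thinking loop
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+# Without do_sample=True, generate() decodes greedily and ignores temperature and top_p
+outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)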
+response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
+print(response)
+```
+## Part of the INSUREOS Model Suite
+This model is part of **INSUREOS**, a complete AI/ML suite for insurance operations built by Bytical AI:
+| Model | Task | Metric |
+|-------|------|--------|
+| **InsureLLM-4B** (this model) | Insurance domain LLM | ROUGE-1: 0.384 |
+| [InsureDocClassifier](https://huggingface.co/piyushptiwari/InsureDocClassifier) | 12-class document classification | F1: 1.0 |
+| [InsureNER](https://huggingface.co/piyushptiwari/InsureNER) | 13-entity Named Entity Recognition | F1: 1.0 |
+| [InsureFraudNet](https://huggingface.co/piyushptiwari/InsureFraudNet) | Fraud detection (Motor/Property/Liability) | AUC-ROC: 1.0 |
+| [InsurePricing](https://huggingface.co/piyushptiwari/InsurePricing) | Insurance pricing (GLM + EBM) | MAE: £11,132 |
+| [InsureSearch](https://huggingface.co/piyushptiwari/insureos-search-engine) | Hybrid search engine (Vector + BM25) | 33K docs indexed |
+## Citation
+```bibtex
+@misc{bytical2026insurellm,
+  title={InsureLLM-4B: A Domain-Specific Language Model for UK Insurance},
+  author={Bytical AI},
+  year={2026},
+  url={https://huggingface.co/piyushptiwari/InsureLLM-4B}
+}
+```
+## About Bytical AI
+[Bytical](https://bytical.ai) builds AI agents that run insurance operations — claims automation, underwriting intelligence, digital sales, and core system modernization for insurers across the UK and Europe. Microsoft AI Partner | NVIDIA | Salesforce.
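The README's summary figures can be re-derived from its own tables. The sketch below recomputes the rubric average and the effective batch size. It assumes a single GPU (the card lists one Tesla T4), and it assumes all 10,000 stage-1 pairs are used for training without sequence packing or a held-out split; the README confirms neither, so the step count is only an estimate.

```python
import math

# Per-topic scores copied from the "Domain Knowledge" table.
domain_scores = {
    "FCA Consumer Duty": 0.00,
    "GDPR Data Protection": 0.00,
    "Claims Process": 0.60,
    "Fraud Indicators": 0.25,
    "Lloyd's Market": 0.20,
    "Pricing Fairness": 0.25,
    "Subrogation": 0.50,
    "Renewal Transparency": 0.20,
}
average = sum(domain_scores.values()) / len(domain_scores)
print(f"Average domain score: {average:.2f}")

# Values from the "Training Details" table.
per_device_batch = 2
gradient_accumulation = 4
num_gpus = 1  # assumption: single-GPU training
effective_batch = per_device_batch * gradient_accumulation * num_gpus
print(f"Effective batch size: {effective_batch}")

# Estimated optimizer steps for stage 1 (assumes no packing, no eval split).
sft_pairs = 10_000
epochs = 2
steps_per_epoch = math.ceil(sft_pairs / effective_batch)
total_steps = steps_per_epoch * epochs
print(f"Stage 1 optimizer steps: {total_steps}")
```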

chat_template.jinja ADDED

	@@ -0,0 +1,89 @@

+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+        {%- set ns.multi_step_tool = false %}
+        {%- set ns.last_query_index = index %}
+    {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+    {%- if message.content is string %}
+        {%- set content = message.content %}
+    {%- else %}
+        {%- set content = '' %}
+    {%- endif %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is string %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in content %}
+                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if loop.index0 > ns.last_query_index %}
+            {%- if loop.last or (not loop.last and reasoning_content) %}
+                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+            {%- else %}
+                {{- '<|im_start|>' + message.role + '\n' + content }}
+            {%- endif %}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined and enable_thinking is false %}
+        {{- '<think>\n\n</think>\n\n' }}
+    {%- endif %}
+{%- endif %}
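For plain text conversations, the template reduces to the ChatML layout. The sketch below mirrors only that simple path: no tools, a conversation that ends with a user message, and assistant history without `<think>` blocks. `render_chatml` and `split_reasoning` are hypothetical helper names, not part of any library; `split_reasoning` reproduces the string operations the template uses to separate reasoning from the answer.

```python
def render_chatml(messages, add_generation_prompt=True, enable_thinking=True):
    """Render text-only messages the way the template's simple path does."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        if not enable_thinking:
            # The template emits an empty think block when thinking is disabled.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


def split_reasoning(content):
    """Split assistant text into (reasoning, answer) as the template does."""
    if "</think>" not in content:
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


prompt = render_chatml(
    [{"role": "user", "content": "Explain subrogation."}],
    enable_thinking=False,
)
print(prompt)
print(split_reasoning("<think>\nrecall the definition\n</think>\n\nSubrogation is ..."))
```

In practice `tokenizer.apply_chat_template` should be preferred, since it also handles tool calls and multi-step reasoning history.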

config.json ADDED

	@@ -0,0 +1,86 @@

+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": null,
+  "dtype": "bfloat16",
+  "eos_token_id": 151645,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 2560,
+  "initializer_range": 0.02,
+  "intermediate_size": 9728,
+  "layer_types": [
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 40960,
+  "max_window_layers": 36,
+  "model_type": "qwen3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 36,
+  "num_key_value_heads": 8,
+  "pad_token_id": 151643,
+  "quantization_config": {
+    "_load_in_4bit": true,
+    "_load_in_8bit": false,
+    "bnb_4bit_compute_dtype": "bfloat16",
+    "bnb_4bit_quant_storage": "uint8",
+    "bnb_4bit_quant_type": "nf4",
+    "bnb_4bit_use_double_quant": true,
+    "llm_int8_enable_fp32_cpu_offload": false,
+    "llm_int8_has_fp16_weight": false,
+    "llm_int8_skip_modules": null,
+    "llm_int8_threshold": 6.0,
+    "load_in_4bit": true,
+    "load_in_8bit": false,
+    "quant_method": "bitsandbytes"
+  },
+  "rms_norm_eps": 1e-06,
+  "rope_parameters": {
+    "rope_theta": 1000000,
+    "rope_type": "default"
+  },
+  "sliding_window": null,
+  "tie_word_embeddings": true,
+  "transformers_version": "5.4.0",
+  "use_cache": false,
+  "use_sliding_window": false,
+  "vocab_size": 151936
+}
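A few derived quantities follow directly from these fields. The sketch below computes the attention projection widths, the grouped-query ratio, and a rough KV-cache footprint. The cache estimate assumes caching is enabled at generation time and that keys and values are held in bfloat16 (2 bytes each); the config itself sets `use_cache` to false and says nothing about the cache dtype.

```python
# Values copied from config.json.
hidden_size = 2560
head_dim = 128
num_attention_heads = 32
num_key_value_heads = 8
num_hidden_layers = 36

# Projection widths: the query width is heads * head_dim, not hidden_size.
query_width = num_attention_heads * head_dim
kv_width = num_key_value_heads * head_dim
queries_per_kv_head = num_attention_heads // num_key_value_heads
print(f"query width {query_width}, key/value width {kv_width}")
print(f"{queries_per_kv_head} query heads share each key/value head")

# KV cache: one key and one value vector per layer per token.
bytes_per_value = 2  # assumption: bfloat16 cache
kv_bytes_per_token = 2 * num_hidden_layers * kv_width * bytes_per_value
tokens = 1024  # the training sequence length listed in the README
print(f"{kv_bytes_per_token} bytes per token")
print(f"{kv_bytes_per_token * tokens / 2**20:.0f} MiB for {tokens} tokens")
```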

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3a0c5dec9cad86bd79972a0ca756142dbb5d97acd37992a74d21a01d9ca4f61e
+size 3431046484
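The three lines above are a Git LFS pointer: the repository stores only the hash and size, and the weights live in LFS storage. A minimal sketch of reading such a pointer and checking a downloaded file against it; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for illustration, not part of `huggingface_hub` or Git LFS.

```python
import hashlib
import os

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3a0c5dec9cad86bd79972a0ca756142dbb5d97acd37992a74d21a01d9ca4f61e
size 3431046484
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algorithm"], pointer["size"], f"{pointer['size'] / 1e9:.2f} GB")
```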

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
+size 11422650

tokenizer_config.json ADDED

	@@ -0,0 +1,30 @@

+{
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "is_local": true,
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "padding_side": "left",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
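The `padding_side` of `left` matters for batched generation with a decoder-only model: padding goes in front so that every prompt ends at the last position, directly before the first generated token. A pure-Python sketch of what left padding produces, assuming the pad token id 151643 from `config.json`; `left_pad` is a hypothetical helper and the short id lists are made-up placeholders, not real tokenizer output.

```python
PAD_TOKEN_ID = 151643  # pad_token_id from config.json ("<|endoftext|>")


def left_pad(batch, pad_id=PAD_TOKEN_ID):
    """Left-pad lists of token ids to equal length and build attention masks."""
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


# Placeholder token ids for two prompts of different length.
input_ids, attention_mask = left_pad([[11, 12, 13], [21]])
print(input_ids)
print(attention_mask)
```

With a real tokenizer the same layout comes from calling it with `padding=True` on a list of prompts; the attention mask tells the model to ignore the padded positions.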