
# Fine-Tuning Process for Japan Insurance Product Design

## Overview

This document outlines the recommended approach for fine-tuning a language model with LoRA (Low-Rank Adaptation) for Japan-specific insurance product design.

## 1. Base Model Selection

### Recommended Models

- **For Japanese:**
  - Llama 3.1 8B/70B (multilingual, good Japanese support)
  - Qwen 2.5 (excellent Asian language performance)
  - Japanese-specific: rinna/japanese-gpt-neox-3.6b or cyberagent/open-calm-7b
- **For English with Japanese context:** Llama 3.1 or Mistral 7B
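
As a minimal sketch, any of these can be loaded in 4-bit ready for a LoRA adapter (assuming `transformers` and `bitsandbytes` are installed; Llama weights are gated on the Hugging Face Hub):

```python
# Minimal sketch: load a recommended base model in 4-bit for LoRA training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B"  # any model from the list above

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```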

## 2. Dataset Preparation

### Data Sources

#### A. Census/Demographics Data (from e-Stat)

- Population age distribution
- Income levels by region
- Household composition
- Employment statistics

#### B. Insurance Domain Data

- Existing insurance product documents
- Coverage details, exclusions, and premiums
- Target demographics for existing products

#### C. Synthetic Training Data

Create QA pairs in an instruction-tuning format:

```json
{
  "instruction": "Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M",
  "input": "Demographics: Tokyo, Age 30-45, Income ¥6M, Household size 3",
  "output": "Recommended product: Family Health Insurance with..."
}
```

## 3. LoRA Configuration

### Recommended Hyperparameters

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                    # Rank (start with 16, can go up to 64)
    lora_alpha=32,           # Scaling factor (typically 2x rank)
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
```

### Parameter Tuning Guide

- **r (rank):** Start with 16, increase to 32-64 for more capacity
- **lora_alpha:** Typically 2x the rank value
- **target_modules:** Focus on attention layers for efficiency
- **lora_dropout:** 0.05-0.1 for regularization
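
As a quick sanity check, the config above can be attached to a loaded base model with peft's `get_peft_model`; printing the trainable-parameter count confirms that only the adapter weights will be updated:

```python
from peft import get_peft_model

# `model` is the base model loaded earlier (e.g. as in Section 1's sketch).
peft_model = get_peft_model(model, lora_config)

# With r=16 on the attention projections, trainable parameters are typically
# well under 1% of the total.
peft_model.print_trainable_parameters()
```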

## 4. Training Framework

### Option 1: Unsloth (Recommended - Fastest)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
)
```

### Option 2: Axolotl (More Configurable)

```yaml
# config.yml
base_model: meta-llama/Llama-3.1-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj

datasets:
  - path: data/insurance_training.jsonl
    type: alpaca

sequence_len: 2048
micro_batch_size: 4
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
```
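
With the file saved as `config.yml`, training is typically launched through Axolotl's CLI:

```bash
accelerate launch -m axolotl.cli.train config.yml
```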

## 5. Data Processing Pipeline

### Step-by-Step Process

1. **Extract Census Data**

   ```bash
   python3 download_census_data.py --workers 100
   ```

2. **Convert to Structured Format**

   - Parse Excel/CSV files
   - Extract key demographics (age, income, location, household)
   - Create demographic profiles

3. **Combine with Insurance Documents**

   - Extract text from insurance PDFs
   - Create context-aware examples
   - Map demographics to product features

4. **Generate Training Pairs**

   - Use GPT-4/Claude to create synthetic examples
   - Format: Instruction → Input → Output
   - Include diverse scenarios

5. **Format for Training** (see the sketch after this list)

   - Convert to Alpaca or ShareGPT format
   - Split into train/validation sets (90/10)
   - Save as JSONL
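
A minimal sketch of step 5, assuming the demographic profiles from steps 2-4 are available as dicts (the field names here are illustrative):

```python
# Sketch: convert demographic profiles to Alpaca-format records and write a
# 90/10 train/validation split as JSONL.
import json
import random

def to_alpaca(profile: dict) -> dict:
    return {
        "instruction": "Based on the following demographic data, design an appropriate insurance product.",
        "input": (
            f"Location: {profile['location']}\n"
            f"Age Group: {profile['age_group']}\n"
            f"Average Income: {profile['income']}\n"
            f"Household Size: {profile['household_size']}"
        ),
        "output": profile["product_description"],  # e.g. generated with GPT-4/Claude in step 4
    }

records = [to_alpaca(p) for p in profiles]  # `profiles` comes from steps 2-4
random.seed(42)
random.shuffle(records)
split = int(len(records) * 0.9)

for path, rows in [("data/train.jsonl", records[:split]),
                   ("data/val.jsonl", records[split:])]:
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
```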

### Example Training Data Format

```json
{
  "instruction": "Based on the following demographic data, design an appropriate insurance product.",
  "input": "Location: Tokyo\nAge Group: 30-45\nAverage Income: ¥6,000,000\nHousehold Size: 3\nEmployment: Full-time",
  "output": "Product Recommendation: Comprehensive Family Health Insurance\n\nKey Features:\n- Coverage: ¥10M medical expenses\n- Premium: ¥15,000/month\n- Target: Young families with stable income\n- Benefits: Hospitalization, outpatient, dental\n- Exclusions: Pre-existing conditions (first year)\n\nRationale: This demographic shows stable income and family responsibility, making comprehensive health coverage with moderate premiums ideal."
}
```

## 6. Training Configuration

### Recommended Settings

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./insurance-lora",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    save_strategy="epoch",
    evaluation_strategy="epoch",  # renamed `eval_strategy` in recent transformers releases
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
)
```
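
These arguments plug into a standard Hugging Face training loop; a minimal sketch using trl's `SFTTrainer` (the dataset variables are assumptions, and keyword names vary slightly across trl versions):

```python
from trl import SFTTrainer

# `peft_model`, `train_ds`, and `eval_ds` are assumed to exist from earlier steps.
trainer = SFTTrainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
)
trainer.train()
trainer.save_model("./insurance-lora")
```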

## 7. Evaluation Strategy

### Quantitative Metrics

- **Perplexity:** Measure on held-out insurance product descriptions (see the sketch below)
- **BLEU/ROUGE:** Compare generated products to reference designs
- **Accuracy:** Classification of appropriate product types
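
Perplexity here is exp of the mean token-level cross-entropy over the held-out texts; a minimal sketch using the fine-tuned model and tokenizer from earlier sections:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts, device="cuda"):
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True).to(device)
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].numel()
        total_nll += out.loss.item() * n  # loss is the mean NLL per token
        total_tokens += n
    return math.exp(total_nll / total_tokens)
```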

### Qualitative Evaluation

- **Human Review:** Insurance experts evaluate product designs
- **Coherence:** Check logical consistency of recommendations
- **Domain Accuracy:** Verify compliance with insurance regulations

### Domain-Specific Tests

- Test on real demographic scenarios
- Validate premium calculations
- Check coverage appropriateness
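
One lightweight way to run these tests is to prompt the model with held-out demographic scenarios and check that the required sections appear in the output (the section names and `generate_fn` are illustrative):

```python
# Sketch: smoke-test generated product designs on a held-out scenario.
REQUIRED_SECTIONS = ["Coverage", "Premium", "Exclusions"]

def check_scenario(generate, scenario: str) -> list:
    """`generate` is any callable mapping a prompt string to model output."""
    prompt = ("Based on the following demographic data, design an "
              f"appropriate insurance product.\n{scenario}")
    output = generate(prompt)
    return [s for s in REQUIRED_SECTIONS if s not in output]

missing = check_scenario(
    generate_fn,
    "Location: Osaka\nAge Group: 50-65\nAverage Income: ¥5,000,000\nHousehold Size: 2",
)
if missing:
    print(f"Missing sections: {missing}")
```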

## 8. Deployment Considerations

### Model Serving

- Use vLLM or TGI for efficient inference (sketch below)
- Quantize to 4-bit for production (GPTQ/AWQ)
- Deploy on Modal, RunPod, or local GPU
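
A sketch of serving the base model with the trained adapter attached at request time via vLLM's LoRA support (the adapter path is an assumption):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="meta-llama/Llama-3.1-8B", enable_lora=True)
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(
    ["Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M"],
    params,
    lora_request=LoRARequest("insurance-lora", 1, "./insurance-lora"),
)
print(outputs[0].outputs[0].text)
```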

### Monitoring

- Track inference latency (sketch below)
- Monitor output quality
- Collect user feedback for continuous improvement
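
A minimal latency tracker can wrap whatever generate call serves requests (the logger name is an assumption):

```python
import logging
import time

logger = logging.getLogger("insurance-lora.serving")

def timed_generate(generate, prompt: str) -> str:
    """Wrap any prompt -> text callable and log wall-clock latency."""
    start = time.perf_counter()
    output = generate(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    logger.info("latency_ms=%.1f", latency_ms)
    return output
```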

## Next Steps

1. **Data Processing:** Create a script to convert census data to training format
2. **Training Pipeline:** Set up the Unsloth/Axolotl environment
3. **Synthetic Generation:** Use an LLM to create insurance examples from demographics
4. **Fine-tune:** Run LoRA training
5. **Evaluate:** Test on held-out scenarios
6. **Deploy:** Serve the model for product design queries