
# Fine-Tuning Process for Japan Insurance Product Design

## Overview

This document outlines the recommended approach for fine-tuning a language model with LoRA (Low-Rank Adaptation) for Japan-specific insurance product design.

## 1. Base Model Selection

### Recommended Models

- **For Japanese:**
  - Llama 3.1 8B/70B (multilingual, good Japanese support)
  - Qwen 2.5 (excellent Asian language performance)
  - Japanese-specific: rinna/japanese-gpt-neox-3.6b or cyberagent/open-calm-7b
- **For English with Japanese context:** Llama 3.1 or Mistral 7B
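
As a minimal sketch, any of these can be loaded in 4-bit ready for a LoRA adapter (assuming `transformers` and `bitsandbytes` are installed; Llama weights are gated on the Hugging Face Hub):

```python
# Minimal sketch: load a recommended base model in 4-bit for LoRA training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B"  # any model from the list above

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```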

## 2. Dataset Preparation

### Data Sources

#### A. Census/Demographics Data (from e-Stat)

- Population age distribution
- Income levels by region
- Household composition
- Employment statistics

#### B. Insurance Domain Data

- Existing insurance product documents
- Coverage details, exclusions, and premiums
- Target demographics for existing products

#### C. Synthetic Training Data

Create QA pairs in an instruction-tuning format:

```json
{
  "instruction": "Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M",
  "input": "Demographics: Tokyo, Age 30-45, Income ¥6M, Household size 3",
  "output": "Recommended product: Family Health Insurance with..."
}
```

## 3. LoRA Configuration

### Recommended Hyperparameters

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                    # Rank (start with 16, can go up to 64)
    lora_alpha=32,           # Scaling factor (typically 2x rank)
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
```

### Parameter Tuning Guide

- **r (rank):** Start with 16, increase to 32-64 for more capacity
- **lora_alpha:** Typically 2x the rank value
- **target_modules:** Focus on attention layers for efficiency
- **lora_dropout:** 0.05-0.1 for regularization
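
As a quick sanity check, the config above can be attached to a loaded base model with peft's `get_peft_model`; printing the trainable-parameter count confirms that only the adapter weights will be updated:

```python
from peft import get_peft_model

# `model` is the base model loaded earlier (e.g. as in Section 1's sketch).
peft_model = get_peft_model(model, lora_config)

# With r=16 on the attention projections, trainable parameters are typically
# well under 1% of the total.
peft_model.print_trainable_parameters()
```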

## 4. Training Framework

### Option 1: Unsloth (Recommended - Fastest)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
)
```

### Option 2: Axolotl (More Configurable)

```yaml
# config.yml
base_model: meta-llama/Llama-3.1-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj

datasets:
  - path: data/insurance_training.jsonl
    type: alpaca

sequence_len: 2048
micro_batch_size: 4
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
```
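
With the file saved as `config.yml`, training is typically launched through Axolotl's CLI:

```bash
accelerate launch -m axolotl.cli.train config.yml
```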

## 5. Data Processing Pipeline

### Step-by-Step Process

1. **Extract Census Data**

   ```bash
   python3 download_census_data.py --workers 100
   ```

2. **Convert to Structured Format**

   - Parse Excel/CSV files
   - Extract key demographics (age, income, location, household)
   - Create demographic profiles

3. **Combine with Insurance Documents**

   - Extract text from insurance PDFs
   - Create context-aware examples
   - Map demographics to product features

4. **Generate Training Pairs**

   - Use GPT-4/Claude to create synthetic examples
   - Format: Instruction → Input → Output
   - Include diverse scenarios

5. **Format for Training** (see the sketch after this list)

   - Convert to Alpaca or ShareGPT format
   - Split into train/validation sets (90/10)
   - Save as JSONL
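
A minimal sketch of step 5, assuming the demographic profiles from steps 2-4 are available as dicts (the field names here are illustrative):

```python
# Sketch: convert demographic profiles to Alpaca-format records and write a
# 90/10 train/validation split as JSONL.
import json
import random

def to_alpaca(profile: dict) -> dict:
    return {
        "instruction": "Based on the following demographic data, design an appropriate insurance product.",
        "input": (
            f"Location: {profile['location']}\n"
            f"Age Group: {profile['age_group']}\n"
            f"Average Income: {profile['income']}\n"
            f"Household Size: {profile['household_size']}"
        ),
        "output": profile["product_description"],  # e.g. generated with GPT-4/Claude in step 4
    }

records = [to_alpaca(p) for p in profiles]  # `profiles` comes from steps 2-4
random.seed(42)
random.shuffle(records)
split = int(len(records) * 0.9)

for path, rows in [("data/train.jsonl", records[:split]),
                   ("data/val.jsonl", records[split:])]:
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
```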

### Example Training Data Format

```json
{
  "instruction": "Based on the following demographic data, design an appropriate insurance product.",
  "input": "Location: Tokyo\nAge Group: 30-45\nAverage Income: ¥6,000,000\nHousehold Size: 3\nEmployment: Full-time",
  "output": "Product Recommendation: Comprehensive Family Health Insurance\n\nKey Features:\n- Coverage: ¥10M medical expenses\n- Premium: ¥15,000/month\n- Target: Young families with stable income\n- Benefits: Hospitalization, outpatient, dental\n- Exclusions: Pre-existing conditions (first year)\n\nRationale: This demographic shows stable income and family responsibility, making comprehensive health coverage with moderate premiums ideal."
}
```

## 6. Training Configuration

### Recommended Settings

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./insurance-lora",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    save_strategy="epoch",
    evaluation_strategy="epoch",  # renamed `eval_strategy` in recent transformers releases
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
)
```
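
These arguments plug into a standard Hugging Face training loop; a minimal sketch using trl's `SFTTrainer` (the dataset variables are assumptions, and keyword names vary slightly across trl versions):

```python
from trl import SFTTrainer

# `peft_model`, `train_ds`, and `eval_ds` are assumed to exist from earlier steps.
trainer = SFTTrainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
)
trainer.train()
trainer.save_model("./insurance-lora")
```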

## 7. Evaluation Strategy

### Quantitative Metrics

- **Perplexity:** Measure on held-out insurance product descriptions (see the sketch below)
- **BLEU/ROUGE:** Compare generated products to reference designs
- **Accuracy:** Classification of appropriate product types
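
Perplexity here is exp of the mean token-level cross-entropy over the held-out texts; a minimal sketch using the fine-tuned model and tokenizer from earlier sections:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts, device="cuda"):
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True).to(device)
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].numel()
        total_nll += out.loss.item() * n  # loss is the mean NLL per token
        total_tokens += n
    return math.exp(total_nll / total_tokens)
```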

### Qualitative Evaluation

- **Human Review:** Insurance experts evaluate product designs
- **Coherence:** Check logical consistency of recommendations
- **Domain Accuracy:** Verify compliance with insurance regulations

### Domain-Specific Tests

- Test on real demographic scenarios
- Validate premium calculations
- Check coverage appropriateness
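
One lightweight way to run these tests is to prompt the model with held-out demographic scenarios and check that the required sections appear in the output (the section names and `generate_fn` are illustrative):

```python
# Sketch: smoke-test generated product designs on a held-out scenario.
REQUIRED_SECTIONS = ["Coverage", "Premium", "Exclusions"]

def check_scenario(generate, scenario: str) -> list:
    """`generate` is any callable mapping a prompt string to model output."""
    prompt = ("Based on the following demographic data, design an "
              f"appropriate insurance product.\n{scenario}")
    output = generate(prompt)
    return [s for s in REQUIRED_SECTIONS if s not in output]

missing = check_scenario(
    generate_fn,
    "Location: Osaka\nAge Group: 50-65\nAverage Income: ¥5,000,000\nHousehold Size: 2",
)
if missing:
    print(f"Missing sections: {missing}")
```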

## 8. Deployment Considerations

### Model Serving

- Use vLLM or TGI for efficient inference (sketch below)
- Quantize to 4-bit for production (GPTQ/AWQ)
- Deploy on Modal, RunPod, or local GPU
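
A sketch of serving the base model with the trained adapter attached at request time via vLLM's LoRA support (the adapter path is an assumption):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="meta-llama/Llama-3.1-8B", enable_lora=True)
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(
    ["Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M"],
    params,
    lora_request=LoRARequest("insurance-lora", 1, "./insurance-lora"),
)
print(outputs[0].outputs[0].text)
```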

### Monitoring

- Track inference latency (sketch below)
- Monitor output quality
- Collect user feedback for continuous improvement
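
A minimal latency tracker can wrap whatever generate call serves requests (the logger name is an assumption):

```python
import logging
import time

logger = logging.getLogger("insurance-lora.serving")

def timed_generate(generate, prompt: str) -> str:
    """Wrap any prompt -> text callable and log wall-clock latency."""
    start = time.perf_counter()
    output = generate(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    logger.info("latency_ms=%.1f", latency_ms)
    return output
```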

## Next Steps

1. **Data Processing:** Create a script to convert census data to training format
2. **Training Pipeline:** Set up the Unsloth/Axolotl environment
3. **Synthetic Generation:** Use an LLM to create insurance examples from demographics
4. **Fine-tune:** Run LoRA training
5. **Evaluate:** Test on held-out scenarios
6. **Deploy:** Serve the model for product design queries