# Fine-Tuning Process for Japan Insurance Product Design
## Overview
This document outlines the recommended approach for fine-tuning a language model using LoRA for Japan-specific insurance product design.
## 1. Base Model Selection
### Recommended Models
- **For Japanese**:
  - **Llama 3.1 8B/70B** (multilingual; Japanese is not on Meta's list of officially supported languages, so verify output quality on your own data)
  - **Qwen 2.5** (excellent Asian language performance)
  - **Japanese-specific**: `rinna/japanese-gpt-neox-3.6b` or `cyberagent/open-calm-7b`
- **For English with Japanese context**: Llama 3.1 or Mistral 7B
## 2. Dataset Preparation
### Data Sources
#### A. Census/Demographics Data (from e-Stat)
- Population age distribution
- Income levels by region
- Household composition
- Employment statistics
#### B. Insurance Domain Data
- Existing insurance product documents
- Coverage details, exclusions, premiums
- Target demographics for existing products
#### C. Synthetic Training Data
Create QA pairs or instruction-tuning format:
```json
{
  "instruction": "Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M",
  "input": "Demographics: Tokyo, Age 30-45, Income ¥6M, Household size 3",
  "output": "Recommended product: Family Health Insurance with..."
}
```
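Synthetic records are easy to get subtly wrong (a missing key, a non-string value), so it can help to check each line before training. The sketch below is one possible validator for the three-field format shown above; the function name `validate_record` and the rule that `input` may be empty are assumptions for illustration, not part of any framework.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def validate_record(line):
    """Parse one JSONL line and return a list of problems (empty if usable)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str):
            problems.append(f"missing or non-string field: {key}")
        elif key != "input" and not value.strip():
            # Assumption: "input" may be empty, but instruction and output may not.
            problems.append(f"empty field: {key}")
    return problems


good = '{"instruction": "Design a product", "input": "", "output": "Recommended product: ..."}'
bad = '{"instruction": "Design a product", "input": "Tokyo"}'
print(validate_record(good))
print(validate_record(bad))
```

Running a check like this over the whole file before training is cheaper than discovering a malformed record partway through an epoch.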
## 3. LoRA Configuration
### Recommended Hyperparameters
```python
from peft import LoraConfig
lora_config = LoraConfig(
    r=16,               # Rank (start with 16, can go up to 64)
    lora_alpha=32,      # Scaling factor (typically 2x rank)
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],  # Attention projections
    lora_dropout=0.05,  # Dropout on the adapter path for regularization
    bias="none",        # Leave bias terms frozen
    task_type="CAUSAL_LM",
)
```
### Parameter Tuning Guide
- **r (rank)**: Start with 16, increase to 32-64 for more capacity
- **lora_alpha**: Typically 2x the rank value
- **target_modules**: Focus on attention layers for efficiency
- **lora_dropout**: 0.05-0.1 for regularization
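To get a feel for what raising `r` costs, the trainable parameter count can be worked out by hand: each adapted weight matrix with input width `d_in` and output width `d_out` gains two low-rank factors, adding `r * (d_in + d_out)` parameters, and the update is scaled by `lora_alpha / r`. The sketch below assumes Llama 3.1 8B shapes (hidden size 4096, 32 layers, 1024-wide key/value projections); these dimensions are not stated in this guide, so read them from the config of whichever model you choose.

```python
# Assumed (d_in, d_out) shapes for one Llama 3.1 8B decoder layer.
# These are illustrative assumptions; check the model's config.json.
ASSUMED_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
ASSUMED_NUM_LAYERS = 32


def lora_trainable_params(r, shapes=ASSUMED_SHAPES, num_layers=ASSUMED_NUM_LAYERS):
    """Count LoRA parameters: each module adds A (d_in x r) and B (r x d_out)."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


def lora_scaling(lora_alpha, r):
    """Factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


for rank in (16, 32, 64):
    # Keep lora_alpha at 2x rank, as the guide suggests.
    print(rank, lora_trainable_params(rank), lora_scaling(2 * rank, rank))
```

Under these assumptions, `r=16` gives about 13.6 million trainable parameters, and the count grows linearly with rank. Keeping `lora_alpha` at twice the rank holds the scaling factor at 2.0 as `r` changes.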
## 4. Training Framework
### Option 1: Unsloth (Recommended - Fastest)
```python
from unsloth import FastLanguageModel
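
# Load a pre-quantized 4-bit base model to reduce GPU memory use
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,          # None lets Unsloth auto-detect the dtype for the GPU
    load_in_4bit = True,
)

# Attach LoRA adapters to the attention projections
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha = 16,       # Use 32 to follow the 2x-rank guideline in section 3
    lora_dropout = 0,      # 0 is the value Unsloth optimizes for
    bias = "none",
)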
```
### Option 2: Axolotl (More Configurable)
```yaml
# config.yml
base_model: meta-llama/Llama-3.1-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj
datasets:
  - path: data/insurance_training.jsonl
    type: alpaca
sequence_len: 2048
micro_batch_size: 4
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
```
## 5. Data Processing Pipeline
### Step-by-Step Process
1. **Extract Census Data**
   ```bash
   python3 download_census_data.py --workers 100
   ```
2. **Convert to Structured Format**
   - Parse Excel/CSV files
   - Extract key demographics (age, income, location, household)
   - Create demographic profiles
3. **Combine with Insurance Documents**
   - Extract text from insurance PDFs
   - Create context-aware examples
   - Map demographics to product features
4. **Generate Training Pairs**
   - Use GPT-4/Claude to create synthetic examples
   - Format: Instruction → Input → Output
   - Include diverse scenarios
5. **Format for Training**
   - Convert to Alpaca or ShareGPT format
   - Split into train/validation sets (90/10)
   - Save as JSONL
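
The last step can be sketched with the standard library alone. The helpers below (`profile_to_record`, `split_records`, `write_jsonl`) are hypothetical names, and the profiles and the stub output string are placeholder data; the real pipeline would feed in profiles from the census step and outputs from the synthetic-generation step.

```python
import json
import random
from pathlib import Path

INSTRUCTION = "Based on the following demographic data, design an appropriate insurance product."


def profile_to_record(profile, output_text):
    """Turn one demographic profile (a dict) into an Alpaca-style record."""
    input_text = "\n".join(f"{key}: {value}" for key, value in profile.items())
    return {"instruction": INSTRUCTION, "input": input_text, "output": output_text}


def split_records(records, val_fraction=0.1, seed=42):
    """Shuffle deterministically, then hold out val_fraction for validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def write_jsonl(records, path):
    """Write one JSON object per line; ensure_ascii=False keeps ¥ and kana readable."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


# Placeholder data: 20 hypothetical profiles with a stub output.
profiles = [
    {"Location": "Tokyo", "Age Group": "30-45", "Household Size": size}
    for size in (1, 2, 3, 4)
    for _ in range(5)
]
records = [profile_to_record(p, "Product Recommendation: ...") for p in profiles]
train_records, val_records = split_records(records)
print(len(train_records), len(val_records))
```

With 20 records this yields 18 for training and 2 for validation; calling `write_jsonl(train_records, "train.jsonl")` would then produce a file in the shape the Axolotl `alpaca` dataset type expects. Fixing the seed keeps the split reproducible across runs.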
### Example Training Data Format
```json
{
  "instruction": "Based on the following demographic data, design an appropriate insurance product.",
  "input": "Location: Tokyo\nAge Group: 30-45\nAverage Income: ¥6,000,000\nHousehold Size: 3\nEmployment: Full-time",
  "output": "Product Recommendation: Comprehensive Family Health Insurance\n\nKey Features:\n- Coverage: ¥10M medical expenses\n- Premium: ¥15,000/month\n- Target: Young families with stable income\n- Benefits: Hospitalization, outpatient, dental\n- Exclusions: Pre-existing conditions (first year)\n\nRationale: This demographic shows stable income and family responsibility, making comprehensive health coverage with moderate premiums ideal."
}
```
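Before tokenization, a record in this format is typically rendered into a single prompt string. The sketch below uses the widely circulated Alpaca-style template; training frameworks usually apply their own template for this step, so treat the exact wording, the `render_example` helper, and the `</s>` end-of-sequence placeholder as assumptions rather than what Unsloth or Axolotl does internally.

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)


def render_example(record, eos_token=""):
    """Render one record into training text, ending with the EOS token."""
    return ALPACA_TEMPLATE.format(**record) + eos_token


example = {
    "instruction": "Based on the following demographic data, design an appropriate insurance product.",
    "input": "Location: Tokyo\nAge Group: 30-45",
    "output": "Product Recommendation: Comprehensive Family Health Insurance",
}
# "</s>" is a placeholder; in practice pass tokenizer.eos_token for your model.
print(render_example(example, eos_token="</s>"))
```

Appending the end-of-sequence token matters: without it the fine-tuned model has no signal for where a product recommendation should stop.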
## 6. Training Configuration
### Recommended Settings
```python
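from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./insurance-lora",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # Effective batch size: 4 x 4 = 16 per device
    learning_rate=2e-4,
    fp16=True,                      # Prefer bf16=True on GPUs that support bfloat16
    logging_steps=10,
    save_strategy="epoch",
    evaluation_strategy="epoch",    # Named eval_strategy in newer transformers releases
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
)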
```
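It is worth translating these settings into step counts before launching a run, since the number of optimizer steps determines how long warmup lasts and how often logs appear. The sketch below is an approximation (the exact rounding inside the trainer can differ slightly between library versions), and the dataset size of 4,500 examples is a hypothetical figure.

```python
import math


def training_schedule(num_examples, per_device_batch=4, grad_accum=4,
                      num_gpus=1, epochs=3, warmup_ratio=0.1):
    """Approximate optimizer-step counts implied by the training settings."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Hypothetical dataset: 4,500 training examples on a single GPU.
schedule = training_schedule(4500)
print(schedule)
```

For this hypothetical dataset the settings work out to an effective batch of 16, roughly 282 steps per epoch, 846 steps in total, and 85 warmup steps. With `logging_steps=10`, that is about 84 log lines over the run.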
## 7. Evaluation Strategy
### Quantitative Metrics
- **Perplexity**: Measure on held-out insurance product descriptions
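- **BLEU/ROUGE**: Compare generated products to reference designs; for Japanese text, segment with a morphological analyzer such as MeCab before scoring, since words are not separated by spaces
- **Accuracy**: Share of test scenarios in which the model selects an appropriate product type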
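
Perplexity is the exponential of the mean negative log-likelihood per token, so it can be computed from per-token log-probabilities however they are obtained. The sketch below assumes natural-log probabilities have already been collected from the model on held-out text; the sample values are made up to show the arithmetic.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood over all scored tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Made-up example: every token was assigned probability 0.25.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))
```

A model that assigns probability 0.25 to every token has a perplexity of about 4, which matches the intuition of choosing among four equally likely options. Comparing the base model and the fine-tuned model on the same held-out documents shows whether the adapter actually learned the domain.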
### Qualitative Evaluation
- **Human Review**: Insurance experts evaluate product designs
- **Coherence**: Check logical consistency of recommendations
- **Domain Accuracy**: Verify compliance with insurance regulations
### Domain-Specific Tests
- Test on real demographic scenarios
- Validate premium calculations
- Check coverage appropriateness
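
One way to automate part of the premium check is to parse the premium out of the generated text and compare it with the household income in the prompt. The sketch below assumes outputs follow the `Premium: ¥15,000/month` pattern from the example training record; the 5% affordability threshold is a hypothetical placeholder, not an actuarial or regulatory rule.

```python
import re

PREMIUM_PATTERN = re.compile(r"Premium:\s*¥([\d,]+)/month")

# Hypothetical threshold for illustration only.
MAX_INCOME_SHARE = 0.05


def monthly_premium_yen(output_text):
    """Extract the monthly premium in yen from generated text, or None if absent."""
    match = PREMIUM_PATTERN.search(output_text)
    return int(match.group(1).replace(",", "")) if match else None


def premium_income_share(output_text, annual_income_yen):
    """Annualized premium as a fraction of annual household income."""
    premium = monthly_premium_yen(output_text)
    if premium is None:
        return None
    return premium * 12 / annual_income_yen


generated = "Key Features:\n- Coverage: ¥10M medical expenses\n- Premium: ¥15,000/month"
share = premium_income_share(generated, 6_000_000)
passes = share is not None and share <= MAX_INCOME_SHARE
print(share, passes)
```

For the example record, ¥15,000 per month against ¥6,000,000 of annual income is 3% of income. Outputs where the premium cannot be parsed return `None`, which is itself a useful signal that the model drifted from the expected format.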
## 8. Deployment Considerations
### Model Serving
- Use vLLM or TGI for efficient inference
- Quantize to 4-bit for production (GPTQ/AWQ)
- Deploy on Modal, RunPod, or local GPU
### Monitoring
- Track inference latency
- Monitor output quality
- Collect user feedback for continuous improvement
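
Latency tracking can start as a thin wrapper around whatever function calls the model. The sketch below is a minimal in-process tracker; `LatencyTracker` and `fake_generate` are hypothetical names, and a production deployment would more likely export these numbers to an existing metrics system.

```python
import statistics
import time


class LatencyTracker:
    """Collect wall-clock latencies and summarize them as percentiles."""

    def __init__(self):
        self.samples_ms = []

    def timed(self, fn, *args, **kwargs):
        """Call fn, record how long it took in milliseconds, return its result."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.samples_ms.append((time.perf_counter() - start) * 1000)
        return result

    def summary(self):
        """Return count, median, and 95th percentile (needs at least 2 samples)."""
        cuts = statistics.quantiles(self.samples_ms, n=100, method="inclusive")
        return {
            "count": len(self.samples_ms),
            "p50_ms": statistics.median(self.samples_ms),
            "p95_ms": cuts[94],
        }


def fake_generate(prompt):
    # Stand-in for a real inference call to the served model.
    return prompt.upper()


tracker = LatencyTracker()
for prompt in ["tokyo", "osaka", "nagoya", "sapporo", "fukuoka"]:
    tracker.timed(fake_generate, prompt)
print(tracker.summary())
```

Reporting the 95th percentile alongside the median matters for generation workloads, where a few long outputs can dominate user-perceived latency.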
## Next Steps
1. **Data Processing**: Create script to convert census data to training format
2. **Training Pipeline**: Set up Unsloth/Axolotl environment
3. **Synthetic Generation**: Use LLM to create insurance examples from demographics
4. **Fine-tune**: Run LoRA training
5. **Evaluate**: Test on held-out scenarios
6. **Deploy**: Serve model for product design queries