Spaces:

MCP-1st-Birthday
/

sdlc-agent

Runtime error

App Files Files Community

sdlc-agent / docs /guides /ft_process.md

Veeru-c

initial commit

23f437b 18 days ago

preview code

raw

history blame contribute delete

6.03 kB

	# Fine-Tuning Process for Japan Insurance Product Design

	## Overview
	This document outlines the recommended approach for fine-tuning a language model using LoRA for Japan-specific insurance product design.

	## 1. Base Model Selection

	### Recommended Models
	- For Japanese:
	- Llama 3.1 8B/70B (multilingual, good Japanese support)
	- Qwen 2.5 (excellent Asian language performance)
	- Japanese-specific: `rinna/japanese-gpt-neox-3.6b` or `cyberagent/open-calm-7b`
	- For English with Japanese context: Llama 3.1 or Mistral 7B

	## 2. Dataset Preparation

	### Data Sources

	#### A. Census/Demographics Data (from e-Stat)
	- Population age distribution
	- Income levels by region
	- Household composition
	- Employment statistics

	#### B. Insurance Domain Data
	- Existing insurance product documents
	- Coverage details, exclusions, premiums
	- Target demographics for existing products

	#### C. Synthetic Training Data
	Create QA pairs or instruction-tuning format:

	```json
	{
	"instruction": "Design an insurance product for Tokyo residents aged 30-45 with average household income of ¥6M",
	"input": "Demographics: Tokyo, Age 30-45, Income ¥6M, Household size 3",
	"output": "Recommended product: Family Health Insurance with..."
	}
	```

	## 3. LoRA Configuration

	### Recommended Hyperparameters

	```python
	from peft import LoraConfig

	lora_config = LoraConfig(
	r=16, # Rank (start with 16, can go up to 64)
	lora_alpha=32, # Scaling factor (typically 2x rank)
	target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
	lora_dropout=0.05,
	bias="none",
	task_type="CAUSAL_LM"
	)
	```

	### Parameter Tuning Guide
	- r (rank): Start with 16, increase to 32-64 for more capacity
	- lora_alpha: Typically 2x the rank value
	- target_modules: Focus on attention layers for efficiency
	- lora_dropout: 0.05-0.1 for regularization

	## 4. Training Framework

	### Option 1: Unsloth (Recommended - Fastest)

	```python
	from unsloth import FastLanguageModel

	model, tokenizer = FastLanguageModel.from_pretrained(
	model_name = "unsloth/llama-3-8b-bnb-4bit",
	max_seq_length = 2048,
	dtype = None,
	load_in_4bit = True,
	)

	model = FastLanguageModel.get_peft_model(
	model,
	r = 16,
	target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
	lora_alpha = 16,
	lora_dropout = 0,
	bias = "none",
	)
	```

	### Option 2: Axolotl (More Configurable)

	```yaml
	# config.yml
	base_model: meta-llama/Llama-3.1-8B
	model_type: LlamaForCausalLM
	tokenizer_type: AutoTokenizer

	adapter: lora
	lora_r: 16
	lora_alpha: 32
	lora_dropout: 0.05
	lora_target_modules:
	- q_proj
	- v_proj
	- k_proj
	- o_proj

	datasets:
	- path: data/insurance_training.jsonl
	type: alpaca

	sequence_len: 2048
	micro_batch_size: 4
	gradient_accumulation_steps: 4
	num_epochs: 3
	learning_rate: 0.0002
	```

	## 5. Data Processing Pipeline

	### Step-by-Step Process

	1. Extract Census Data
	```bash
	python3 download_census_data.py --workers 100
	```

	2. Convert to Structured Format
	- Parse Excel/CSV files
	- Extract key demographics (age, income, location, household)
	- Create demographic profiles

	3. Combine with Insurance Documents
	- Extract text from insurance PDFs
	- Create context-aware examples
	- Map demographics to product features

	4. Generate Training Pairs
	- Use GPT-4/Claude to create synthetic examples
	- Format: Instruction → Input → Output
	- Include diverse scenarios

	5. Format for Training
	- Convert to Alpaca or ShareGPT format
	- Split into train/validation sets (90/10)
	- Save as JSONL

	### Example Training Data Format

	```json
	{
	"instruction": "Based on the following demographic data, design an appropriate insurance product.",
	"input": "Location: Tokyo\nAge Group: 30-45\nAverage Income: ¥6,000,000\nHousehold Size: 3\nEmployment: Full-time",
	"output": "Product Recommendation: Comprehensive Family Health Insurance\n\nKey Features:\n- Coverage: ¥10M medical expenses\n- Premium: ¥15,000/month\n- Target: Young families with stable income\n- Benefits: Hospitalization, outpatient, dental\n- Exclusions: Pre-existing conditions (first year)\n\nRationale: This demographic shows stable income and family responsibility, making comprehensive health coverage with moderate premiums ideal."
	}
	```

	## 6. Training Configuration

	### Recommended Settings

	```python
	training_args = TrainingArguments(
	output_dir="./insurance-lora",
	num_train_epochs=3,
	per_device_train_batch_size=4,
	gradient_accumulation_steps=4,
	learning_rate=2e-4,
	fp16=True,
	logging_steps=10,
	save_strategy="epoch",
	evaluation_strategy="epoch",
	warmup_ratio=0.1,
	lr_scheduler_type="cosine",
	)
	```

	## 7. Evaluation Strategy

	### Quantitative Metrics
	- Perplexity: Measure on held-out insurance product descriptions
	- BLEU/ROUGE: Compare generated products to reference designs
	- Accuracy: Classification of appropriate product types

	### Qualitative Evaluation
	- Human Review: Insurance experts evaluate product designs
	- Coherence: Check logical consistency of recommendations
	- Domain Accuracy: Verify compliance with insurance regulations

	### Domain-Specific Tests
	- Test on real demographic scenarios
	- Validate premium calculations
	- Check coverage appropriateness

	## 8. Deployment Considerations

	### Model Serving
	- Use vLLM or TGI for efficient inference
	- Quantize to 4-bit for production (GPTQ/AWQ)
	- Deploy on Modal, RunPod, or local GPU

	### Monitoring
	- Track inference latency
	- Monitor output quality
	- Collect user feedback for continuous improvement

	## Next Steps

	1. Data Processing: Create script to convert census data to training format
	2. Training Pipeline: Set up Unsloth/Axolotl environment
	3. Synthetic Generation: Use LLM to create insurance examples from demographics
	4. Fine-tune: Run LoRA training
	5. Evaluate: Test on held-out scenarios
	6. Deploy: Serve model for product design queries