	# NB2 Kaggle kernel-death fix

	Version 5/6 died before the first epoch print. The data/label fixes are correct (`soundscape positive labels: 3122`), so the remaining issue is memory pressure during the first training epoch.
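One way to check the memory-pressure diagnosis is to log peak host memory right before and during the first epoch. The sketch below uses only the standard library; `peak_rss_mb` is a hypothetical helper name, not something from NB2, and it measures host RAM only (these notes do not say whether the pressure is on host RAM or GPU memory).

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024
    return peak / divisor


# flush=True so the line is written out even if the kernel dies shortly after.
print(f"peak host RSS: {peak_rss_mb():.0f} MB", flush=True)
```

Calling this after building the datasets and again every few hundred batches should show whether memory climbs steadily before the kernel dies.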

	Use these safer NB2 settings before running:

	```python
	class CFG:
	    epochs = 2
	    model_name = "b0"
	    folds_to_run = [0]  # train ONE fold per Kaggle run first
	    batch_size = 4  # micro-batch
	    grad_accum_steps = 3  # effective batch 12
	    num_workers = 0
	    use_data_parallel = False  # DataParallel caused kernel death on T4x2
	    max_train_audio_samples = None
	    max_sc_train_samples = None
	```

	Then repeat the run once per fold, changing only `folds_to_run` between Kaggle runs:

	```python
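	# B0: one Kaggle run per fold; enable exactly one line per run
	folds_to_run = [0]    # run 1
	# folds_to_run = [1]  # run 2
	# folds_to_run = [2]  # run 3
	# folds_to_run = [3]  # run 4
	# folds_to_run = [4]  # run 5

	# B3, even safer: smaller micro-batch, same effective batch of 12
	model_name = "b3"
	folds_to_run = [0]
	batch_size = 2
	grad_accum_steps = 6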
	```
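Both settings keep the effective batch at 12 (4 × 3 for B0, 2 × 6 for B3), so only the per-step memory footprint changes. A minimal sketch of that check follows; `RunSettings` is a hypothetical stand-in for the relevant `CFG` fields, not a class from NB2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """Hypothetical container for the fields that determine the effective batch."""
    model_name: str
    batch_size: int        # micro-batch held in memory at once
    grad_accum_steps: int  # micro-batches accumulated per optimizer step

    @property
    def effective_batch(self) -> int:
        return self.batch_size * self.grad_accum_steps


b0 = RunSettings("b0", batch_size=4, grad_accum_steps=3)
b3 = RunSettings("b3", batch_size=2, grad_accum_steps=6)

# Fail fast if a later edit changes one value without the other.
assert b0.effective_batch == b3.effective_batch == 12
```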

	Also patch the optimizer loop: divide the loss by `grad_accum_steps` before calling `backward()`, call `optimizer.step()` only once every `grad_accum_steps` batches, and print progress every 100 batches.
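The loop patch described above could look roughly like the sketch below. It is an assumption about the shape of the NB2 loop, not the notebook's actual code: `train_one_epoch` and its parameters are hypothetical names, and the function relies only on the PyTorch-style interface (`model.train()`, `loss.backward()`, `optimizer.step()`, `optimizer.zero_grad()`), so it imports nothing itself.

```python
def train_one_epoch(model, loader, optimizer, criterion,
                    grad_accum_steps=3, log_every=100):
    """Run one epoch with gradient accumulation; returns the mean unscaled loss."""
    model.train()
    optimizer.zero_grad()
    num_batches = len(loader)
    running_loss = 0.0

    for step, (inputs, targets) in enumerate(loader, start=1):
        loss = criterion(model(inputs), targets)

        # Scale the loss so the accumulated gradient approximates one batch
        # of size batch_size * grad_accum_steps.
        (loss / grad_accum_steps).backward()
        running_loss += float(loss)

        # Step every grad_accum_steps batches, and once more at the end so a
        # final partial group of micro-batches is not silently dropped.
        if step % grad_accum_steps == 0 or step == num_batches:
            optimizer.step()
            optimizer.zero_grad()

        # flush=True so progress is visible even if the kernel dies soon after.
        if step % log_every == 0:
            print(f"batch {step}/{num_batches} "
                  f"mean loss {running_loss / step:.4f}", flush=True)

    return running_loss / max(num_batches, 1)
```

With `batch_size = 4` and `grad_accum_steps = 3`, only four samples are held in memory per forward/backward pass while the optimizer still sees gradients from 12. If NB2 uses mixed precision with a gradient scaler, the `backward()` and `step()` calls would need to go through the scaler instead.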