Fix dead split parameter in PackedStreamingDataset._load_dataset 0cd5689 Vjeong Claude Sonnet 4.6 committed about 15 hours ago
Refactor runner.py: extract shared setup logic into _setup_and_train helper 9b6bd85 Vjeong Claude Sonnet 4.6 committed 4 days ago
Fix device mismatch in notebook forward pass and generation cells 5345ea1 Vjeong Claude Sonnet 4.6 committed 4 days ago
Fix dtype mismatch in RoPE cos/sin for mixed precision training 331cfcd Vjeong Claude Sonnet 4.6 committed 4 days ago
Replace F.silu with explicit SiLU implementation in SwiGLUFeedForward baf4768 Vjeong Claude Sonnet 4.6 committed 4 days ago
Replace F.scaled_dot_product_attention with explicit implementation e072b51 Vjeong Claude Sonnet 4.6 committed 4 days ago
Add Claude Code project settings and update gitignore fac7da2 Vjeong Claude Sonnet 4.6 committed 4 days ago
Remove dead attn_dropout layer from GroupedQueryAttention 9f5773b Vjeong Claude Sonnet 4.6 committed 5 days ago
Fix LR warmup ordering and align adam_eps with Meta LLaMA af13727 Vjeong Claude Opus 4.6 committed 6 days ago
Remove redundant detect_scenario from LossDebugger 2a50172 Vjeong Claude Opus 4.6 committed 6 days ago
Add Code CPT pipeline for injecting Python code capability a424729 Vjeong Claude Opus 4.6 committed 7 days ago
Fix LR reference table and batch-LR scaling guidance in LossDebugger e96b9d3 Vjeong Claude Opus 4.6 committed 9 days ago
Fix batch size diagnostic: widen window and list multiple causes fb048e4 Vjeong Claude Opus 4.6 committed 11 days ago
Fix gradient clipping thresholds in dynamics and checklist modules a671953 Vjeong Claude Opus 4.6 committed 11 days ago
Fix gradient diagnostic thresholds with evidence-based criteria in LossDebugger 362e9ea Vjeong Claude Opus 4.6 committed 11 days ago
Update 02_model notebook: switch default to debug_10m and improve model summary display 5359f06 Vjeong Claude Sonnet 4.6 committed 12 days ago
Translate all notebook content from Korean to English 8c54470 Vjeong Claude Sonnet 4.6 committed 12 days ago
Fix check_numerical_stability accuracy and completeness 2fb0306 Vjeong Claude Opus 4.6 committed 12 days ago
Scale overfit test LR and steps by model size in LossDebugger 6b7ca0e Vjeong Claude Sonnet 4.6 committed 12 days ago
Reduce LOSS_BOUNCE false positives with moving-average smoothing 1451cc6 Vjeong Claude Opus 4.6 committed 13 days ago
Clear notebook outputs and add real checkpoint diagnosis cell 95e3e6a Vjeong Claude Opus 4.6 committed 13 days ago
Add missing mock_history cases for full diagnose_status coverage 3570f22 Vjeong Claude Opus 4.6 committed 13 days ago
Improve LOSS_BOUNCE detection with pre-computed bounce metrics 6c7b430 Vjeong Claude Opus 4.6 committed 13 days ago
Add LOSS_BOUNCE detection to diagnose_status classification chain 38fd260 Vjeong Claude Opus 4.6 committed 13 days ago
Add NaN detection to diagnose_status classification chain 8313ca8 Vjeong Claude Opus 4.6 committed 13 days ago
Fix debugger reference data and scenario detection logic d789de8 Vjeong Claude Sonnet 4.6 committed 13 days ago
Remove unused tokenizer training code (train_bpe, load_sentencepiece, load_trained_hf) 33ba3d1 Vjeong Claude Opus 4.6 committed 14 days ago
Update upload notebook for pretrained LLaMA 2 tokenizer e70bc05 Vjeong Claude Opus 4.6 committed 14 days ago
Standardize notebooks for Colab/local dual environment support e12c067 Vjeong Claude Opus 4.6 committed 14 days ago
Tighten expected loss ranges for FineWeb-Edu dataset a02e949 Vjeong Claude Opus 4.6 committed 15 days ago
Use LLaMA 2 pretrained tokenizer and remove tokenizer_mode option a5ca4e4 Vjeong Claude Opus 4.6 committed 18 days ago
Fix BPE tokenizer ByteLevel decoder and update evaluation notebook 8626149 Vjeong Claude Sonnet 4.6 committed 20 days ago
feat(notebook): add HuggingFace Hub upload notebook ac456b8 Vjeong Claude Opus 4.6 committed 21 days ago
feat(training): add LossDebugger 5-level diagnostic framework 5b7ea5e Vjeong Claude Opus 4.6 committed 22 days ago
refactor(notebook): replace manual TrainConfig args with preset classmethod c1a8df8 Vjeong Claude Sonnet 4.6 committed 25 days ago
feat(config): add scale-specific presets to TrainConfig 1c8f3e6 Vjeong Claude Sonnet 4.6 committed 25 days ago
fix(trainer): correct attribute name from total_mem to total_memory f5ab21e Vjeong Claude Sonnet 4.6 committed 25 days ago
refactor(runner): replace manual seed setup with set_seed utility 4733791 Vjeong Claude Sonnet 4.6 committed 27 days ago
fix(device): correct attribute name from total_mem to total_memory f91d771 Vjeong Claude Sonnet 4.6 committed 29 days ago
refactor(model): replace single-letter vars with descriptive names for readability 81a9145 Vjeong Claude Sonnet 4.6 committed on Mar 6
refactor(notebook): replace manual param counting with torchinfo.summary in 02_model 99c1b85 Vjeong Claude Sonnet 4.6 committed on Mar 5