Fix dtype mismatch in RoPE cos/sin for mixed precision training 331cfcd Vjeong Claude Sonnet 4.6 committed 4 days ago
Replace F.silu with explicit SiLU implementation in SwiGLUFeedForward baf4768 Vjeong Claude Sonnet 4.6 committed 5 days ago
Replace F.scaled_dot_product_attention with explicit implementation e072b51 Vjeong Claude Sonnet 4.6 committed 5 days ago
Remove dead attn_dropout layer from GroupedQueryAttention 9f5773b Vjeong Claude Sonnet 4.6 committed 5 days ago
refactor(model): replace single-letter vars with descriptive names for readability 81a9145 Vjeong Claude Sonnet 4.6 committed on Mar 6
docs: translate all Korean comments and docstrings to English 858e8b2 Vjeong Claude Sonnet 4.6 committed on Feb 27