Fix Gradio examples caching bug + disable cache 9d6133e verified OpenTransformer committed 28 days ago
Match exact n.py architecture: TuneableAttentionMHA + Sequential FFN 52356c8 verified OpenTransformer committed 28 days ago
Fix checkpoint path - auto-find latest in checkpoints/ 4c75b56 verified OpenTransformer committed 28 days ago
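
The checkpoint-path commit (4c75b56) describes auto-finding the latest file in checkpoints/, but the log does not show how. The following is only a rough sketch of one way that could work, not the repository's actual code: the function name `find_latest_checkpoint`, the `*.pt` file pattern, and the use of modification time to define "latest" are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(ckpt_dir: str = "checkpoints",
                           pattern: str = "*.pt") -> Optional[Path]:
    """Return the most recently modified checkpoint file, or None.

    Hypothetical helper: the directory name comes from the commit
    message; the "*.pt" pattern and the mtime ordering are assumptions.
    """
    directory = Path(ckpt_dir)
    if not directory.is_dir():
        # No checkpoints directory at all: let the caller decide what to do.
        return None

    # Keep only regular files that match the pattern.
    candidates = [p for p in directory.glob(pattern) if p.is_file()]
    if not candidates:
        return None

    # "Latest" is taken to mean the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Returning `None` instead of raising lets a demo app fall back to a default or show a clear message when no checkpoint has been uploaded yet. If checkpoint names encode a step number, sorting on that number may be more reliable than modification time, which can change when files are copied.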