fix: trans_a=True also needs to be modified to tl.trans() 2c40edc verified Taykhoom committed on about 17 hours ago
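The commit above refers to transposing the first operand with `tl.trans()` inside a Triton kernel when `trans_a=True`, rather than relying on stride swapping. The kernel itself is not part of this log, so the sketch below only illustrates the intended semantics in NumPy: a `trans_a` flag that transposes the first matrix before the product. The function name `matmul` and the flag handling are assumptions for illustration, not the repo's actual kernel.

```python
import numpy as np

def matmul(a, b, trans_a=False):
    """Reference semantics for a matmul with a trans_a flag.

    When trans_a is True, the first operand is transposed before the
    product -- the role tl.trans() plays on a loaded tile inside a
    Triton kernel (hypothetical stand-in, not the actual kernel code).
    """
    if trans_a:
        a = a.T  # stands in for tl.trans(a) applied to a tile
    return a @ b

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))
b = rng.standard_normal((4, 5))
out = matmul(a, b, trans_a=True)  # shape (3, 5), equals a.T @ b
```

In a real Triton kernel the transpose happens per tile after `tl.load`, which is why the fix has to touch the kernel body and cannot be expressed as a stride change alone.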
add attention return + support eager attention or Triton FA2 via config.use_flash_attn f2409f7 verified Taykhoom committed on about 17 hours ago
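This commit describes two changes: the attention module now also returns the attention weights, and a `config.use_flash_attn` flag selects between an eager softmax-attention path and a Triton FlashAttention-2 kernel. The repo's actual module is not shown here, so the following is a minimal NumPy sketch of that dispatch pattern; the `Config` class, the `attention` function, and the `NotImplementedError` placeholder for the Triton path are assumptions for illustration only.

```python
import math
import numpy as np

def eager_attention(q, k, v):
    """Plain softmax(Q K^T / sqrt(d)) V, returning both the output
    and the attention weights (the 'attention return' in the commit)."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / math.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs @ v, probs

class Config:
    """Hypothetical config object mirroring config.use_flash_attn."""
    use_flash_attn = False

def attention(config, q, k, v):
    # Dispatch on the config flag: eager path here, Triton FA2 otherwise.
    if config.use_flash_attn:
        # A real implementation would invoke the Triton FA2 kernel,
        # which typically does not materialize the attention weights.
        raise NotImplementedError("Triton FA2 path not sketched here")
    return eager_attention(q, k, v)

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 8))  # (heads, seq, head_dim)
k = rng.standard_normal((2, 4, 8))
v = rng.standard_normal((2, 4, 8))
out, weights = attention(Config(), q, k, v)
```

One design note consistent with the commit message: returning the weights is only cheap on the eager path, since fused FlashAttention kernels avoid materializing the full score matrix, which is a common reason such repos gate the behavior behind a config flag.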