Zicheng Xu PRO
🚀 DTS: A Candidate for the Best Parallel Reasoning in LLMs
4D masks support in Transformers
Just tested it with your previous test case (after a few fixes on my side: the input 4D mask does not accept bool values, only 0 or -inf, and there was a variable name error). Both tests passed, so I guess it was my own coding errors that caused the inconsistency. Sorry for bothering you with it!
Thank you! The test case that you provided is very similar to what I'm trying to do. I will start by running it and check whether the issue shows up there.
Hi @poedator, thank you for your contribution. I'm wondering whether the pipeline you mentioned in section 1, i.e. 4D custom attention mask + custom positional encoding, still works in the current transformers version, 4.49.0. I'm using the Llama model through LlamaForCausalLM in the modeling_llama file, and I'm getting inconsistent logits between the unpacked and packed methods.
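For anyone following the thread, here is a minimal sketch of the packed setup being discussed: a block-diagonal causal mask in additive form (0 where attention is allowed, -inf where it is blocked, per the fix mentioned in this thread) plus position ids that restart for each packed sequence. It is plain Python, so it runs without any dependencies. The helper name `packed_mask_and_positions` and the `seq_lens` argument are hypothetical and not part of transformers. Whether a given model and version expects literal -inf or the most negative finite value of the dtype is something to verify against your installed version.

```python
import math


def packed_mask_and_positions(seq_lens):
    """Hypothetical helper: mask and position ids for sequences packed into one row.

    Returns nested lists shaped (1, 1, total, total) for the mask and
    (1, total) for the position ids, where total = sum(seq_lens).
    """
    total = sum(seq_lens)

    # Segment id of every packed token, e.g. [2, 3] -> [0, 0, 1, 1, 1].
    segment = [i for i, n in enumerate(seq_lens) for _ in range(n)]

    # Position ids restart at 0 for each sequence, e.g. [0, 1, 0, 1, 2].
    positions = [p for n in seq_lens for p in range(n)]

    # A query may attend to a key only if both tokens belong to the same
    # sequence and the key is not in the future (causal).
    allowed = [
        [segment[q] == segment[k] and k <= q for k in range(total)]
        for q in range(total)
    ]

    # Convert the boolean mask to additive float form:
    # 0.0 where attention is allowed, -inf where it is blocked.
    additive = [[0.0 if ok else -math.inf for ok in row] for row in allowed]

    # Add the batch and head dimensions: (batch=1, heads=1, q_len, kv_len).
    return [[additive]], [positions]


mask, position_ids = packed_mask_and_positions([2, 3])
for row in mask[0][0]:
    print(row)
print(position_ids)
```

To compare against the unpacked runs, the nested lists can be converted with `torch.tensor(...)`, cast to the model's dtype, and passed as `attention_mask` and `position_ids`. The logits of each packed segment should then match the logits of the same sequence run on its own, up to numerical tolerance.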