Is this broken or out of date? Lots of errors trying to get this to work.

#1
by Fieldsweeper - opened

I am seeing errors related to the kernel size pattern in the bezzam encoder, specifically the strided convolutions with kernels [4, 4, 8, 10, 10, 16] and their corresponding downsampling ratios. There are other errors as well.
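For anyone hitting similar shape errors, a quick way to sanity-check a stack of strided convolutions is to trace the sequence length through each layer. The sketch below is only an illustration: the kernel list is the one quoted above, but the strides are an assumption (half of each kernel size, since the actual downsampling ratios are not given in this thread), and `conv1d_out_len` and `trace_lengths` are hypothetical helpers, not part of Transformers or this checkpoint. It uses the standard unpadded 1D convolution length formula.

```python
from math import prod

KERNELS = [4, 4, 8, 10, 10, 16]        # kernel sizes quoted in this thread
STRIDES = [k // 2 for k in KERNELS]    # ASSUMPTION: stride = kernel / 2, not confirmed here


def conv1d_out_len(length, kernel, stride, padding=0, dilation=1):
    """Output length of a 1D convolution (standard formula, floor division)."""
    return (length + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def trace_lengths(length, kernels, strides):
    """Return the sequence length before the first layer and after each layer."""
    lengths = [length]
    for kernel, stride in zip(kernels, strides):
        length = conv1d_out_len(length, kernel, stride)
        if length <= 0:
            raise ValueError(f"input too short for kernel={kernel}, stride={stride}")
        lengths.append(length)
    return lengths


if __name__ == "__main__":
    print("total downsampling factor:", prod(STRIDES))
    print("lengths per layer:", trace_lengths(24000, KERNELS, STRIDES))
```

If the lengths printed here disagree with the shapes in the error message, the mismatch is probably in the strides or padding the checkpoint config specifies rather than in the kernel sizes themselves.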

Not sure what could be affecting it, other than perhaps some overlap between your references and the official Microsoft one (which has also had this model removed).

@Fieldsweeper this is a draft checkpoint for getting a version working with Transformers, so it's expected that it isn't working yet: the code to use it (here) is still in progress / under review. I can let you know when it's ready for testing, if you want to try it out before it gets merged into Transformers?
