EsotericsEnjoyer/t5gemma-2-2-ul2-merged-poorlyconfigured

#1185
by EsotericsEnjoyer - opened

--- poor model, IGNORE ---

This might be an odd request. I have a 6B model merged at 32-bit precision that still needs to be configured properly.

https://huggingface.co/EsotericsEnjoyer/t5gemma-2-2-ul2-merged-poorlyconfigured

I assume that GGUF isn't an option right off the bat, and I don't know if t5gemma can be converted to any other CPU-friendly format. Not with my hardware, in any case. Would you give this a shot? I have not even run inference tests on it yet.

It's queued! :D
T5ForConditionalGeneration should be supported by llama.cpp but we will see.

You can check for progress at http://hf.tst.eu/status.html or regularly check the model summary page at https://hf.tst.eu/model#t5gemma-2-2-ul2-merged-poorlyconfigured-GGUF for quants to appear.

This comment has been hidden (marked as Resolved)

no, that error means it encountered a tensor name that it didn't understand, specifically in this case:

ValueError: Can not map tensor 'model.encoder.embed_tokens.weight'
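To illustrate what that kind of failure looks like, here is a minimal sketch of a name-mapping step, assuming the converter keeps a lookup table from checkpoint tensor names to its own names. The table `KNOWN_TENSORS` and the helper `find_unmapped` are hypothetical and made up for this example; they are not the actual llama.cpp mapping or code.

```python
# Hypothetical lookup table: checkpoint tensor name -> converter tensor name.
# These entries are illustrative only, NOT the real llama.cpp mapping.
KNOWN_TENSORS = {
    "shared.weight": "token_embd.weight",
    "encoder.final_layer_norm.weight": "enc.output_norm.weight",
}


def map_tensor_name(name: str) -> str:
    """Return the converter's name for a checkpoint tensor, or fail loudly."""
    try:
        return KNOWN_TENSORS[name]
    except KeyError:
        # Same shape of message as the error quoted above.
        raise ValueError(f"Can not map tensor {name!r}") from None


def find_unmapped(names):
    """Collect every tensor name the lookup table does not cover."""
    unmapped = []
    for name in names:
        try:
            map_tensor_name(name)
        except ValueError:
            unmapped.append(name)
    return unmapped


if __name__ == "__main__":
    checkpoint_names = ["shared.weight", "model.encoder.embed_tokens.weight"]
    print(find_unmapped(checkpoint_names))
```

Under this assumption, a tensor name the table does not cover stops the conversion, whether or not the weights themselves are fine.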

Thank you. I got it to work, re-merged and configured, but I used too large a batch size during training, so the adapters are bad. The model is overfit, even at early checkpoints.

This comment has been hidden (marked as Resolved)

Nobody knows what they are doing until they have done it a few times :)= Anyway, I have even less of a clue about training than you, and this is not the right place to ask for help.

But maybe @nicoboss wants to chime in on your questions; he has a lot more experience than me.

Regarding "bad" models, don't worry about it, you can still ask us to quant them and see what problems come up. That way you can concentrate on your model. Don't worry too much about submitting a potentially broken model, we have lots of them every day :)

This comment has been hidden (marked as Resolved)
EsotericsEnjoyer changed discussion status to closed
