fix integration with huggingface

All good (I think). I don't have enough compute to test this since I'm on the free tier, so let me know whether everything runs as expected.
You can test this PR before merging with the following code:

pip install -qU  "transformers>=4.39.1" flash_attn

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
    "Tensoic/Cerule-v0.1",
    trust_remote_code=True,
    revision="refs/pr/2",  # the revision parameter is only used to run the code from this PR
)

I also updated the readme to let people know how to use the model.

tips

When working with custom architectures, I recommend using Hugging Face's PyTorchModelHubMixin. I also made a basic template showing how to use it, together with pip packaging, in this GitHub repo.

If you have any more questions or feedback, or if you have any other custom models, do not hesitate to reach out.

not-lain changed pull request status to open Apr 3, 2024

adarshxs

Tensoic AI org Apr 3, 2024

Damn THANKS A LOT! will test it out asap

update config.json 541f04ed

adarshxs

Tensoic AI org Apr 3, 2024

Hey @not-lain why did you change the model type to phi-msft here?
https://huggingface.co/Tensoic/Cerule-v0.1/commit/4a02b161d5142cd92a2082aae885bc3cc9584aca

not-lain

Apr 3, 2024

@adarshxs I'll try to open another PR and debug things slowly. Thanks for pointing this out.

not-lain

Apr 3, 2024

@adarshxs you are right, this PR is absolutely useless XD.
The only thing that was broken was my Colab environment.
All I had to do from the beginning was:

!pip install -qU  "transformers>=4.39.1" flash_attn

I'm closing this PR,
but I'm keeping pr/3 open since the _name_or_path is essential for cases such as fine-tuning.
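If anyone wants to check whether a local copy of a config.json carries that field, here is a rough sketch using only the standard library. It assumes the config is a plain JSON object with _name_or_path as a top-level key. read_name_or_path is a hypothetical helper, and the demo config written below is made up for illustration; it is not the actual contents of this repo's config.json.

```python
import json
import tempfile
from pathlib import Path


def read_name_or_path(config_path):
    """Return the _name_or_path value from a config.json, or None if absent or empty."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config.get("_name_or_path") or None


with tempfile.TemporaryDirectory() as tmp_dir:
    # Illustrative config only; the model_type value is a placeholder.
    demo_path = Path(tmp_dir) / "config.json"
    demo_path.write_text(
        json.dumps({"_name_or_path": "Tensoic/Cerule-v0.1", "model_type": "example"}),
        encoding="utf-8",
    )
    found = read_name_or_path(demo_path)

    # A config without the field makes the helper return None.
    empty_path = Path(tmp_dir) / "empty_config.json"
    empty_path.write_text(json.dumps({"model_type": "example"}), encoding="utf-8")
    missing = read_name_or_path(empty_path)

print(found, missing)
```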

not-lain changed pull request status to closed Apr 3, 2024
