Apply for a GPU community grant: Personal project
#1
by stanley-00 - opened
This is a demo for evaluating small LLMs without needing an inference provider.
Hi!
Thank you for adding my model "Archaea-74M" to your inference engine.
I would suggest adding a "tokens per second" metric to this inference engine. It would help evaluate model speed further.
Thank you!
Thanks for the suggestion, I will consider that.
Updated: the "token per sec" metric has been added.
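For readers wondering how such a metric is usually measured, here is a minimal sketch. It is not the Space's actual implementation: `tokens_per_second` and the `generate` callable are hypothetical names, and the sketch assumes generation is exposed as an iterable that yields one token at a time.

```python
import time
from typing import Callable, Iterable, Tuple


def tokens_per_second(generate: Callable[[], Iterable[str]]) -> Tuple[int, float]:
    """Consume a token stream and return (token_count, tokens_per_second).

    `generate` is a hypothetical zero-argument callable returning an
    iterable of generated tokens (for example, a streaming generator).
    """
    start = time.perf_counter()
    # Count tokens as they are produced, so the timing covers generation itself.
    count = sum(1 for _ in generate())
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very fast or empty streams.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def fake_stream() -> Iterable[str]:
    # Stand-in for a real model: yields five tokens with a small delay each.
    for tok in ["Hello", ",", " world", "!", "<eos>"]:
        time.sleep(0.01)
        yield tok


if __name__ == "__main__":
    n, tps = tokens_per_second(fake_stream)
    print(f"{n} tokens at {tps:.1f} tokens/sec")
```

Measured this way, the figure includes the time to the first token. Timing only from the first token onward would give a pure decode speed instead.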
Oh, also, could you add SmolLM2-135M-Instruct?
Actually, you can paste HuggingFaceTB/SmolLM2-135M-Instruct directly into the Model field and it will also work.
But let me add clear instructions for that.