What is the precision of the model parameters: INT8 or INT4? For running on a mobile device, INT4 would be preferable.