Re-quantize models in FP16 but keep positional encoding in FP32 to avoid accuracy loss. f9a6075 (verified), committed by Luigi on Mar 27, 2024
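The commit's mixed-precision idea can be sketched as follows. This is a minimal NumPy illustration, not the repo's actual conversion code: the `requantize` helper and its name-based filter for positional-encoding tensors are assumptions. It also shows why FP32 matters here: at large positions, FP16's coarse spacing collapses small differences that positional encodings rely on.

```python
import numpy as np

def requantize(params):
    """Hypothetical helper: cast all parameter arrays to FP16 except
    positional encodings, which stay FP32 to preserve accuracy.
    The substring match on the tensor name is an assumption."""
    out = {}
    for name, arr in params.items():
        if "pos" in name:                  # keep positional encoding in FP32
            out[name] = arr.astype(np.float32)
        else:                              # everything else goes to FP16
            out[name] = arr.astype(np.float16)
    return out

# Why FP32 matters: near 10000, FP16 values are spaced 8 apart,
# so a small offset like 0.01 is lost entirely in FP16 but kept in FP32.
print(np.float16(10000.0 + 0.01) == np.float16(10000.0))   # FP16 collapses it
print(np.float32(10000.0 + 0.01) == np.float32(10000.0))   # FP32 preserves it
```

The precision demo mirrors the rationale in the commit message: half-precision rounding of the position-dependent terms can visibly degrade model accuracy, while the rest of the weights tolerate FP16 well.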