Model compression and optimization methods, such as quantization, pruning, distillation, and fine-tuning.
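
As a rough illustration of two of the methods named above, the following sketch applies symmetric int8 quantization and magnitude pruning to a plain NumPy weight array. The function names (`quantize_int8`, `dequantize`, `magnitude_prune`) and the per-tensor scaling scheme are assumptions chosen for this example, not a reference to any particular library's API. Distillation and fine-tuning involve a training loop and are not shown.

```python
import numpy as np


def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative scheme).

    Returns the int8 codes and the float scale needed to recover
    approximate float values.
    """
    max_abs = float(np.max(np.abs(w)))
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale


def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    `sparsity` is the fraction to remove, e.g. 0.5 for half.
    Ties at the threshold may zero slightly more than requested.
    """
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the cutoff.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(w) <= threshold] = 0.0
    return pruned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((4, 4)).astype(np.float32)

    q, scale = quantize_int8(weights)
    error = np.max(np.abs(dequantize(q, scale) - weights))
    print("max quantization error:", error)

    sparse = magnitude_prune(weights, 0.5)
    print("nonzero weights after pruning:", np.count_nonzero(sparse))
```

In this scheme the round-trip error of any single weight is bounded by half the scale, which is why quantization is usually paired with a check of model accuracy afterwards. Production toolkits typically add per-channel scales, calibration data, and structured sparsity patterns on top of this basic idea.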