# fokan/medgemma-4b-it-int8

INT8 dynamically quantized version of `google/medgemma-4b-it`.

- Quantization: dynamic INT8 on Linear layers (PyTorch)
- Ideal for CPU inference
- 4× smaller than the original model

A usage sketch is shown below.
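The snippet below is a minimal sketch of how a dynamic INT8 variant like this one can be produced and run on CPU with stock PyTorch (`torch.quantization.quantize_dynamic`). It is illustrative, not the exact export pipeline used for this checkpoint; in particular, the `AutoModelForCausalLM` class and the sample prompt are assumptions (medgemma-4b-it is multimodal, so another Auto class may be more appropriate for your use case).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/medgemma-4b-it"

# Load the original checkpoint in FP32 on CPU (text-only usage assumed here).
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model.eval()

# Apply PyTorch dynamic INT8 quantization to all nn.Linear layers:
# weights are stored as INT8, activations are quantized on the fly at
# inference time, which is why this variant targets CPU inference.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Quick CPU generation check (prompt is illustrative).
inputs = tokenizer("What are common symptoms of anemia?", return_tensors="pt")
with torch.no_grad():
    output = quantized_model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Dynamic quantization keeps only the Linear weights in INT8, which is where the bulk of the parameter count lives; that is what drives the reduction in size relative to the FP32 original.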