Revert to F32 KV: int8 KV bridge produced DEQUANTIZE 1-dim + STABLEHLO_COMPOSITE missing data → Metal delegate rejects 224e164 verified sarmientoF committed on May 13
int8 KV cache export (cache_update_composite + int8_kv_bridge) 3971892 verified sarmientoF committed on May 13
GPU-compatible export: no cache_update_composite, cache_length=4096, lora_rank=8 83de808 verified sarmientoF committed on May 9
Re-export cache_length=4096 (was 32768) for 6GB devices 0b3ad5a verified sarmientoF committed on May 8
Replace with Modal-exported 32k bundle (int8 KV, LoRA r8) c0c3eda verified sarmientoF committed on May 7
Replace weight_only with dynamic_wi4_afp32 (fixes gibberish) 9957c4e verified sarmientoF committed on Apr 18
Replace weight_only with dynamic_wi4_afp32 (fixes gibberish) 38517fa verified sarmientoF committed on Apr 18
Upload gemma4-android-2k.litertlm with huggingface_hub f74a45b verified sarmientoF committed on Apr 16
Upload adapters/pirate-scale-4.5.tflite with huggingface_hub 26a77e8 verified sarmientoF committed on Apr 15
Upload adapters/alpaca-2ep.tflite with huggingface_hub c72ec29 verified sarmientoF committed on Apr 15
Upload adapters/alpaca-3ep.tflite with huggingface_hub 7cb5925 verified sarmientoF committed on Apr 15
Upload adapters/alpaca-3ep-scale-0.625.tflite with huggingface_hub 58734a5 verified sarmientoF committed on Apr 15
Upload adapters/alpaca-1ep.tflite with huggingface_hub 6f6bf12 verified sarmientoF committed on Apr 15
Upload adapters/pirate-v9-scale-0.25.tflite with huggingface_hub 85764d0 verified sarmientoF committed on Apr 15