prefill: bump prefillN from 64 to 512 (~8x faster TTFT on 50-500 token prompts) dfaa01b verified mlboydaisuke committed 3 days ago
Upload prefill/chunk2.mlmodelc/analytics/coremldata.bin with huggingface_hub 9f95465 verified mlboydaisuke committed 8 days ago
Upload prefill/chunk2.mlmodelc/weights/weight.bin with huggingface_hub 57c97a3 verified mlboydaisuke committed 8 days ago
Upload prefill/chunk2.mlmodelc/coremldata.bin with huggingface_hub f528c21 verified mlboydaisuke committed 8 days ago
Upload prefill/chunk2.mlmodelc/model.mil with huggingface_hub 1e10260 verified mlboydaisuke committed 8 days ago