prefill: bump prefillN from 64 to 512 (~8x faster TTFT on 50-500 token prompts) dfaa01b verified mlboydaisuke committed about 9 hours ago
Upload prefill/chunk3.mlmodelc/analytics/coremldata.bin with huggingface_hub 5e8bb31 verified mlboydaisuke committed 5 days ago