Commit History

prefill: bump prefillN from 64 to 512 (~8x faster TTFT on 50-500 token prompts)
dfaa01b
verified

mlboydaisuke committed on

Upload prefill/chunk2.mlmodelc/analytics/coremldata.bin with huggingface_hub
9f95465
verified

mlboydaisuke committed on