Commit History

prefill: bump prefillN from 64 to 512 (~8x faster TTFT on 50-500 token prompts)
dfaa01b
verified

mlboydaisuke committed on

Upload prefill/chunk3.mlmodelc/analytics/coremldata.bin with huggingface_hub
5e8bb31
verified

mlboydaisuke committed on