Set max_position_embeddings: 40000 for engine-builder workaround
#2
by aaronmeoded-b10 - opened
Workaround for a Baseten engine-builder bug: max_seq_len in the truss config.yaml is overridden by max_position_embeddings from config.json, which causes OOM at runtime for Llama-3-70B SeqCls FP8. Setting max_position_embeddings to 40000, which is ≤ the desired max_seq_len (45000), makes the override benign. See the Slack thread with Dhruv Singal, 2026-04-28 (Slingshot debug).
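
A minimal sanity check for the condition above, as a sketch only. It assumes the only thing that matters is that max_position_embeddings in config.json does not exceed the desired max_seq_len; the helper name `override_is_benign` and the "unpatched" value are hypothetical and not taken from this repo or from Baseten tooling.

```python
import json


def override_is_benign(config_json_text: str, desired_max_seq_len: int) -> bool:
    """Return True if max_position_embeddings does not exceed the desired max_seq_len.

    Hypothetical helper: it only compares the two numbers and does not
    reproduce the engine-builder's actual override logic.
    """
    config = json.loads(config_json_text)
    return config["max_position_embeddings"] <= desired_max_seq_len


# Illustrative config.json fragments; only the relevant key is shown.
patched = '{"max_position_embeddings": 40000}'      # value set by this PR
unpatched = '{"max_position_embeddings": 131072}'   # placeholder, not the model's real value

DESIRED_MAX_SEQ_LEN = 45000  # desired max_seq_len from the description above

print(override_is_benign(patched, DESIRED_MAX_SEQ_LEN))
print(override_is_benign(unpatched, DESIRED_MAX_SEQ_LEN))
```

In practice the JSON text would be read from the repo's config.json, and the desired limit from the truss config.yaml.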