feat(evaluation): add beam search, metrics pipeline, and stabilized training workflow 91a1214 apoorvrajdev committed 21 days ago