Fix transformers compatibility: pin versions and rename past_key_value to past_key_values 687049b Florian valade committed 10 days ago
Track metrics during streaming, remove redundant generation re-runs 33efa44 Florian valade committed 10 days ago
Fix early exit inference loop to eliminate redundant computation a781577 Florian valade committed 10 days ago