| added arbitrary changes that might lead to faster inference and hence | |
| will be used for all further changes | |
| the changes are: | |
| --remove_input_padding \ | |
| this is a simple change that (roughly) packs the input sequences together instead of padding them; | |
| it should not slow inference and might increase speed | |
| --use_inflight_batching \ | |
| increases latency | |