TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
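
The message names a type constraint: each item passed for encoding must be either a single text sequence or a pair of sequences. A likely cause, stated here as an assumption because the original call is not shown, is that a non-string value (such as `None`, a float `NaN` from a missing data cell, or an integer) reached the tokenizer. The sketch below is a pure-Python pre-check that does not call the tokenizer. The helper names `is_valid_text_encode_input` and `find_invalid_inputs` are hypothetical and not part of any library. The check is a simplification for raw, non-pretokenized text, where a text sequence is a `str` and a pair is a 2-tuple of `str`.

```python
from typing import Any, List, Sequence, Tuple


def is_valid_text_encode_input(item: Any) -> bool:
    """Approximate the constraint in the error message for raw text.

    Hypothetical helper (an assumption, not a library function):
    accepts a single str, or a 2-tuple whose members are both str.
    """
    if isinstance(item, str):
        return True
    if isinstance(item, tuple) and len(item) == 2:
        return all(isinstance(part, str) for part in item)
    return False


def find_invalid_inputs(batch: Sequence[Any]) -> List[Tuple[int, str]]:
    """Return (index, type name) for every item that fails the check."""
    return [
        (index, type(item).__name__)
        for index, item in enumerate(batch)
        if not is_valid_text_encode_input(item)
    ]


if __name__ == "__main__":
    # Illustrative batch: two valid items followed by three invalid ones.
    batch = ["hello world", ("question", "context"), None, float("nan"), 3]
    for index, type_name in find_invalid_inputs(batch):
        print(f"item {index} has unsupported type {type_name}")
```

Running a check like this on the data before encoding points at the offending rows, which can then be dropped or converted with `str(...)`, depending on what the values mean.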