Text Generation
Transformers
PyTorch
llama
uncensored
text-generation-inference
Change use_cache to True which significantly speeds up inference
721f3de
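The commit above flips `use_cache` to `True`, which makes the model reuse previously computed key/value tensors during autoregressive generation instead of recomputing attention over the whole sequence at every step. A minimal sketch of why that helps, using a toy stand-in for the per-token key/value projection (all names here are illustrative, not part of the Transformers API):

```python
def keys_values(token):
    # Stand-in for the expensive per-token key/value projection.
    return (token * 2, token * 3)

def generate_no_cache(prompt, steps):
    """use_cache=False: recompute K/V for every token at every step."""
    calls = 0
    seq = list(prompt)
    for _ in range(steps):
        kv = []
        for t in seq:              # full recompute over the whole sequence
            kv.append(keys_values(t))
            calls += 1
        seq.append(len(kv))        # dummy "next token"
    return calls

def generate_with_cache(prompt, steps):
    """use_cache=True: compute K/V once per token and reuse it."""
    calls = 0
    cache = []
    seq = list(prompt)
    for t in seq:                  # one pass over the prompt
        cache.append(keys_values(t))
        calls += 1
    for _ in range(steps):         # one new projection per generated token
        seq.append(len(cache))     # dummy "next token"
        cache.append(keys_values(seq[-1]))
        calls += 1
    return calls

# For a prompt of length p and s generated tokens, the uncached loop does
# p*s + s*(s-1)/2 projections versus p + s with the cache.
print(generate_no_cache([1, 2, 3, 4], 10))    # 85 projection calls
print(generate_with_cache([1, 2, 3, 4], 10))  # 14 projection calls
```

The cost without caching grows quadratically in the number of generated tokens, while the cached version grows linearly, which is why enabling `use_cache` noticeably speeds up inference at the price of extra memory for the stored key/value tensors.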