# Llama3.2-3B LoRA Fine-tuned Model

This is a LoRA fine-tuned version of Llama3.2-3B, trained on the [NekoQA-10K](https://huggingface.co/datasets/liumindmind/NekoQA-10K) dataset.

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel

# Load the base model and tokenizer, then apply the LoRA adapter.
# Replace the adapter path below with this repository's id.
base_model = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="cuda")
model = PeftModel.from_pretrained(model, "path/to/this-lora-adapter")

messages = [
    {"role": "user", "content": "摸摸"},  # "pat pat" in Chinese
]

# Build the chat prompt string without tokenizing, leaving the
# assistant turn open so the model generates the reply.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stream the response token by token, skipping the echoed prompt.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=1500,
    temperature=1.0,
    top_p=0.8,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```