# Llama3.2-3B LoRA Fine-tuned Model

This is a LoRA fine-tuned version of Llama3.2-3B, trained on the [NekoQA-10K](https://huggingface.co/datasets/liumindmind/NekoQA-10K) dataset.

## How to Use
```python
from transformers import TextStreamer

# Assumes `model` and `tokenizer` for this checkpoint are already loaded.
messages = [
    {"role": "user", "content": "摸摸"},  # "pat pat" in Chinese
]

# Build the prompt string using the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate a reply, streaming tokens to stdout as they are produced.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=1500,
    temperature=1.0,
    top_p=0.8,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
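
Because this checkpoint is a LoRA adapter, the `model` and `tokenizer` used above must be created first. A minimal sketch using `peft` is below; the base-model id is an assumption, and the adapter repo id is a placeholder to replace with this model's actual id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base checkpoint; placeholder adapter id — replace with this repo's id.
BASE_ID = "meta-llama/Llama-3.2-3B-Instruct"
ADAPTER_ID = "your-username/your-lora-adapter"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)
# Attach the LoRA weights on top of the base model.
model = PeftModel.from_pretrained(model, ADAPTER_ID)
```

Alternatively, `model.merge_and_unload()` can be called after loading to fold the adapter into the base weights for slightly faster inference.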