---
library_name: transformers
datasets:
- cwestbrook/lotrdata
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1
---

# DeepTolkien
This LLM is DeepSeek-R1 fine-tuned with the LoRA method on text extracted from J.R.R. Tolkien's The Lord of the Rings.

## Model Details
DeepTolkien is DeepSeek-R1 fine-tuned with the LoRA method on text extracted from J.R.R. Tolkien's The Lord of the Rings. The model can be prompted with a stub, for example "Frodo looked up and saw", and will then generate a story in the style of Tolkien's writing that continues from that stub. Have fun!

If you have played with DeepSeek R1, you have almost certainly noticed that the reasoning model sometimes gets caught in a loop. That behavior shows up here as well: two characters will, for example, get stuck in a looping dialogue. I believe this is more a property of DeepSeek R1 than of this LoRA, and better results may yet be achieved with a base model geared toward prose and storytelling. However, I wanted to get an idea of how the new DeepSeek models perform, and this has been a fantastic learning experience.
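If the loops get in the way, the `generate` call shown in the Usage section below can be extended with 🤗 transformers' standard anti-repetition knobs. This is only a sketch: the specific values are untuned starting points, not part of the original recipe.

```
# Possible loop mitigation: penalize repeated tokens and block repeated n-grams.
# The values below are assumptions to experiment with, not tuned settings.
tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    repetition_penalty=1.15,   # >1.0 discourages reusing recently generated tokens
    no_repeat_ngram_size=4,    # never repeat the same 4-gram verbatim
)
```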
## Usage

### Load the model:
```
# Imports (requires transformers, peft, and bitsandbytes)
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fetch the adapter config to find the base model
config = PeftConfig.from_pretrained("cwestbrook/lotrdata")

# Load the base model in 8-bit along with its tokenizer
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "cwestbrook/lotrdata")
```
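If your version of transformers warns that the bare `load_in_8bit=True` argument is deprecated, an equivalent way to load the same base model is to pass an explicit quantization config (same repo ids as above, only the loading call changes):

```
# Alternative loading path: explicit 8-bit quantization config
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```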
### Run the model:
```
prompt = "Gandalf revealed his new iPhone,"
inputs = tokenizer(prompt, return_tensors="pt").to('cuda')

# Sample from the model; without do_sample=True, generate() decodes greedily
# and the temperature setting has no effect
tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    eos_token_id=tokenizer.eos_token_id,
)
predictions = tokenizer.batch_decode(tokens, skip_special_tokens=True)
print(predictions[0])
```
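Generation from a model this size can take a while. If you would rather watch the story appear token by token than wait for `batch_decode`, transformers' `TextStreamer` can be attached to the same call; this is an optional extra, not part of the original example.

```
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the prompt itself
streamer = TextStreamer(tokenizer, skip_prompt=True)

tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    streamer=streamer,
)
```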