---
library_name: transformers
datasets:
- cwestbrook/lotrdata
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1
---

# DeepTolkien
This LLM is DeepSeek-R1 fine-tuned with the LoRA method on text extracted from J.R.R. Tolkien's The Lord of the Rings.

## Model Details
DeepTolkien is DeepSeek-R1 fine-tuned with the LoRA method on text extracted from J.R.R. Tolkien's The Lord of the Rings. The model can be prompted with a stub, for example "Frodo looked up and saw", and will then generate a story in the style of Tolkien's writing that continues from that stub. Have fun!

If you have played with DeepSeek R1, you have almost certainly noticed that the reasoning model sometimes gets caught in a loop. That behavior shows up here as well: two characters will, for example, get stuck in a looping dialogue. I believe this is more a property of DeepSeek R1 than of this LoRA, and better results may yet be achieved with a base model geared toward prose and storytelling. However, I wanted to get an idea of how the new DeepSeek models perform, and this has been a fantastic learning experience.
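If the loops get in the way, the `generate` call shown in the Usage section below can be extended with 🤗 transformers' standard anti-repetition knobs. This is only a sketch: the specific values are untuned starting points, not part of the original recipe.

```
# Possible loop mitigation: penalize repeated tokens and block repeated n-grams.
# The values below are assumptions to experiment with, not tuned settings.
tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    repetition_penalty=1.15,   # >1.0 discourages reusing recently generated tokens
    no_repeat_ngram_size=4,    # never repeat the same 4-gram verbatim
)
```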
## Usage

### Load the model:
```
# Imports (requires transformers, peft, and bitsandbytes)
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fetch the adapter config to find the base model
config = PeftConfig.from_pretrained("cwestbrook/lotrdata")

# Load the base model in 8-bit along with its tokenizer
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "cwestbrook/lotrdata")
```
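If your version of transformers warns that the bare `load_in_8bit=True` argument is deprecated, an equivalent way to load the same base model is to pass an explicit quantization config (same repo ids as above, only the loading call changes):

```
# Alternative loading path: explicit 8-bit quantization config
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```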
### Run the model:
```
prompt = "Gandalf revealed his new iPhone,"
inputs = tokenizer(prompt, return_tensors="pt").to('cuda')

# Sample from the model; without do_sample=True, generate() decodes greedily
# and the temperature setting has no effect
tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    eos_token_id=tokenizer.eos_token_id,
)
predictions = tokenizer.batch_decode(tokens, skip_special_tokens=True)
print(predictions[0])
```
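Generation from a model this size can take a while. If you would rather watch the story appear token by token than wait for `batch_decode`, transformers' `TextStreamer` can be attached to the same call; this is an optional extra, not part of the original example.

```
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the prompt itself
streamer = TextStreamer(tokenizer, skip_prompt=True)

tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1,
    streamer=streamer,
)
```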