---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

this is the Generative Pre-trained Transformer (GPT) version of the model :D

It's a base model, so you should finetune it yourself on a large dataset. I did some small-scale training on it, but that's not enough for an LLM.

You can also use it for text generation as a base model :3 (not recommended 3: it needs finetuning on a larger dataset first)

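If you do finetune it, the objective is just next-token prediction (causal-LM cross-entropy). Here's a minimal sketch of that training loop using a tiny toy model and fake data instead of Kok-GPT itself, so everything below (the model, vocab size, and dataset) is illustrative, not the real repo:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a language model: embedding + linear head over a tiny vocab.
vocab_size, hidden = 32, 16
model = nn.Sequential(
    nn.Embedding(vocab_size, hidden),
    nn.Linear(hidden, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Fake "dataset": random token sequences of shape (batch, seq_len).
tokens = torch.randint(0, vocab_size, (8, 17))
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict the next token

losses = []
for step in range(50):
    logits = model(inputs)  # (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")  # loss should go down
```

With the real model you'd swap the toy model for the loaded checkpoint and the fake tokens for a tokenized corpus, but the loss being minimized is the same.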

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer >:D
tokenizer = AutoTokenizer.from_pretrained("moelanoby/Kok-GPT")

# Load mi model :3 (trust_remote_code=True lets transformers load the
# custom model class shipped with the repo)
model = AutoModelForCausalLM.from_pretrained(
    "moelanoby/Kok-GPT",
    trust_remote_code=True
)

# Generate text with this function :D
def generate_text(prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        input_ids=inputs["input_ids"],  # Explicitly pass only input_ids
        max_length=max_length
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# change "Hello" to anything you like :D
prompt = "Hello"
generated = generate_text(prompt)
print(f"Generated text >:3: {generated}")
```

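`generate()` above uses greedy decoding by default. If the output looks repetitive, you can pass sampling options like `do_sample=True`, `temperature`, and `top_k` to `model.generate()`. Here's a toy illustration (with made-up logits, not the real model) of what temperature does to the next-token distribution:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])  # made-up next-token scores

def next_token_probs(logits, temperature):
    # Lower temperature sharpens the distribution, higher flattens it.
    return torch.softmax(logits / temperature, dim=-1)

cold = next_token_probs(logits, 0.5)
hot = next_token_probs(logits, 2.0)

print("T=0.5:", [round(p, 3) for p in cold.tolist()])
print("T=2.0:", [round(p, 3) for p in hot.tolist()])
# The top-scoring token keeps more probability mass at low temperature.
```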

Either way, it was trained on 10K rows of the FineWeb dataset, which is considered insufficient. I ended up with an average loss of 2.3468, so yeah, you can still finetune the model. By the time I get stronger GPUs I'll just target 7B or 14B parameters, etc...

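For context, a cross-entropy loss of 2.3468 (in nats) corresponds to a perplexity of roughly e^2.3468 ≈ 10.45:

```python
import math

loss = 2.3468  # average training loss reported above
perplexity = math.exp(loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 10.45
```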
BUUUT this is already enough and I'm planning to make more kinds of AI models in the future with custom architectures |
|
|
|
|
|
and I might make roleplaying AI models so stay tuned for that :3 |
|
|
|
|
|
|
|
|
and If you like :D |
|
|
support me with buy me a coffee right here :3 |
|
|
|
|
|
|
|
|
buymeacoffee.com/Moelanoby |