This is the generative pre-trained transformer (GPT) version of the model :D
It's a base model, so you should finetune it yourself on a large dataset. I did some small training on it, but it's not enough for an LLM.
You can use it for text generation as a base model :3 (not recommended 3: it needs finetuning on a larger dataset first)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load tokenizer >:D
tokenizer = AutoTokenizer.from_pretrained("moelanoby/Kok-GPT")

# Load mi model :3 (trust_remote_code=True is needed so the custom
# BucketMemoryModel class that ships with the repo gets loaded)
model = AutoModelForCausalLM.from_pretrained(
    "moelanoby/Kok-GPT",
    trust_remote_code=True,
)

# Generate text with this function :D
def generate_text(prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        input_ids=inputs["input_ids"],  # explicitly pass only input_ids
        max_length=max_length,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# change "Hello" to anything you like :D
prompt = "Hello"
generated = generate_text(prompt)
print(f"Generated text >:3: {generated}")
```
Either way, it was trained on 10K rows of the FineWeb dataset, which is considered insufficient. I ended up with an average loss of 2.3468, so yeah, you can still finetune the model. By the time I get stronger GPUs I'll just target 7B or 14B parameters and so on...
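Since finetuning keeps coming up, here's a minimal sketch of the causal-LM training loop that finetuning boils down to: predict token t+1 from tokens up to t, and minimize cross-entropy. To keep it self-contained and runnable offline, it uses a made-up toy model (a tiny embedding + linear layer, with invented sizes) instead of Kok-GPT itself; in practice you'd load the real model with `trust_remote_code=True` and feed real tokenized text.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy stand-in for the real model, just for illustration
vocab_size, hidden = 100, 32
model = nn.Sequential(
    nn.Embedding(vocab_size, hidden),  # token ids -> vectors
    nn.Linear(hidden, vocab_size),     # vectors -> next-token logits
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

# One fake batch of token ids (in practice: tokenizer(texts)["input_ids"])
batch = torch.randint(0, vocab_size, (4, 16))

losses = []
for step in range(20):
    logits = model(batch)  # shape: (batch, seq_len, vocab)
    # Causal-LM objective: logits at position t predict the token at t+1
    shift_logits = logits[:, :-1, :].reshape(-1, vocab_size)
    shift_labels = batch[:, 1:].reshape(-1)
    loss = nn.functional.cross_entropy(shift_logits, shift_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss went from {losses[0]:.3f} to {losses[-1]:.3f}")
```

On real data you'd iterate over a DataLoader of tokenized rows instead of one fake batch, but the shifted-labels cross-entropy is the same objective the 2.3468 loss above was measured on.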
BUUUT this is already enough for now, and I'm planning to make more kinds of AI models in the future with custom architectures.
I might also make roleplaying AI models, so stay tuned for that :3
And if you like it :D you can support me with Buy Me a Coffee right here :3
buymeacoffee.com/Moelanoby