---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

this is the Generative Pre-trained Transformer (GPT) version of the model :D

It's a base model, so you should finetune it yourself on a large dataset. I did some small-scale training on it, but that's not enough for an LLM.

You can also use it for text generation as a base model :3 (not recommended 3: it needs finetuning on a larger dataset first)

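If you do finetune it, the objective is just next-token prediction (causal-LM cross-entropy). Here's a minimal sketch of that training loop using a tiny toy model and fake data instead of Kok-GPT itself, so everything below (the model, vocab size, and dataset) is illustrative, not the real repo:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a language model: embedding + linear head over a tiny vocab.
vocab_size, hidden = 32, 16
model = nn.Sequential(
    nn.Embedding(vocab_size, hidden),
    nn.Linear(hidden, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Fake "dataset": random token sequences of shape (batch, seq_len).
tokens = torch.randint(0, vocab_size, (8, 17))
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict the next token

losses = []
for step in range(50):
    logits = model(inputs)  # (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")  # loss should go down
```

With the real model you'd swap the toy model for the loaded checkpoint and the fake tokens for a tokenized corpus, but the loss being minimized is the same.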

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer >:D
tokenizer = AutoTokenizer.from_pretrained("moelanoby/Kok-GPT")

# Load mi model :3 (trust_remote_code=True lets transformers load the
# custom model class shipped with the repo)
model = AutoModelForCausalLM.from_pretrained(
    "moelanoby/Kok-GPT",
    trust_remote_code=True
)

# Generate text with this function :D
def generate_text(prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        input_ids=inputs["input_ids"],  # Explicitly pass only input_ids
        max_length=max_length
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# change "Hello" to anything you like :D
prompt = "Hello"
generated = generate_text(prompt)
print(f"Generated text >:3: {generated}")
```

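`generate()` above uses greedy decoding by default. If the output looks repetitive, you can pass sampling options like `do_sample=True`, `temperature`, and `top_k` to `model.generate()`. Here's a toy illustration (with made-up logits, not the real model) of what temperature does to the next-token distribution:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])  # made-up next-token scores

def next_token_probs(logits, temperature):
    # Lower temperature sharpens the distribution, higher flattens it.
    return torch.softmax(logits / temperature, dim=-1)

cold = next_token_probs(logits, 0.5)
hot = next_token_probs(logits, 2.0)

print("T=0.5:", [round(p, 3) for p in cold.tolist()])
print("T=2.0:", [round(p, 3) for p in hot.tolist()])
# The top-scoring token keeps more probability mass at low temperature.
```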

Either way, it was trained on 10K rows of the FineWeb dataset, which is considered insufficient. I ended up with an average loss of 2.3468, so yeah, you can still finetune the model. By the time I get stronger GPUs I'll just target 7B or 14B parameters, etc...

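For context, a cross-entropy loss of 2.3468 (in nats) corresponds to a perplexity of roughly e^2.3468 ≈ 10.45:

```python
import math

loss = 2.3468  # average training loss reported above
perplexity = math.exp(loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 10.45
```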
BUUUT this is already enough and I'm planning to make more kinds of AI models in the future with custom architectures |
|
|
|
|
|
and I might make roleplaying AI models so stay tuned for that :3 |
|
|
|
|
|
|
|
|
and If you like :D |
|
|
support me with buy me a coffee right here :3 |
|
|
|
|
|
|
|
|
buymeacoffee.com/Moelanoby |