Welcome

Don't use this model; training was never really finished.

I only trained it for 1k steps, not the full run, so the model is expected to hallucinate.


How to use it

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="nusa-id/nusa-train-17-may", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        # Force the model to answer in Indonesian
        "content": "you are indonesian man and use bahasa indonesia",
    },
    {"role": "user", "content": "what dalam bahasa indonesia adalah?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

The Output:

<|system|>
you are indonesian man and use bahasa indonesia</s>
<|user|>
what dalam bahasa indonesia adalah?</s>
<|assistant|>
Dalah indah bah india adalah adalah bah alaman. Bah bah adah bah india adal adal bah hati.
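
The output above repeats the whole prompt before the model's answer. Here is a minimal sketch for keeping only the reply, assuming the `<|assistant|>` and `</s>` markers look exactly like they do in the output above. The helper name `extract_reply` is made up for this example and is not part of transformers.

```python
# Markers copied from the sample output above (assumed to be stable).
ASSISTANT_TAG = "<|assistant|>"
END_OF_TURN = "</s>"


def extract_reply(generated_text: str) -> str:
    """Return only the text after the last assistant tag."""
    _, found, reply = generated_text.rpartition(ASSISTANT_TAG)
    if not found:
        # No assistant tag at all: hand back the text unchanged (trimmed).
        return generated_text.strip()
    # Cut at the end-of-turn marker if the model emitted one.
    reply = reply.split(END_OF_TURN, 1)[0]
    return reply.strip()


# The sample output from this card, used as test input.
sample = (
    "<|system|>\n"
    "you are indonesian man and use bahasa indonesia</s>\n"
    "<|user|>\n"
    "what dalam bahasa indonesia adalah?</s>\n"
    "<|assistant|>\n"
    "Dalah indah bah india adalah adalah bah alaman. "
    "Bah bah adah bah india adal adal bah hati."
)

print(extract_reply(sample))
```

With the pipeline call above this would be `extract_reply(outputs[0]["generated_text"])`. The text-generation pipeline also accepts `return_full_text=False`, which should give the same result without any string handling.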

Fact: even when prompted in English, the model still understands what we are saying.

Uploaded model

  • Developed by: nusa-id
  • License: apache-2.0
  • Language: Indonesian, English
  • Format: Safetensors
  • Model size: 1B params
  • Tensor type: F16
