Seed-0.4B

  • A 0.4B-parameter decoder-only dense model trained from scratch.
  • Because it is not instruction fine-tuned, the model performs document completion rather than conversational generation.
  • Released primarily for educational, research, and experimental purposes.

GitHub: merterbak/llm-from-scratch

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; trust_remote_code is required for the
# custom model code shipped with the repository.
tokenizer = AutoTokenizer.from_pretrained("merterbak/Seed-0.4B")
model = AutoModelForCausalLM.from_pretrained(
    "merterbak/Seed-0.4B",
    trust_remote_code=True,
    dtype="auto",
)

# No chat template: the model simply continues the prompt as plain text.
prompt = "Climate change can affect"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=1.0,
    top_k=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
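The `do_sample=True`, `temperature`, and `top_k` arguments above select top-k sampling at decode time: logits are scaled by the temperature, all but the k highest are discarded, and the next token is drawn from the renormalized remainder. A minimal pure-Python sketch of that procedure (the `sample_top_k` helper and the toy 5-token logit vector are illustrative, not part of this model's code):

```python
import math
import random

def sample_top_k(logits, k=50, temperature=1.0):
    # Scale logits by temperature, keep only the k largest,
    # softmax over the survivors, then sample one index.
    scaled = [(l / temperature, i) for i, l in enumerate(logits)]
    top = sorted(scaled, reverse=True)[:k]
    exps = [math.exp(v) for v, _ in top]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cum = 0.0
    for p, (_, idx) in zip(probs, top):
        cum += p
        if r < cum:
            return idx
    return top[-1][1]  # numerical-edge fallback

# Toy vocabulary of 5 "tokens"; with k=2 only the two highest
# logits (indices 0 and 1) can ever be sampled.
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
token = sample_top_k(logits, k=2)
```

Lowering the temperature sharpens the distribution toward the argmax; raising `top_k` widens the candidate pool.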
Model size: 0.4B params · Format: Safetensors · Tensor type: F32

Dataset used to train merterbak/Seed-0.4B