🚀 Qwen3-50M C4 Pretrained (FP16)

A Qwen3-50M model pretrained from scratch on the C4 dataset using FP16 precision.

📊 Training Results

  • Final Training Loss: 6.5365
  • Final Validation Loss: 6.9545
  • Training Samples: 1,000
  • Epochs: 3
  • Precision: FP16
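
To put the loss values above in more familiar terms, they can be converted to perplexity. This is a minimal sketch that assumes the reported losses are mean per-token cross-entropy in nats (the usual convention for PyTorch and transformers training loops, but not stated in this card). The `perplexity` helper is illustrative.

```python
import math

# Loss values as reported in the training results above.
train_loss = 6.5365
val_loss = 6.9544572830200195


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


train_ppl = perplexity(train_loss)
val_ppl = perplexity(val_loss)

print(f"train perplexity:      {train_ppl:.1f}")
print(f"validation perplexity: {val_ppl:.1f}")
```

Under that assumption the perplexities come out at roughly 690 (training) and 1,050 (validation), which is consistent with a short run on 1,000 samples; generated text should be expected to be rough.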

🚀 Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-50m-c4-final")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-50m-c4-final",
    torch_dtype=torch.float16,
    device_map="auto",  # requires the `accelerate` package
)

# Generate text
prompt = "The future of artificial intelligence is"
# Move the inputs to the model's device so generation also works on GPU
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
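
Because the snippet above samples (`do_sample=True`), its output changes from run to run. The following is one possible way to make sampling repeatable and its settings explicit. The helper names (`build_generation_kwargs`, `generate`) and the default values for `temperature`, `top_p`, and `max_new_tokens` are illustrative assumptions, not settings recommended by the model author.

```python
MODEL_ID = "Mostafa8Mehrabi/qwen3-50m-c4-final"


def build_generation_kwargs(max_new_tokens=50, temperature=0.8, top_p=0.95):
    """Collect sampling settings for model.generate (default values are illustrative)."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    return {
        "max_new_tokens": max_new_tokens,  # counts only newly generated tokens
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, seed=0, **sampling):
    """Load the model and sample one continuation with a fixed seed."""
    # Heavy imports are kept local so the settings helper can be used on its own.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    set_seed(seed)  # seeds Python, NumPy, and PyTorch random number generators
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    # FP16 on GPU; fall back to FP32 on CPU, where half precision can be slow or unsupported.
    dtype = torch.float16 if use_cuda else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=dtype).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(**inputs, **build_generation_kwargs(**sampling))
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

A call such as `generate("The future of artificial intelligence is", seed=42, max_new_tokens=30)` downloads the model on first use; repeating it with the same seed on the same hardware and library versions should give the same text.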

πŸ“ Checkpoints

Training checkpoints (also in FP16) are available at: Mostafa8Mehrabi/qwen3-50m-c4-checkpoints
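
The layout of the checkpoints repository is not described here, so a reasonable first step is to list its files and group them by top-level folder. This is a sketch using `list_repo_files` from the `huggingface_hub` package; the helper names are hypothetical, and nothing about the actual folder names is assumed.

```python
from collections import defaultdict

CHECKPOINT_REPO = "Mostafa8Mehrabi/qwen3-50m-c4-checkpoints"


def group_by_top_level(paths):
    """Group repository file paths by their first path component.

    Files at the repository root are collected under the key ".".
    """
    groups = defaultdict(list)
    for path in paths:
        head, sep, tail = path.partition("/")
        if sep:
            groups[head].append(tail)
        else:
            groups["."].append(head)
    return dict(groups)


def list_checkpoint_files(repo_id=CHECKPOINT_REPO):
    """Fetch the file listing from the Hub (needs network access) and group it."""
    from huggingface_hub import list_repo_files

    return group_by_top_level(list_repo_files(repo_id))
```

Calling `list_checkpoint_files()` returns a dictionary mapping each top-level folder to its files. If the checkpoints turn out to live in subfolders, a folder name can be passed to `from_pretrained` through its `subfolder` argument.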

Model size: 71.6M params (Safetensors, tensor type F16)
