Qwen3-50M C4 Pretrained (FP16)
A Qwen3-50M model pretrained from scratch on the C4 dataset using FP16 precision.
Training Results
- Final Training Loss: 6.5365
- Final Validation Loss: 6.9545
- Training Samples: 1,000
- Epochs: 3
- Precision: FP16
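The reported losses can be easier to interpret as perplexities. The following is a minimal sketch, assuming the losses are mean per-token cross-entropy values in nats (the usual convention for causal language models, but not stated in this card). The helper name `perplexity` is hypothetical and used only for illustration.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# Loss values taken from the Training Results list
# (validation loss rounded to four decimals).
train_ppl = perplexity(6.5365)
val_ppl = perplexity(6.9545)

print(f"train perplexity: {train_ppl:.0f}")
print(f"validation perplexity: {val_ppl:.0f}")
```

Under that assumption, the training and validation perplexities come out to roughly 690 and 1,048 respectively.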
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and the FP16 model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-50m-c4-final")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-50m-c4-final",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate text
prompt = "The future of artificial intelligence is"
# Move the inputs to the model's device, since device_map="auto" may place the model on a GPU
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Checkpoints
Training checkpoints (also in FP16) are available at: Mostafa8Mehrabi/qwen3-50m-c4-checkpoints