Chat SFT

timestamp: 2025-11-03 09:34:55

  • run: fal-sft
  • source: mid
  • device_type:
  • dtype: bfloat16
  • device_batch_size: 4
  • num_epochs: 1
  • num_iterations: -1
  • target_examples_per_step: 32
  • unembedding_lr: 0.0040
  • embedding_lr: 0.2000
  • matrix_lr: 0.0200
  • weight_decay: 0.0000
  • init_lr_frac: 0.0200
  • eval_every: 100
  • eval_steps: 100
  • eval_metrics_every: 200
  • eval_metrics_max_problems: 1024
  • Training rows: 22,439
  • Number of iterations: 701
  • Training loss: 0.5668
  • Validation loss: 1.0105
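
The reported iteration count is consistent with the other values above: with `num_iterations: -1` the count appears to be derived from the dataset size, and 22,439 rows divided by 32 examples per step, rounded down, gives 701. The sketch below reproduces that arithmetic. The function name `sft_schedule` and the `world_size` parameter are hypothetical, introduced for illustration; the report does not state how many devices were used, so the gradient accumulation figure is shown only under an assumed device count.

```python
def sft_schedule(num_rows, target_examples_per_step, device_batch_size,
                 world_size=1, num_epochs=1):
    """Derive optimizer steps and gradient-accumulation steps.

    Assumption: one optimizer step consumes `target_examples_per_step`
    examples, and leftover rows that do not fill a whole step are dropped.
    `world_size` (number of devices) is NOT given in the report.
    """
    # Examples processed by all devices in one forward/backward pass.
    examples_per_micro_step = device_batch_size * world_size
    if target_examples_per_step % examples_per_micro_step != 0:
        raise ValueError(
            "target_examples_per_step must be a multiple of "
            "device_batch_size * world_size"
        )
    # Micro-steps accumulated before each optimizer update.
    grad_accum_steps = target_examples_per_step // examples_per_micro_step
    # Whole optimizer steps available per epoch, times the epoch count.
    num_iterations = (num_rows // target_examples_per_step) * num_epochs
    return num_iterations, grad_accum_steps


# Values from the report; world_size=1 is an assumed single-device setup.
num_iterations, grad_accum_steps = sft_schedule(
    num_rows=22_439,
    target_examples_per_step=32,
    device_batch_size=4,
    world_size=1,
)
print(num_iterations, grad_accum_steps)  # 701 8
```

Under this reading, the last 7 rows (22,439 minus 701 × 32) are not used in the single epoch. On an assumed 8-device setup the same settings would need no accumulation, since 4 × 8 already equals 32.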