GPT-2 Large trained on prefix dataset (682K)
[INFO] Launching instance for GPT-2 Large training...
[INFO] Instance type: g5.2xlarge (48GB VRAM)
[INFO] Finding Deep Learning AMI...
[INFO] Using AMI: ami-0e4ef96c62e7cc2fe
[INFO] Using key pair: chave-gpu-nova
[INFO] Using security group: sg-0deaa73e23482e3f6
[INFO] Launching instance...
[INFO] Instance launched: i-04dc6f51534d8185d
[INFO] Waiting for instance to start...
==========================================
GPT-2 Large Training Instance Ready!
==========================================
Instance ID: i-04dc6f51534d8185d
Instance Type: g5.2xlarge (48GB VRAM)
Public IP: 52.55.119.255
Monitor training:
ssh -i ~/.ssh/chave-gpu-nova.pem ubuntu@52.55.119.255
tail -f /home/ubuntu/training_large.log
Check when complete:
ssh ubuntu@52.55.119.255 'while [ ! -f ~/.training_complete ]; do sleep 60; echo "Training in progress..."; done; cat ~/training_results.txt'
Estimated time: ~4-5 hours for 3 epochs
Cost: ~-10 USD (/hour for g5.2xlarge)
[INFO] Instance info saved to: /c/Users/madeinweb/.seriguela/large_instance_info.txt
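
The "Check when complete" command above polls for ~/.training_complete indefinitely. A minimal sketch of the same polling loop with a timeout follows, so the check gives up if training never finishes. The function name wait_for_file and its default values are hypothetical and are not part of the launch script.

```shell
# wait_for_file: poll until a marker file exists or a timeout elapses.
# Hypothetical helper; arguments are:
#   $1  path to the marker file
#   $2  timeout in seconds (default 21600, i.e. 6 hours, an assumed margin)
#   $3  poll interval in seconds (default 60, as in the logged command)
wait_for_file() {
    marker="$1"
    timeout="${2:-21600}"
    interval="${3:-60}"
    waited=0
    while [ ! -f "$marker" ]; do
        # Give up once the accumulated wait reaches the timeout.
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${waited}s waiting for $marker" >&2
            return 1
        fi
        echo "Training in progress..."
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

Run on the instance, it could be used as: wait_for_file ~/.training_complete 21600 60 && cat ~/training_results.txt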