Hajime MATSUMOTO committed on
Commit ce66137 · 1 Parent(s): 1cc8a56

Update Dockerfile for 4xL40S multi-GPU training

Files changed (1)
  1. Dockerfile +4 -2
Dockerfile CHANGED
@@ -14,11 +14,13 @@ RUN pip install --no-cache-dir -r requirements.txt
 
 # Training script
 COPY train.py .
+COPY train_multi_gpu.py .
 
 # The HF token is passed in as an environment variable
 ENV HF_TOKEN=""
 ENV TRANSFORMERS_CACHE=/app/cache
 ENV HF_HOME=/app/cache
 
-# Run training
-CMD ["python", "train.py"]
+# Multi-GPU training (4xL40S)
+# For a single GPU, use: CMD ["python", "train.py"]
+CMD ["accelerate", "launch", "--num_processes", "4", "train_multi_gpu.py"]
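The new CMD hard-codes four processes and leaves the single-GPU command in a comment, while HF_TOKEN defaults to an empty string. As a minimal sketch only, and not part of this commit, the same choice could be made at container start by a small helper. The function names `build_launch_command` and `require_hf_token` and the `NUM_GPUS` variable are hypothetical; only the two commands and the `HF_TOKEN` variable come from the Dockerfile.

```python
import os
import shlex


def build_launch_command(num_gpus: int) -> list[str]:
    """Return the container command for the given GPU count (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single-GPU fallback named in the Dockerfile comment.
        return ["python", "train.py"]
    # One process per GPU, mirroring the CMD added in this commit.
    return ["accelerate", "launch", "--num_processes", str(num_gpus), "train_multi_gpu.py"]


def require_hf_token(env=None) -> str:
    """Return HF_TOKEN, raising if it is unset or still the empty default."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "")
    if not token:
        raise RuntimeError("HF_TOKEN is empty; pass it with `docker run -e HF_TOKEN=...`")
    return token


if __name__ == "__main__":
    # NUM_GPUS is an assumed variable; the Dockerfile does not define it.
    num_gpus = int(os.environ.get("NUM_GPUS", "4"))
    print(shlex.join(build_launch_command(num_gpus)))
```

With `NUM_GPUS` unset, the script prints the same command line as the CMD above. Whether such a wrapper is worth the extra file depends on how often the image runs on hardware other than the 4xL40S target.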