Tsedee committed on
Commit da6917d · verified · 1 Parent(s): 3da699a

Upload setup_a40.sh with huggingface_hub

Files changed (1)
  1. setup_a40.sh +75 -0
setup_a40.sh ADDED
@@ -0,0 +1,75 @@
+ #!/bin/bash
+ # ══════════════════════════════════════════════════════════
+ # MonSub v3 - A40 Setup Script
+ # Runs on a RunPod A40 48GB pod
+ # ══════════════════════════════════════════════════════════
+
+ set -e
+
+ # ── HF Token ──
+ export HF_TOKEN="${HF_TOKEN}"  # set in the RunPod environment
+ export HUGGINGFACE_HUB_TOKEN="$HF_TOKEN"
+
+ # ── IMPORTANT: redirect caches to the workspace volume ──
+ # The container disk (50GB) fills up → use the workspace volume instead
+ export HF_HOME=/workspace/.cache
+ export TMPDIR=/workspace/tmp
+ mkdir -p /workspace/.cache /workspace/tmp
+
+ echo "=============================================="
+ echo "MonSub v3 - A40 Setup"
+ echo "HF_HOME=$HF_HOME"
+ echo "TMPDIR=$TMPDIR"
+ echo "=============================================="
+
+ # ── Dependencies ──
+ echo ""
+ echo "=== Installing dependencies ==="
+ pip install -q \
+     "transformers>=4.46.0" \
+     "datasets==2.21.0" \
+     accelerate \
+     evaluate \
+     jiwer \
+     soundfile \
+     librosa
+
+ # datasets==2.21.0 is REQUIRED (the latest release → torchcodec ImportError)
+
+ # ── GPU check ──
+ echo ""
+ echo "=== GPU Info ==="
+ python -c "
+ import torch
+ if torch.cuda.is_available():
+     name = torch.cuda.get_device_name(0)
+     vram = torch.cuda.get_device_properties(0).total_memory / 1e9
+     print(f'GPU: {name}')
+     print(f'VRAM: {vram:.1f}GB')
+ else:
+     print('WARNING: No GPU!')
+ "
+
+ # ── Download training script ──
+ echo ""
+ echo "=== Downloading training script ==="
+ cd /workspace
+
+ # Fetch from Hugging Face (or paste the script manually)
+ python -c "
+ import os, shutil  # os is needed for os.environ below
+ from huggingface_hub import hf_hub_download
+ try:
+     path = hf_hub_download('Tsedee/monsub-training-scripts', 'run_finetune_v3.py', token=os.environ['HF_TOKEN'])
+     shutil.copy(path, '/workspace/run_finetune_v3.py')
+     print('Downloaded from HF')
+ except Exception as e:
+     print(f'HF download failed ({e}) - paste the script manually')
+ "
+
+ # ── Start training ──
+ echo ""
+ echo "=== Starting v3 training ==="
+ echo "Log: /workspace/train_v3.log"
+ echo ""
+ python /workspace/run_finetune_v3.py 2>&1 | tee /workspace/train_v3.log
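
Review note: the script relies on `set -e`, but two failure modes can still slip through. An empty `HF_TOKEN` is exported without complaint, and the final `python ... | tee` pipeline reports the exit status of `tee`, not of the training run. The following is a minimal sketch of preflight helpers that could guard against both, assuming bash. The function names `require_env`, `require_file`, and `run_logged` are hypothetical and are not part of setup_a40.sh.

```shell
#!/bin/bash
# Hypothetical helpers (illustrative names, not part of setup_a40.sh).

# Fail with a clear message when a required environment variable is unset or empty.
require_env() {
    local name="$1"
    if [ -z "${!name:-}" ]; then
        echo "ERROR: $name is not set" >&2
        return 1
    fi
}

# Fail when a required file is missing, e.g. the training script after a failed download.
require_file() {
    local path="$1"
    if [ ! -f "$path" ]; then
        echo "ERROR: $path not found" >&2
        return 1
    fi
}

# Run a command, mirror stdout and stderr into a log file, and return the
# command's own exit status instead of tee's.
run_logged() {
    local log="$1"
    shift
    "$@" 2>&1 | tee "$log"
    return "${PIPESTATUS[0]}"
}
```

In a script like this one, the calls might look like `require_env HF_TOKEN` near the top, `require_file /workspace/run_finetune_v3.py` after the download step, and `run_logged /workspace/train_v3.log python /workspace/run_finetune_v3.py` as the last line. Adding `set -o pipefail` next to `set -e` would be an alternative way to surface a failed training run through the existing `tee` pipeline.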