# SimToken Setup

This document describes how to rebuild the SimToken environment on a new machine and prepare for the follow-up A-min experiments.

---

## 1. Create Environment

First check the GPU and CUDA driver status:

```bash
nvidia-smi
```

Create the conda environment:

```bash
/opt/miniforge3/condabin/conda create -n simtoken python=3.10 -y

# `conda activate` only works once conda's shell hook is loaded,
# so source it first instead of calling the conda binary directly.
source /opt/miniforge3/etc/profile.d/conda.sh
conda activate simtoken

python -m pip install --upgrade pip wheel "setuptools<81"

# PyTorch wheels built against CUDA 12.1
pip install \
  torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 \
  --index-url https://download.pytorch.org/whl/cu121

pip install \
  transformers==4.30.2 \
  peft==0.2.0 \
  accelerate==0.21.0 \
  sentencepiece \
  protobuf \
  safetensors \
  numpy==1.26.4 \
  pandas \
  matplotlib \
  opencv-python \
  pillow \
  tqdm \
  einops \
  timm \
  requests \
  towhee \
  huggingface_hub
```

Quick verification:

```bash
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "cpu")
PY
```

---

## 2. Check Workspace After Migration

After migrating the directory with the server platform's migration tool, confirm that the key files are present on the new machine:

```bash
cd /workspace/SimToken

ls -lh checkpoints/simtoken_pretrained.pth
ls -lh models/segment_anything/sam_vit_h_4b8939.pth
ls -d data/image_embed data/gt_mask data/audio_embed data/media
```
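
The `ls` checks above stop at a glance-level review. As a minimal sketch, a small helper can report every missing path in one pass and return a non-zero status for use in scripts. The name `check_paths` is hypothetical and not part of the SimToken repo.

```bash
# check_paths ROOT PATH...: print each PATH that is missing under ROOT.
# Returns 0 when every path exists, 1 otherwise.
check_paths() {
    local root="$1"; shift
    local missing=0
    local rel
    for rel in "$@"; do
        if [ ! -e "$root/$rel" ]; then
            echo "MISSING: $rel"
            missing=1
        fi
    done
    return "$missing"
}
```

A call matching the files listed above would be `check_paths /workspace/SimToken checkpoints/simtoken_pretrained.pth models/segment_anything/sam_vit_h_4b8939.pth data/image_embed data/gt_mask data/audio_embed data/media`.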

If only the archives were migrated and the extracted directories are missing, extract them again:

```bash
cd /workspace/SimToken/data

tar -xf image_embed.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf media.tar
```
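
Re-running the extraction on a machine where some directories already exist repeats work on large archives. One possible approach, sketched here under the assumption that each archive unpacks into a directory of the same base name, is to extract only when the target directory is absent. `extract_if_missing` is a hypothetical helper name.

```bash
# extract_if_missing ARCHIVE DIR: unpack ARCHIVE into the current directory
# only when DIR does not exist yet. When reading from a file, `tar -xf`
# detects gzip compression on its own (GNU tar and bsdtar), so the same
# call handles both .tar and .tar.gz.
extract_if_missing() {
    local archive="$1" dir="$2"
    if [ -d "$dir" ]; then
        echo "skip: $dir already exists"
    elif [ -f "$archive" ]; then
        echo "extract: $archive"
        tar -xf "$archive"
    else
        echo "missing: neither $dir nor $archive found" >&2
        return 1
    fi
}
```

From `/workspace/SimToken/data`, this would be called once per archive, for example `extract_if_missing gt_mask.tar.gz gt_mask`.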

Remove caches that are not needed after migration:

```bash
cd /workspace/SimToken
find . -name "__pycache__" -prune -exec rm -rf {} +
find . -name ".pytest_cache" -prune -exec rm -rf {} +
find . -name ".cache" -prune -exec rm -rf {} +
find . -name "*.pyc" -delete
```

---

## 3. Download from HuggingFace

If the new machine is initialized from HuggingFace instead of the migration tool, log in first:

```bash
huggingface-cli login
```

Download the full repo:

```bash
mkdir -p /workspace/SimToken
cd /workspace/SimToken

huggingface-cli download yfan07/SimToken \
  --repo-type model \
  --local-dir . \
  --local-dir-use-symlinks False
```

After the download finishes, extract the data:

```bash
cd /workspace/SimToken/data

tar -xf image_embed.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf media.tar
```

---

## 4. Pre-download Model Weights

`transformers==4.30.2` may have network/API compatibility issues with newer versions of `huggingface_hub`. It is recommended to download the models into the local cache with the CLI first, then set `TRANSFORMERS_OFFLINE=1` when running experiments.

```bash
# Chat-UniVi-7B
huggingface-cli download Chat-UniVi/Chat-UniVi-7B-v1.5

# CLIP ViT-L
huggingface-cli download openai/clip-vit-large-patch14
```
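
Before switching to offline mode, it can help to confirm that both models actually landed in the cache. The sketch below assumes the default `huggingface_hub` cache layout (`models--<org>--<name>/snapshots` under `~/.cache/huggingface/hub`, or under `$HF_HOME/hub` when `HF_HOME` is set). It only checks that the snapshot directory exists, not that every file in it is complete. `hf_cached` is a hypothetical helper name.

```bash
# hf_cached REPO_ID [CACHE_DIR]: succeed if REPO_ID has a snapshots directory
# in the hub cache. CACHE_DIR defaults to the assumed standard location.
hf_cached() {
    local repo_id="$1"
    local cache_dir="${2:-${HF_HOME:-$HOME/.cache/huggingface}/hub}"
    local folder
    # The cache stores "org/name" as "models--org--name".
    folder="models--$(printf '%s' "$repo_id" | sed 's|/|--|g')"
    [ -d "$cache_dir/$folder/snapshots" ]
}
```

For example, `hf_cached Chat-UniVi/Chat-UniVi-7B-v1.5 || echo "not cached"` and the same for `openai/clip-vit-large-patch14`.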
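
After the download finishes, run a quick check with offline mode enabled. Note that `py_compile` only verifies that the scripts compile; loading the models from the offline cache is exercised by the smoke test in section 5.

```bash
cd /workspace/SimToken

TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run -n simtoken \
  python -m py_compile train.py load_model.py decoder_invariance_check.py
```

---
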
## 5. Smoke Test

First run a lightweight sanity check to confirm that the checkpoint, the data, and the offline model cache can all be read:

```bash
cd /workspace/SimToken

TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run -n simtoken \
  python decoder_invariance_check.py \
  --eval_split test_s \
  --max_eval_rows 1
```

If the model loads and the per-frame diff is printed, the full A-min training can be started:

```bash
cd /workspace/SimToken
mkdir -p log checkpoints

# --no-capture-output streams logs live; by default `conda run` captures
# stdout/stderr and prints them only after the process exits.
TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run --no-capture-output -n simtoken \
  python -W ignore train.py \
  --name amin_full_e1 \
  --init_from_saved_model \
  --epochs 1 \
  --batch_size 2 \
  --lr 1e-4 \
  --saved_model /workspace/SimToken/checkpoints/simtoken_pretrained.pth \
  --log_root /workspace/SimToken/log \
  --checkpoint_root /workspace/SimToken/checkpoints
```

The startup log should contain:

```text
initialized training from saved model: /workspace/SimToken/checkpoints/simtoken_pretrained.pth
missing keys: ... | unexpected keys: ...
```

---

## 6. Upload to HuggingFace

After the experiments finish, if you need to re-upload to HuggingFace, first pack the data directories into archives to reduce the file count:

```bash
cd /workspace/SimToken/data

tar -cf image_embed.tar image_embed/
tar -czf gt_mask.tar.gz gt_mask/
tar -czf audio_embed.tar.gz audio_embed/
tar -cf media.tar media/

# Review the archive sizes before removing the extracted directories.
ls -lh *.tar*
rm -rf image_embed/ gt_mask/ audio_embed/ media/
```
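
Because the `rm -rf` step above is irreversible, it may be worth confirming that each archive is readable before running it. The following is a minimal sketch: it checks that `tar` can list the whole archive and that the listing is non-empty, which catches truncated or empty files but does not compare contents against the source directory. `archive_ok` is a hypothetical helper name.

```bash
# archive_ok ARCHIVE: succeed if tar lists the archive without error
# and the listing contains at least one entry.
archive_ok() {
    local archive="$1" listing
    # The assignment's exit status is tar's, so a read error fails here.
    listing=$(tar -tf "$archive" 2>/dev/null) || return 1
    [ -n "$listing" ]
}
```

One way to use it is `for a in image_embed.tar gt_mask.tar.gz audio_embed.tar.gz media.tar; do archive_ok "$a" || echo "bad archive: $a"; done`, deleting the directories only if nothing is reported.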

Clean the caches and upload:

```bash
cd /workspace/SimToken

find . -name "__pycache__" -prune -exec rm -rf {} +
find . -name ".pytest_cache" -prune -exec rm -rf {} +
find . -name ".cache" -prune -exec rm -rf {} +
find . -name "*.pyc" -delete

huggingface-cli login
python upload_hf.py --repo yfan07/SimToken
```