# SimToken Setup

This document describes how to rebuild the SimToken environment on a new machine and prepare for the subsequent A-min experiments.

---

## 1. Create Environment

First check the GPU and CUDA driver status:

```bash
nvidia-smi
```

Create the conda environment:

```bash
/opt/miniforge3/condabin/conda create -n simtoken python=3.10 -y
# `conda activate` only works through conda's shell hook, so source it first
source /opt/miniforge3/etc/profile.d/conda.sh
conda activate simtoken

python -m pip install --upgrade pip wheel "setuptools<81"

pip install \
  torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 \
  --index-url https://download.pytorch.org/whl/cu121

pip install \
  transformers==4.30.2 \
  peft==0.2.0 \
  accelerate==0.21.0 \
  sentencepiece \
  protobuf \
  safetensors \
  numpy==1.26.4 \
  pandas \
  matplotlib \
  opencv-python \
  pillow \
  tqdm \
  einops \
  timm \
  requests \
  towhee \
  huggingface_hub
```

Quick verification:

```bash
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "cpu")
PY
```

---

## 2. Check Workspace After Migration

After migrating the directory with the server platform's migration tool, confirm that the key files are present on the new machine:

```bash
cd /workspace/SimToken

ls -lh checkpoints/simtoken_pretrained.pth
ls -lh models/segment_anything/sam_vit_h_4b8939.pth
ls -d data/image_embed data/gt_mask data/audio_embed data/media
```

If only the archives survived the migration and the extracted directories are missing, extract them again:

```bash
cd /workspace/SimToken/data

tar -xf image_embed.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf media.tar
```

Remove caches that are not needed after the migration:

```bash
cd /workspace/SimToken

find . -name "__pycache__" -prune -exec rm -rf {} +
find . -name ".pytest_cache" -prune -exec rm -rf {} +
find . -name ".cache" -prune -exec rm -rf {} +
find . -name "*.pyc" -delete
```

---

## 3. Download from HuggingFace

If the new machine is initialized from HuggingFace rather than with the migration tool, log in first:

```bash
huggingface-cli login
```

Download the full repo:

```bash
mkdir -p /workspace/SimToken
cd /workspace/SimToken

huggingface-cli download yfan07/SimToken \
  --repo-type model \
  --local-dir . \
  --local-dir-use-symlinks False
```

After the download finishes, extract the data:

```bash
cd /workspace/SimToken/data

tar -xf image_embed.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf media.tar
```

---

## 4. Pre-download Model Weights

`transformers==4.30.2` may have network/API compatibility issues with newer versions of `huggingface_hub`. It is recommended to download the models into the local cache with the CLI first, then set `TRANSFORMERS_OFFLINE=1` when running experiments.

```bash
# Chat-UniVi-7B
huggingface-cli download Chat-UniVi/Chat-UniVi-7B-v1.5

# CLIP ViT-L
huggingface-cli download openai/clip-vit-large-patch14
```

After the download finishes, run an offline check. Note that `py_compile` only confirms that the scripts compile inside the environment; the cached weights are first loaded by the smoke test in Section 5.

```bash
cd /workspace/SimToken

TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run -n simtoken \
  python -m py_compile train.py load_model.py decoder_invariance_check.py
```

---

## 5. Smoke Test

First run a lightweight sanity check to confirm that the checkpoint, the data, and the offline model cache can all be read:

```bash
cd /workspace/SimToken

TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run -n simtoken \
  python decoder_invariance_check.py \
  --eval_split test_s \
  --max_eval_rows 1
```

If the model loads and the script prints per-frame diffs, start the full A-min training:

```bash
cd /workspace/SimToken
mkdir -p log checkpoints

TRANSFORMERS_OFFLINE=1 /opt/miniforge3/condabin/conda run -n simtoken \
  python -W ignore train.py \
  --name amin_full_e1 \
  --init_from_saved_model \
  --epochs 1 \
  --batch_size 2 \
  --lr 1e-4 \
  --saved_model /workspace/SimToken/checkpoints/simtoken_pretrained.pth \
  --log_root /workspace/SimToken/log \
  --checkpoint_root /workspace/SimToken/checkpoints
```

The startup log should contain:

```text
initialized training from saved model: /workspace/SimToken/checkpoints/simtoken_pretrained.pth
missing keys: ... | unexpected keys: ...
```

---

## 6. Upload to HuggingFace

After the experiments, if you need to upload to HuggingFace again, first pack the data directories into archives to reduce the file count:

```bash
cd /workspace/SimToken/data

tar -cf image_embed.tar image_embed/
tar -czf gt_mask.tar.gz gt_mask/
tar -czf audio_embed.tar.gz audio_embed/
tar -cf media.tar media/

ls -lh *.tar*

rm -rf image_embed/ gt_mask/ audio_embed/ media/
```

Clean up caches and upload:

```bash
cd /workspace/SimToken

find . -name "__pycache__" -prune -exec rm -rf {} +
find . -name ".pytest_cache" -prune -exec rm -rf {} +
find . -name ".cache" -prune -exec rm -rf {} +
find . -name "*.pyc" -delete

huggingface-cli login
python upload_hf.py --repo yfan07/SimToken
```
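
The archiving step in Section 6 deletes the unpacked data directories right after listing the archives, so a truncated or corrupt archive would go unnoticed until the next extraction. The sketch below is one possible guard, not part of the SimToken repo: `verify_archive` is a hypothetical helper that checks an archive is readable and holds as many entries as the directory it was created from. It assumes GNU `tar`, which detects gzip compression automatically when listing.

```bash
# Hypothetical helper (not part of the SimToken repo).
# Usage: verify_archive ARCHIVE DIR
# Succeeds only if ARCHIVE can be listed and has as many entries as DIR.
verify_archive() {
  local archive="$1" dir="$2"
  local n_archive n_disk

  # Listing fails on a truncated or corrupt archive.
  tar -tf "$archive" > /dev/null 2>&1 || return 1

  # tar lists directories as well as files, and so does find,
  # so the two counts should match for an archive made with `tar -cf X DIR/`.
  n_archive=$(tar -tf "$archive" | wc -l)
  n_disk=$(find "$dir" | wc -l)

  [ "$n_archive" -eq "$n_disk" ]
}
```

With this defined, each deletion can be made conditional, for example `verify_archive image_embed.tar image_embed && rm -rf image_embed/`. Comparing entry counts is a cheap check rather than a content comparison; `tar --compare` would be stricter but slower on large directories.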