# SimToken Setup, Data, Upload, and Download Guide

This guide is for moving the SimToken workspace between rented servers.

Assumed paths:

```bash
PROJECT_ROOT=/workspace/SimToken
SAM2_ROOT=/workspace/sam2
HF_REPO=yfan07/SimToken
```

## 1. Environment Setup

```bash
conda create -n simtoken python=3.10 -y
conda activate simtoken

conda install -c conda-forge ffmpeg libsndfile git git-lfs wget -y
git lfs install

pip install --upgrade pip setuptools wheel
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
```

If CUDA 12.6 wheels are unavailable, use CUDA 12.1 wheels:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
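Either way, it is worth confirming that the installed wheel actually sees the GPU before continuing. A minimal check, assuming the `simtoken` env is active (prints a notice instead of failing if torch is not installed yet):

```bash
# Print the installed torch version, its CUDA build, and GPU visibility.
python3 - <<'EOF'
try:
    import torch
    print("torch", torch.__version__,
          "| cuda build:", torch.version.cuda,
          "| cuda available:", torch.cuda.is_available())
except ImportError:
    print("torch not installed")
EOF
```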

Install SimToken dependencies:

```bash
pip install \
  numpy pandas matplotlib opencv-python pillow tqdm einops timm sentencepiece \
  transformers==4.30.2 peft==0.2.0 accelerate safetensors huggingface-hub \
  packaging regex requests psutil gdown
```

Optional, only needed if regenerating audio features:

```bash
pip install towhee towhee.models
```

## 2. Repository Download

```bash
cd /workspace
huggingface-cli login

huggingface-cli download yfan07/SimToken \
  --repo-type model \
  --local-dir /workspace/SimToken \
  --local-dir-use-symlinks False
```

## 3. Model Preparation

### Hugging Face Models

```bash
mkdir -p /workspace/hf_models

huggingface-cli download openai/clip-vit-large-patch14 \
  --local-dir /workspace/hf_models/clip-vit-large-patch14 \
  --local-dir-use-symlinks False

huggingface-cli download Chat-UniVi/Chat-UniVi-7B-v1.5 \
  --local-dir /workspace/hf_models/Chat-UniVi-7B-v1.5 \
  --local-dir-use-symlinks False
```

### SAM2 for TubeToken Proposals

Put SAM2 under `/workspace/sam2`:

```bash
cd /workspace
git clone https://github.com/facebookresearch/sam2.git
cd /workspace/sam2

pip install -e .
```

Download SAM2.1 checkpoints:

```bash
cd /workspace/sam2/checkpoints
bash download_ckpts.sh
```

The TubeToken Phase 0 commands use:

```text
/workspace/sam2/checkpoints/sam2.1_hiera_large.pt
/workspace/sam2/sam2/configs/sam2.1/sam2.1_hiera_l.yaml
```

## 4. Dataset Preparation

Runtime layout:

```text
/workspace/SimToken/data
  metadata.csv
  media/
  gt_mask/
  audio_embed/
  image_embed/
```
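A quick sanity check for this layout can be sketched as follows. It is demonstrated here against a throwaway directory so it can be rehearsed safely; point `DATA_ROOT` at `/workspace/SimToken/data` on a real server:

```bash
# Verify that metadata.csv and the four data directories are present.
DATA_ROOT=$(mktemp -d)   # stand-in; use /workspace/SimToken/data for real
mkdir -p "$DATA_ROOT/media" "$DATA_ROOT/gt_mask" \
         "$DATA_ROOT/audio_embed" "$DATA_ROOT/image_embed"
touch "$DATA_ROOT/metadata.csv"

ok=1
for d in media gt_mask audio_embed image_embed; do
  [ -d "$DATA_ROOT/$d" ] || { echo "missing directory: $d"; ok=0; }
done
[ -f "$DATA_ROOT/metadata.csv" ] || { echo "missing file: metadata.csv"; ok=0; }
[ "$ok" -eq 1 ] && echo "layout OK"
```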

Package the four data directories:

```bash
cd /workspace/SimToken/data

tar -cf media.tar media
tar -czf gt_mask.tar.gz gt_mask
tar -czf audio_embed.tar.gz audio_embed
tar -cf image_embed.tar image_embed
```

Restore the four data directories:

```bash
cd /workspace/SimToken/data

tar -xf media.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf image_embed.tar
```
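The pack/restore cycle is easy to rehearse on throwaway data before trusting it with the real dataset. A minimal round-trip sketch (directory and file names below are hypothetical):

```bash
# Pack a directory, delete it, then restore it from the archive.
tmp=$(mktemp -d)
mkdir -p "$tmp/audio_embed"
printf 'dummy' > "$tmp/audio_embed/sample_0001.pt"

cd "$tmp"
tar -czf audio_embed.tar.gz audio_embed   # pack (same flags as above)
rm -rf audio_embed                        # simulate a fresh server
tar -xzf audio_embed.tar.gz               # restore

[ -f audio_embed/sample_0001.pt ] && echo "round-trip OK"
```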

## 5. Upload Repository

The remote repo stores the four large data directories as tar archives (`media.tar`, `image_embed.tar`, etc.), while the local workspace has them extracted as plain directories.
**Do not re-upload the extracted directories**: skip them with `--exclude`, otherwise every extracted file would be treated as a new upload.

### 5a. Pack any new data directories before uploading

If `data/text_embed/` is new (first upload after running `precompute_text_feats.py`):

```bash
cd /workspace/SimToken/data
tar -cf text_embed.tar text_embed
```

### 5b. Login

```bash
cd /workspace/SimToken
huggingface-cli login
```

### 5c. Upload (excluding extracted data directories)

Use the new `hf upload` command (not the deprecated `huggingface-cli upload`).
The deprecated command hashes all files before applying any filter, which is extremely slow with large data directories.
`hf upload` with `--exclude` skips the specified files before hashing.

```bash
hf upload yfan07/SimToken . . \
  --repo-type model \
  --exclude "data/media/**" "data/gt_mask/**" "data/audio_embed/**" "data/image_embed/**" "data/text_embed/**" \
  2>&1 | tee upload.log
```

This uploads everything except the four extracted dataset directories and the raw `text_embed/` folder.
The `data/text_embed.tar` file (sitting directly under `data/`) is **not** matched by `data/text_embed/**` and will be uploaded normally.
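huggingface_hub filters these patterns with Python-style `fnmatch` globs (an assumption worth double-checking against your installed version). The archive-vs-directory distinction can be verified directly:

```bash
# Show that "data/text_embed/**" matches files inside the directory
# but not the sibling archive data/text_embed.tar.
# (feat_0001.pt is a hypothetical file name.)
python3 - <<'EOF'
from fnmatch import fnmatch
pattern = "data/text_embed/**"
for path in ("data/text_embed.tar", "data/text_embed/feat_0001.pt"):
    print(f"{path}: {fnmatch(path, pattern)}")
EOF
```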

### 5d. Restore on a new server

After downloading the repo (Section 2), extract all packed data:

```bash
cd /workspace/SimToken/data
tar -xf media.tar
tar -xzf gt_mask.tar.gz
tar -xzf audio_embed.tar.gz
tar -xf image_embed.tar
tar -xf text_embed.tar      # if present
```

## 6. Current Experiment Files to Preserve

Keep these files and directories for continuing TubeToken experiments:

```text
runs/tubetoken_phase_minus1/audit_full
runs/tubetoken_phase_minus1/simtoken_eval
runs/tubetoken_phase0/proposals_stride8_n64_bidir
runs/tubetoken_phase0/eval_stride8_n64_bidir
runs/tubetoken_phase0/miss_videos_r64.txt
TubeToken_Phase0_Experiment_Log.md
TubeToken_Experiment_Plan_v4_Final.md
```