---
license: mit
language:
  - en
library_name: pytorch
tags:
  - rigging
  - skinning
  - skeleton
  - autoregressive
  - fsq
  - vae
  - 3d
  - animation
  - VAST
  - Tripo
---

# SkinTokens

Pretrained checkpoints for **SkinTokens: A Learned Compact Representation for Unified Autoregressive Rigging**.

[![Project Page](https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white)](https://zjp-shadow.github.io/works/SkinTokens/)
[![arXiv](https://img.shields.io/badge/arXiv-2602.04805-b31b1b.svg)](https://arxiv.org/abs/2602.04805)
[![GitHub](https://img.shields.io/badge/GitHub-Code-black?logo=github)](https://github.com/VAST-AI-Research/SkinTokens)
[![Tripo](https://img.shields.io/badge/Tripo-3D_Studio-ff7a00)](https://www.tripo3d.ai)

This repository stores the model checkpoints used by the [SkinTokens codebase](https://github.com/VAST-AI-Research/SkinTokens), including:

- the **FSQ-CVAE** that learns the *SkinTokens* discrete representation of skinning weights, and
- the **TokenRig** autoregressive Transformer (Qwen3-0.6B architecture, GRPO-refined) that jointly generates skeletons and SkinTokens from a 3D mesh.

SkinTokens is the successor to [UniRig](https://github.com/VAST-AI-Research/UniRig) (SIGGRAPH '25). While UniRig treats skeleton and skinning as decoupled stages, SkinTokens unifies both into a single autoregressive sequence via learned discrete skin tokens, yielding **98%–133%** improvement in skinning accuracy and **17%–22%** improvement in bone prediction over state-of-the-art baselines.

## What Is Included

The repository is organized exactly like the `experiments/` folder expected by the main SkinTokens codebase:

```text
experiments/
├── articulation_xl_quantization_256_token_4/
│   └── grpo_1400.ckpt        # TokenRig autoregressive rigging model (GRPO-refined)
└── skin_vae_2_10_32768/
    └── last.ckpt             # FSQ-CVAE for SkinTokens (skin-weight tokenizer)
```

Total size: about **1.6 GB**.

> The training data (`ArticulationXL` splits and processed meshes) used to train these checkpoints will be released separately in a future update.

## Checkpoint Overview

### SkinTokens: FSQ-CVAE (skin-weight tokenizer)

**File:** `experiments/skin_vae_2_10_32768/last.ckpt`

Compresses sparse skinning weights into discrete *SkinTokens* using a Finite Scalar Quantized Conditional VAE with codebook levels `[8, 8, 8, 5, 5, 5]` (64,000 entries). Used both to tokenize ground-truth weights during training and to decode TokenRig's output tokens back into per-vertex skinning at inference.
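To make the codebook size concrete, here is a minimal sketch of how a list of FSQ levels maps to a codebook size and a flat token index. It follows the general FSQ idea (each latent dimension takes one of a fixed number of integer levels, and the tuple is flattened with mixed-radix indexing) and is not taken from the SkinTokens code. The function names and the digit ordering are assumptions made for illustration.

```python
from math import prod

# FSQ levels quoted above for the SkinTokens FSQ-CVAE.
LEVELS = [8, 8, 8, 5, 5, 5]


def codebook_size(levels):
    """Number of distinct codes: the product of the per-dimension levels."""
    return prod(levels)


def codes_to_index(codes, levels):
    """Flatten per-dimension integer codes (0 <= c < level) into one index.

    Mixed-radix encoding with the first dimension as the least significant
    digit. The ordering is an assumption; the checkpoint may use another one.
    """
    index, base = 0, 1
    for code, level in zip(codes, levels):
        if not 0 <= code < level:
            raise ValueError(f"code {code} out of range for level {level}")
        index += code * base
        base *= level
    return index


def index_to_codes(index, levels):
    """Inverse of codes_to_index: recover the per-dimension codes."""
    codes = []
    for level in levels:
        codes.append(index % level)
        index //= level
    return codes


if __name__ == "__main__":
    print(codebook_size(LEVELS))                       # 8*8*8*5*5*5 = 64000
    print(codes_to_index([7, 7, 7, 4, 4, 4], LEVELS))  # largest index: 63999
```

Because every index in `range(codebook_size(LEVELS))` corresponds to exactly one tuple of codes, FSQ needs no learned codebook table; the levels alone define the vocabulary.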

### TokenRig: autoregressive rigging model

**File:** `experiments/articulation_xl_quantization_256_token_4/grpo_1400.ckpt`

Qwen3-0.6B-based Transformer trained on a composite of **ArticulationXL 2.0 (70%)**, **VRoid Hub (20%)**, and **ModelsResource (10%)**, with quantization 256 and 4 skin tokens per bone, then refined with GRPO for 1,400 steps. **This is the recommended checkpoint**: it generates the skeleton and the SkinTokens in a single unified sequence.

> Both checkpoints are required for end-to-end inference: TokenRig generates the rig as a token sequence, and the FSQ-CVAE decoder turns SkinTokens back into dense per-vertex skinning weights.

## How To Use

The easiest way is to use the helper script in the main SkinTokens codebase, which downloads both checkpoints and the required Qwen3-0.6B config into the expected layout:

```bash
git clone https://github.com/VAST-AI-Research/SkinTokens.git
cd SkinTokens
python download.py --model
```

### Option 1: Download with `hf` CLI

```bash
hf download VAST-AI/SkinTokens \
  --repo-type model \
  --local-dir .
```

### Option 2: Download with `huggingface_hub` (Python)

```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="VAST-AI/SkinTokens",
    repo_type="model",
    local_dir=".",
)
```

### Option 3: Download individual files

```python
from huggingface_hub import hf_hub_download

tokenrig_ckpt = hf_hub_download(
    repo_id="VAST-AI/SkinTokens",
    filename="experiments/articulation_xl_quantization_256_token_4/grpo_1400.ckpt",
)
skin_vae_ckpt = hf_hub_download(
    repo_id="VAST-AI/SkinTokens",
    filename="experiments/skin_vae_2_10_32768/last.ckpt",
)
```

### Option 4: Web UI

Browse the [Files and versions](https://huggingface.co/VAST-AI/SkinTokens/tree/main) tab and download the folders manually, keeping the `experiments/...` layout intact.

After download, you should have:

```text
experiments/articulation_xl_quantization_256_token_4/grpo_1400.ckpt
experiments/skin_vae_2_10_32768/last.ckpt
```
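If you want to confirm the layout before running anything, a small check along these lines may help. It is a sketch that only tests for the two paths listed above. The helper name `missing_checkpoints` is hypothetical and is not part of the SkinTokens codebase.

```python
from pathlib import Path

# Relative paths listed above.
EXPECTED_CHECKPOINTS = [
    "experiments/articulation_xl_quantization_256_token_4/grpo_1400.ckpt",
    "experiments/skin_vae_2_10_32768/last.ckpt",
]


def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that are not files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_CHECKPOINTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")
    if missing:
        print("Missing checkpoints:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected checkpoints are in place.")
```

Run it from the directory that contains `experiments/` (for example, the root of the cloned SkinTokens repository).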

## Run TokenRig With These Weights

Once the `experiments/` folder is in place (and the environment is installed per the [GitHub README](https://github.com/VAST-AI-Research/SkinTokens#installation)), you can run:

```bash
python demo.py --input examples/giraffe.glb --output results/giraffe.glb --use_transfer
```

Or launch the Gradio demo:

```bash
python demo.py
```

Then open `http://127.0.0.1:1024` in your browser.

## Notes

- **Keep the directory names unchanged.** The SkinTokens code expects the exact `experiments/.../*.ckpt` layout shown above.
- **TokenRig requires both checkpoints.** `grpo_1400.ckpt` generates discrete tokens; the SkinTokens FSQ-CVAE (`last.ckpt`) is needed to decode them into per-vertex skinning weights.
- **Qwen3-0.6B architecture.** TokenRig adopts the Qwen3-0.6B architecture (GQA + RoPE) for its autoregressive backbone; the [Qwen3 config](https://huggingface.co/Qwen/Qwen3-0.6B) is fetched automatically by `download.py`.
- **Hardware.** An NVIDIA GPU with at least **14 GB** of memory is required for inference.
- **Training data.** The checkpoints were trained on a composite of ArticulationXL 2.0 (70%), VRoid Hub (20%), and ModelsResource (10%); the processed data splits will be released as a separate dataset repository later.
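For the hardware note, a quick way to see whether a machine is likely to qualify is to compare the GPU's total memory with the 14 GB figure. The sketch below assumes PyTorch is installed and reads "14 GB" as 14 GiB, which is an assumption. The helper names are hypothetical. It reports total memory, not the memory that is currently free.

```python
REQUIRED_GB = 14  # figure quoted in the notes above


def has_enough_memory(total_bytes, required_gb=REQUIRED_GB):
    """True if total_bytes covers required_gb, read here as GiB (assumption)."""
    return total_bytes >= required_gb * 1024**3


def check_gpu(device_index=0):
    """Describe whether the given CUDA device meets the memory requirement."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed."
    if not torch.cuda.is_available():
        return "No CUDA device is visible to PyTorch."
    props = torch.cuda.get_device_properties(device_index)
    verdict = "OK" if has_enough_memory(props.total_memory) else "below the requirement"
    return f"{props.name}: {props.total_memory / 1024**3:.1f} GiB ({verdict})"


if __name__ == "__main__":
    print(check_gpu())
```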

## Related Links

- Your 3D AI workspace, **Tripo**: <https://www.tripo3d.ai>
- Project page: <https://zjp-shadow.github.io/works/SkinTokens/>
- Paper (arXiv): <https://arxiv.org/abs/2602.04805>
- Main code repository: <https://github.com/VAST-AI-Research/SkinTokens>
- Predecessor: [UniRig (SIGGRAPH '25)](https://github.com/VAST-AI-Research/UniRig)
- More from VAST-AI Research: <https://huggingface.co/VAST-AI>

## Acknowledgements

- [UniRig](https://github.com/VAST-AI-Research/UniRig): the predecessor to this work.
- [Qwen3](https://github.com/QwenLM/Qwen3): the LLM architecture used by the TokenRig autoregressive backbone.
- [3DShape2VecSet](https://github.com/1zb/3DShape2VecSet), [Michelangelo](https://github.com/NeuralCarver/Michelangelo): the shape encoder backbone used by the FSQ-CVAE.
- [FSQ](https://arxiv.org/abs/2309.15505): Finite Scalar Quantization, the discretization scheme behind SkinTokens.
- [GRPO](https://arxiv.org/abs/2402.03300): the policy-optimization method used for RL refinement.

## Citation

If you find this work helpful, please consider citing our paper:

```bibtex
@article{zhang2026skintokens,
  title   = {SkinTokens: A Learned Compact Representation for Unified Autoregressive Rigging},
  author  = {Zhang, Jia-Peng and Pu, Cheng-Feng and Guo, Meng-Hao and Cao, Yan-Pei and Hu, Shi-Min},
  journal = {arXiv preprint arXiv:2602.04805},
  year    = {2026}
}
```