Upload README.md with huggingface_hub
README.md
CHANGED
@@ -19,7 +19,7 @@ library_name: gem
 GEM is a family of Generalist Human Motion models developed by NVIDIA. This repository hosts two model variants:
 
 - **GEM-SOMA** — Full-body 77-joint pose (body + hands + face) using the [SOMA](https://research.nvidia.com/labs/dair/gem/) body model
-- **GEM-SMPL** — 17-joint body pose using the
+- **GEM-SMPL** — 17-joint body pose using the SMPL body model, with support for text/audio/music conditioning
 
 Both models reconstruct 3D human motion from monocular video with dynamic cameras, recovering both camera-space and global motion trajectories.
 
@@ -34,7 +34,7 @@ Both models reconstruct 3D human motion from monocular video with dynamic camera
 | Model | Checkpoint | Body Model | Joints | Config | Code |
 |---|---|---|---|---|---|
 | GEM-SOMA | `gem_soma.ckpt` | SOMA | 77 (body + hands + face) | `config.json` | [GEM-X](https://github.com/NVlabs/GEM-X) |
-| GEM-SMPL | `gem_smpl.ckpt` |
+| GEM-SMPL | `gem_smpl.ckpt` | SMPL | 17 (body) | `gem_smpl_config.json` | [GEM-SMPL](https://github.com/NVlabs/GEM-SMPL) |
 
 ---
 
@@ -113,8 +113,8 @@ weights = torch.load(path, weights_only=False)
 
 | Property | Value |
 |---|---|
-| Architecture |
-| Body model |
+| Architecture | 16-layer Transformer encoder (RoPE, 1024 latent dim, 8 heads) |
+| Body model | SMPL (17 joints, body only) |
 | Feature space | gvhmr, 151-dim |
 | Input | RGB video + 2D keypoints + bounding box + camera intrinsics (+ optional text/audio) |
 | Output | Per-frame SMPL body parameters (pose, shape, translation) |
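
The updated model table pairs each variant with its own checkpoint and config file, so a caller has to pick both consistently. A minimal sketch of that lookup follows; the file names and joint counts are copied from the table in this diff, while `Variant`, `VARIANTS`, `files_for`, and `download` are hypothetical helper names, not part of the GEM code. The repository id is not stated in the diff, so it is left as a caller-supplied argument.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variant:
    """One row of the model table (hypothetical helper type)."""
    checkpoint: str
    config: str
    body_model: str
    joints: int


# Values copied from the updated model table in the README diff.
VARIANTS = {
    "GEM-SOMA": Variant("gem_soma.ckpt", "config.json", "SOMA", 77),
    "GEM-SMPL": Variant("gem_smpl.ckpt", "gem_smpl_config.json", "SMPL", 17),
}


def files_for(name: str) -> Variant:
    """Return the checkpoint/config pair for a variant name (case-insensitive)."""
    try:
        return VARIANTS[name.upper()]
    except KeyError:
        raise ValueError(
            f"unknown variant {name!r}; expected one of {sorted(VARIANTS)}"
        ) from None


def download(repo_id: str, name: str) -> Tuple[str, str]:
    """Fetch both files for a variant and return their local paths.

    Assumption: repo_id is whatever Hub repository hosts these files;
    the diff does not name it, so the caller must supply it.
    """
    # Imported lazily so the lookup above works without the package installed.
    from huggingface_hub import hf_hub_download

    variant = files_for(name)
    ckpt_path = hf_hub_download(repo_id=repo_id, filename=variant.checkpoint)
    config_path = hf_hub_download(repo_id=repo_id, filename=variant.config)
    return ckpt_path, config_path
```

Keeping the checkpoint and config names in one record avoids loading `gem_smpl.ckpt` against the SOMA `config.json` by mistake. The returned checkpoint path could then be passed to `torch.load`, as in the README line shown in the last hunk header.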