# Motion Latent Analysis

This notebook demonstrates how to work with motion latent representations from the MLD model:

1. **Generate variations** - Create 20 similar "jump" motions
2. **Compute mean latent** - Average the latent representations
3. **Distance computation** - Compare motions using L2 distance
4. **Classification** - Distinguish jump from non-jump motions


## Setup and Imports


In [None]:
import numpy as np
import torch
from pathlib import Path
from standalone_demo import StandaloneConfig, load_model

# Configuration
OUTPUT_DIR = Path("outputs/jump")
NUM_VARIATIONS = 20
MOTION_LENGTH = 120 # frames (6 seconds at 20fps)



## Load Model

Load the MLD model for motion generation. This will auto-download models if needed.


In [2]:
print("Loading MLD model...")
config = StandaloneConfig()
config.resolve_paths(Path("."))
model = load_model(config)
print("✓ Model loaded successfully")

Loading MLD model...
Model initialized on cuda
Loading checkpoint from resources/checkpoints/model.ckpt
Checkpoint loaded successfully
✓ Model loaded successfully


## Step 1: Generate Jump Variations

Generate 20 variations of "jump" motions using slightly different prompts; some prompts repeat.
Each generation saves:
- `.npy` - 3D joint positions
- `.latent.pt` - Latent representation
- `.mp4` - Rendered video


In [None]:
import shutil

# Create output directory
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# Define prompt variations: ten base prompts, repeated so that at least
# NUM_VARIATIONS prompts are available. "a person does a jump" appears twice
# in the base list on purpose, matching the original prompt pool.
base_prompts = [
    "a person does a jump",
    "someone performs a jump",
    "a person jumps in the air",
    "doing a jump",
    "performing a jump",
    "a person does a jump",
    "someone jumps backward",
    "a person executes a jump",
    "doing an acrobatic jump",
    "a person jumps forward",
]
jump_prompts = base_prompts * 3

print(f"Generating {NUM_VARIATIONS} jump variations...\n")

latent_paths = []

for i, prompt in enumerate(jump_prompts[:NUM_VARIATIONS]):
    print(f"[{i + 1}/{NUM_VARIATIONS}] {prompt}")

    # Generate motion with latent
    joints, latent, video_path = model.generate(
        prompt, MOTION_LENGTH, return_latent=True, create_video=True
    )

    # Save files
    base_name = f"jump_var_{i:02d}"
    npy_path = OUTPUT_DIR / f"{base_name}.npy"
    latent_path = OUTPUT_DIR / f"{base_name}.latent.pt"

    np.save(npy_path, joints)
    torch.save(latent, latent_path)
    latent_paths.append(latent_path)

    # Save video
    video_path_target = OUTPUT_DIR / f"{base_name}.mp4"
    shutil.copy(video_path, video_path_target)

    print(f" ✓ Saved {base_name}")
    print(f" Joints: {joints.shape}, Latent: {latent.shape}")

print(f"\n✓ Generated {len(latent_paths)} jump variations")

Generating 20 jump variations...

[1/20] a person does a jump
 ✓ Saved jump_var_00
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[2/20] someone performs a jump
 ✓ Saved jump_var_01
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[3/20] a person jumps in the air
 ✓ Saved jump_var_02
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[4/20] doing a jump
 ✓ Saved jump_var_03
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[5/20] performing a jump
 ✓ Saved jump_var_04
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[6/20] a person does a jump
 ✓ Saved jump_var_05
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[7/20] someone jumps backward
 ✓ Saved jump_var_06
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[8/20] a person executes a jump
 ✓ Saved jump_var_07
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[9/20] doing an acrobatic jump
 ✓ Saved jump_var_08
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[10/20] a person jumps forward
 ✓ Saved jump_var_09
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[11/20] a person does a jump
 ✓ Saved jump_var_10
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[12/20] someone performs a jump
 ✓ Saved jump_var_11
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[13/20] a person jumps in the air
 ✓ Saved jump_var_12
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[14/20] doing a jump
 ✓ Saved jump_var_13
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[15/20] performing a jump
 ✓ Saved jump_var_14
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[16/20] a person does a jump
 ✓ Saved jump_var_15
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[17/20] someone jumps backward
 ✓ Saved jump_var_16
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[18/20] a person executes a jump
 ✓ Saved jump_var_17
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[19/20] doing an acrobatic jump
 ✓ Saved jump_var_18
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])
[20/20] a person jumps forward
 ✓ Saved jump_var_19
 Joints: (120, 22, 3), Latent: torch.Size([1, 1, 256])

✓ Generated 20 jump variations


## Step 2: Compute Mean Latent

Average all jump latents to create a "prototype" jump representation.
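
Written out, the prototype is the element-wise mean of the generated latents. The symbols $z_i$, $\bar{z}$ and $N$ are notation introduced here for illustration, not names from the notebook; the shape and count are taken from the outputs of the generation step:

```latex
\bar{z} = \frac{1}{N} \sum_{i=1}^{N} z_i,
\qquad z_i \in \mathbb{R}^{1 \times 1 \times 256},
\quad N = 20
```

Because the mean is taken over the sample axis only, $\bar{z}$ has the same shape as a single latent.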


In [4]:
print(f"Computing mean latent from {len(latent_paths)} samples...")

# Load all latents
latents = [torch.load(path) for path in latent_paths]

# Stack and compute mean
latents_stacked = torch.stack(latents)
mean_latent = latents_stacked.mean(dim=0)

# Save mean latent
mean_latent_path = OUTPUT_DIR / "jump_mean.latent.pt"
torch.save(mean_latent, mean_latent_path)

print(f"✓ Mean latent shape: {mean_latent.shape}")
print(f"✓ Saved to: {mean_latent_path}")

Computing mean latent from 20 samples...
✓ Mean latent shape: torch.Size([1, 1, 256])
✓ Saved to: outputs/jump/jump_mean.latent.pt


## Step 3: Define Distance Function

L2 distance measures dissimilarity between latent representations: the smaller the distance, the more similar two motions are in latent space.
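
For two latents $z_a$ and $z_b$ (notation introduced here for illustration), the quantity computed in the next cell is the Euclidean norm of their difference, taken over all 256 components:

```latex
d(z_a, z_b) = \lVert z_a - z_b \rVert_2
            = \sqrt{\sum_{k=1}^{256} \left( z_{a,k} - z_{b,k} \right)^2}
```

A distance of zero means the two latents are identical, and the value grows as they move apart.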


In [5]:
def compute_latent_distance(latent1, latent2):
    """
    Compute L2 (Euclidean) distance between two latent representations.

    Args:
        latent1: First latent tensor or path
        latent2: Second latent tensor or path

    Returns:
        L2 distance (float)
    """
    # Load if paths provided
    if isinstance(latent1, (str, Path)):
        latent1 = torch.load(latent1)
    if isinstance(latent2, (str, Path)):
        latent2 = torch.load(latent2)

    # Compute L2 norm of difference
    distance = torch.norm(latent1 - latent2, p=2).item()

    return distance


print("✓ Distance function defined")

✓ Distance function defined


## Step 4: Generate Test Motions

Generate:
- A jump motion (should be close to mean)
- A walk motion (should be far from mean)


In [6]:
print("Generating test motions...\n")

# Test 1: jump-like motion
print("1. Generating jump-like motion...")
joints_jump, latent_jump, video_path_jump = model.generate(
 "a person does a jump", MOTION_LENGTH, return_latent=True, create_video=True
)
jump_latent_path = OUTPUT_DIR / "test_jump.latent.pt"
torch.save(latent_jump, jump_latent_path)
np.save(OUTPUT_DIR / "test_jump.npy", joints_jump)

video_path_target = OUTPUT_DIR / "test_jump.mp4"
shutil.copy(video_path_jump, video_path_target)

print(f" ✓ Saved test jump motion")

# Test 2: Non-jump motion (walking)
print("\n2. Generating non-jump motion (walking)...")
joints_walk, latent_walk, video_path_walk = model.generate(
 "a person walks forward", MOTION_LENGTH, return_latent=True, create_video=True
)
walk_latent_path = OUTPUT_DIR / "test_walk.latent.pt"
torch.save(latent_walk, walk_latent_path)
np.save(OUTPUT_DIR / "test_walk.npy", joints_walk)

video_path_target = OUTPUT_DIR / "test_walk.mp4"
shutil.copy(video_path_walk, video_path_target)

print(f" ✓ Saved test walk motion")

Generating test motions...

1. Generating jump-like motion...
 ✓ Saved test jump motion

2. Generating non-jump motion (walking)...
 ✓ Saved test walk motion


## Step 5: Compare Distances

Measure how close each test motion is to the mean jump latent.

**Hypothesis**: the jump motion should be closer to the mean jump latent than the walk motion.


In [7]:
print("Computing distances to mean jump latent...\n")

# Distance: Test jump → Mean jump
dist_jump_to_mean = compute_latent_distance(latent_jump, mean_latent)

# Distance: Test walk → Mean jump
dist_walk_to_mean = compute_latent_distance(latent_walk, mean_latent)

# Display results
print("=" * 60)
print("📊 RESULTS")
print("=" * 60)
print(f"Distance (jump → mean jump): {dist_jump_to_mean:.4f}")
print(f"Distance (walk → mean jump): {dist_walk_to_mean:.4f}")
print(f"\nRatio (walk/jump): {dist_walk_to_mean / dist_jump_to_mean:.2f}x")
print("=" * 60)

if dist_jump_to_mean < dist_walk_to_mean:
    print("\n✅ SUCCESS: jump is closer to mean jump latent!")
    print(" The model can distinguish jump from non-jump motions.")
else:
    print("\n⚠️ UNEXPECTED: Walk is closer to mean jump latent.")
    print(" This suggests the latent space may not capture this distinction.")

Computing distances to mean jump latent...

📊 RESULTS
Distance (jump → mean jump): 12.6496
Distance (walk → mean jump): 42.3448

Ratio (walk/jump): 3.35x

✅ SUCCESS: jump is closer to mean jump latent!
 The model can distinguish jump from non-jump motions.


## Bonus: Analyze Individual Variation Distances

See how much each jump variation differs from the mean.


In [8]:
print("Analyzing variation distances...\n")

variation_distances = []
for i, latent_path in enumerate(latent_paths):
    dist = compute_latent_distance(latent_path, mean_latent)
    variation_distances.append(dist)
    print(f" Variation {i:02d}: {dist:.4f}")

avg_variation = np.mean(variation_distances)
std_variation = np.std(variation_distances)

print("\nVariation statistics:")
print(f" Mean distance: {avg_variation:.4f}")
print(f" Std deviation: {std_variation:.4f}")
print("\nComparison:")
jump_ratio = dist_jump_to_mean / avg_variation
walk_ratio = dist_walk_to_mean / avg_variation
print(f" Test jump: {dist_jump_to_mean:.4f} ({jump_ratio:.2f}x mean variation)")
print(f" Test walk: {dist_walk_to_mean:.4f} ({walk_ratio:.2f}x mean variation)")

Analyzing variation distances...

 Variation 00: 17.7083
 Variation 01: 23.6372
 Variation 02: 23.7708
 Variation 03: 27.0579
 Variation 04: 17.2911
 Variation 05: 18.6115
 Variation 06: 43.8279
 Variation 07: 29.0473
 Variation 08: 23.5446
 Variation 09: 20.4132
 Variation 10: 14.3313
 Variation 11: 19.8556
 Variation 12: 31.8104
 Variation 13: 20.7619
 Variation 14: 22.4498
 Variation 15: 34.5026
 Variation 16: 26.5776
 Variation 17: 38.9580
 Variation 18: 28.6006
 Variation 19: 24.1094

Variation statistics:
 Mean distance: 25.3433
 Std deviation: 7.2979

Comparison:
 Test jump: 12.6496 (0.50x mean variation)
 Test walk: 42.3448 (1.67x mean variation)
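
The numbers above can be turned into a simple acceptance rule: treat a motion as jump-like if its distance to the mean is within a few standard deviations of the typical variation distance. The sketch below is one possible rule, not part of the notebook. The cutoff `K` and the helper names `z_score` and `looks_like_jump` are assumptions made for illustration, and the distances are copied from the printed output rather than recomputed from the latents.

```python
from statistics import fmean, pstdev

# Distances of the 20 jump variations to the mean latent, copied from the output above.
variation_distances = [
    17.7083, 23.6372, 23.7708, 27.0579, 17.2911,
    18.6115, 43.8279, 29.0473, 23.5446, 20.4132,
    14.3313, 19.8556, 31.8104, 20.7619, 22.4498,
    34.5026, 26.5776, 38.9580, 28.6006, 24.1094,
]
test_distances = {"test_jump": 12.6496, "test_walk": 42.3448}

mu = fmean(variation_distances)
sigma = pstdev(variation_distances)  # population std, the same convention as np.std's default

K = 2.0  # hypothetical cutoff in standard deviations; would need tuning on more data
threshold = mu + K * sigma


def z_score(distance):
    """Number of standard deviations above the mean variation distance."""
    return (distance - mu) / sigma


def looks_like_jump(distance):
    """Accept a motion as jump-like if its distance is within the cutoff."""
    return distance <= threshold


for name, d in test_distances.items():
    print(f"{name}: z = {z_score(d):+.2f}, jump-like = {looks_like_jump(d)}")

# Jump variations that the same rule would reject (false negatives).
rejected = [i for i, d in enumerate(variation_distances) if not looks_like_jump(d)]
print(f"threshold = {threshold:.2f}, rejected variations: {rejected}")
```

With `K = 2` the walk is rejected, but so is variation 06 ("someone jumps backward"), so on this small sample the rule cannot separate every jump from the walk.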


## Summary

### 📁 Files Created

In `outputs/jump/`:
- `jump_var_00` to `jump_var_19` (.npy + .latent.pt + .mp4) - 20 jump variations
- `jump_mean.latent.pt` - Mean latent of all variations ⭐
- `test_jump` (.npy + .latent.pt + .mp4) - Test jump motion
- `test_walk` (.npy + .latent.pt + .mp4) - Test walk motion

**Total**: 67 files (20 variations × 3 + 2 tests × 3 + 1 mean)
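
The file count follows from the naming scheme used in the generation cells. The sketch below rebuilds the expected file list, assuming each generated motion writes exactly three files (`.npy`, `.latent.pt`, `.mp4`) as in the Step 1 and Step 4 cells and that `NUM_VARIATIONS` is 20. The names `SUFFIXES`, `stems`, `expected` and `missing` are introduced here for illustration.

```python
from pathlib import Path

OUTPUT_DIR = Path("outputs/jump")
NUM_VARIATIONS = 20
SUFFIXES = (".npy", ".latent.pt", ".mp4")  # files written per generated motion

# 20 variations plus the two test motions.
stems = [f"jump_var_{i:02d}" for i in range(NUM_VARIATIONS)] + ["test_jump", "test_walk"]

expected = [OUTPUT_DIR / f"{stem}{suffix}" for stem in stems for suffix in SUFFIXES]
expected.append(OUTPUT_DIR / "jump_mean.latent.pt")  # the mean latent has no .npy or .mp4

print(len(expected))  # prints 67

# Expected files not present on disk (all of them if the notebook has not been run).
missing = [path for path in expected if not path.exists()]
```

Checking `missing` after a run is a quick way to spot a generation that failed partway through.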

### 🔬 Key Findings

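1. **Latent space clustering**: The test jump (12.65) lies closer to the mean jump latent than any of the 20 variations (14.33 to 43.83), while the test walk (42.34) lies farther than 19 of them.
2. **Distance metric**: L2 distance separates the two test motions by a factor of 3.35, but the margin is not clean: variation 06 ("someone jumps backward", 43.83) is farther from the mean than the walk. One jump/walk pair is not enough to call the metric a reliable classifier.
3. **Mean latent**: Averaging latents gives a usable prototype, with the jump variations spread around it at 25.34 ± 7.30.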

### 🎯 Applications

- **Motion classification**: Identify motion types (jump, walk, etc.)
- **Motion retrieval**: Find similar motions in a database
- **Quality control**: Detect outlier/corrupted generations
- **Interpolation**: Blend between different motions
- **Style transfer**: Map motions to similar but different styles
- **Few-shot learning**: Create classifiers from few examples
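
Two of these applications, classification and interpolation, can be sketched with nothing more than the mean latents. The example below is a toy illustration under stated assumptions: the 3-D vectors stand in for 256-D latents, the `prototypes` values are made up, and `classify` and `interpolate` are hypothetical helpers, not functions from the notebook or from MLD. Whether a decoded interpolated latent yields a plausible motion is not something this notebook tests.

```python
import math

# Hypothetical 3-D prototypes standing in for 256-D mean latents.
prototypes = {
    "jump": [0.9, 0.1, 0.0],
    "walk": [0.0, 0.8, 0.3],
}


def classify(latent, prototypes):
    """Return the label of the prototype nearest to `latent` in L2 distance."""
    return min(prototypes, key=lambda label: math.dist(latent, prototypes[label]))


def interpolate(a, b, t):
    """Linear blend between latents a and b; t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


query = [0.8, 0.2, 0.1]
print(classify(query, prototypes))  # prints jump

halfway = interpolate(prototypes["jump"], prototypes["walk"], 0.5)
print(halfway)
```

Adding a motion class in this scheme only requires one more entry in `prototypes`, which is what makes the approach attractive for few-shot use.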

### 💡 Next Steps

Try this analysis with other motion types:
- Flips, spins, kicks, dances
- Compare multiple motion classes
- Build a motion classifier
- Create a motion search engine
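
As a starting point for the search engine idea, a brute-force nearest-neighbour lookup over stored latents is often enough for small collections. This is a minimal sketch, assuming latents are kept in memory as plain lists. The `database` contents are invented 3-D stand-ins for real 256-D latents, and `search` is a hypothetical helper.

```python
import heapq
import math

# Hypothetical database: motion name -> latent (3-D here instead of 256-D).
database = {
    "jump_a": [0.9, 0.1, 0.0],
    "jump_b": [0.8, 0.2, 0.1],
    "walk_a": [0.0, 0.8, 0.3],
    "spin_a": [0.1, 0.1, 0.9],
}


def search(query, database, k=2):
    """Return the k (distance, name) pairs closest to `query` in L2 distance."""
    scored = ((math.dist(query, latent), name) for name, latent in database.items())
    return heapq.nsmallest(k, scored)


for distance, name in search([0.9, 0.1, 0.05], database):
    print(f"{name}: {distance:.3f}")
```

The lookup scans every entry, so its cost grows linearly with the database; an approximate nearest-neighbour index would be the usual next step for larger collections.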
