Upload folder using huggingface_hub
README.md CHANGED

@@ -75,7 +75,7 @@ Evaluated on 10 test samples from 1000 synthetic demonstrations:
 | Metric | Value | Notes |
 |--------|-------|-------|
 | Position Error | **8.60cm** | Suitable for ~5cm cube picking |
-| Gripper Accuracy | **
+| Gripper Accuracy | **75%** | Reliable grasp planning |
 | Overall MAE | **0.1217** | Across all 8 action dimensions |
 | Quaternion Error | 19.36° | Best for top-down grasps |

@@ -97,7 +97,7 @@ import torch
 
 # Load model
 device = 'cuda' if torch.cuda.is_available() else 'cpu'
-checkpoint = torch.load('
+checkpoint = torch.load('vla_checkpoint_best.pt', map_location=device)
 
 vlm_encoder = VLM_Encoder().to(device)
 policy = ImprovedFlowMatchingPolicy(action_dim=8, context_dim=1024, hidden_dim=512).to(device)
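
(Note: the hunk above ends before the loaded weights are applied to the modules. A minimal continuation, assuming the checkpoint dict stores the two state dicts under `vlm_encoder` and `policy` keys, which is a guess not shown in this diff, could be:)

```python
# Assumed checkpoint layout; inspect checkpoint.keys() and adjust the keys
# to match whatever the training script actually saved.
vlm_encoder.load_state_dict(checkpoint['vlm_encoder'])
policy.load_state_dict(checkpoint['policy'])
vlm_encoder.eval()  # inference mode: disables dropout / batch-norm updates
policy.eval()
```
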
@@ -131,10 +131,10 @@ python vla_flow_matching.py --mode replay --checkpoint vla_checkpoint_best.pt
 
 ```
 ├── vla_flow_matching.py      # Complete implementation (~1000 lines)
-├── 
-├── demo_data.pkl
-├── replay_results.png
-└── README.md
+├── vla_checkpoint_best.pt    # Trained weights (~20MB)
+├── demo_data.pkl             # Training data (1000 demos)
+├── replay_results.png        # Evaluation visualization
+└── README.md                 # This file
 ```
 
 ## 🎯 What You'll Learn

@@ -202,7 +202,7 @@ The synthetic environment generates pick-and-place demonstrations:
 ```bash
 # Collect 10-50 real demonstrations, then:
 python vla_flow_matching.py --mode finetune \
-    --checkpoint
+    --checkpoint vla_checkpoint_best.pt \
     --data_path real_robot_demos.pkl \
     --epochs 30 --lr 1e-5
 ```
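
(The `--data_path` file is a pickle of recorded demonstrations, but this diff does not show its schema. The sketch below only illustrates how such a file might be written; the field names (`image`, `instruction`, `actions`) and shapes are assumptions to verify against the dataset loader in `vla_flow_matching.py`.)

```python
import pickle
import numpy as np

# Hypothetical demonstration schema; confirm field names and shapes against
# the loader in vla_flow_matching.py before recording real data.
demo = {
    'image': np.zeros((224, 224, 3), dtype=np.uint8),   # one camera frame
    'instruction': 'pick up the red cube',               # language command
    'actions': np.zeros((50, 8), dtype=np.float32),      # (T, 8) action trajectory
}

with open('real_robot_demos.pkl', 'wb') as f:
    pickle.dump([demo], f)  # list of demo dicts, one per episode
```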