Update app/src/content/chapters/folding/04-data-collection.mdx
#5
by nepyope - opened
app/src/content/chapters/folding/04-data-collection.mdx
CHANGED
@@ -10,7 +10,7 @@ We ran **8 setups** in parallel, optimizing for **maximum diversity**: 25+ diffe

### Learning to Teleoperate

-

This creates one of the most important practical decisions of the project: **when do you start recording data for the final model?** Too early and you pollute the dataset with low-quality demonstrations that the model will faithfully reproduce, hesitations, fumbles, and all. Too late and you've wasted precious time.

@@ -18,11 +18,11 @@ Another important part is aligning the strategy between operators. Since some pa

### Tips for Good Data Collection

-
1. **Practice before you record.** Smooth
-
2. **Quality over speed.
-
3. **Each action should make sense from the current observation alone.** Most models don't have history, so avoid motions that only work because *you* remember what happened 5 seconds ago.
4. **Be consistent within episodes.** The model learns a coherent strategy more easily than movements that vary wildly each time.
-
5. **Start small, then extend.**
6. **Speed comes last.** Once you've dialed in quality and a consistent strategy, optimize for speed. But never sacrifice quality for it.

After learning all these things and collecting data for multiple weeks we ended up with 5,688 episodes across 8 setups.

### Learning to Teleoperate

+
Teleoperating a bimanual robot is a genuine skill that takes practice, which unfortunately means that **early data is worse than the final data**. The first episodes are slow, hesitant, and full of failed attempts. Over hours of practice, operators learn smoother motions, faster execution, and more consistent grasps.

This creates one of the most important practical decisions of the project: **when do you start recording data for the final model?** Too early and you pollute the dataset with low-quality demonstrations that the model will faithfully reproduce, hesitations, fumbles, and all. Too late and you've wasted precious time.

### Tips for Good Data Collection

+
1. **Practice before you record.** Smooth and deliberate beats fast and sloppy.
+
2. **Quality over speed.** Once learned, bad habits are hard to untrain.
+
3. **Each action should make sense from the current observation alone.** Most models are Markovian: they have no history, so avoid motions that only work because *you* remember what happened 5 seconds ago.
4. **Be consistent within episodes.** The model learns a coherent strategy more easily than movements that vary wildly each time.
+
5. **Start small, then extend.** Rather than trying to collect the perfect dataset on day one, train a quick model, see what fails, then add diversity.
6. **Speed comes last.** Once you've dialed in quality and a consistent strategy, optimize for speed. But never sacrifice quality for it.

After learning all these things and collecting data for multiple weeks we ended up with 5,688 episodes across 8 setups.
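
The question of when to start recording for the final model could be turned into a rough rule of thumb. The sketch below is one possible heuristic, not the procedure used in this project: it assumes you log a duration and a success flag for every practice episode, and it suggests starting once the operator succeeds reliably and is no longer getting noticeably faster. The function name `find_recording_start`, the window size, and both thresholds are hypothetical and would need tuning per task.

```python
from statistics import mean
from typing import Optional, Sequence


def find_recording_start(
    durations: Sequence[float],
    successes: Sequence[bool],
    window: int = 20,               # assumed window size, tune per task
    min_success_rate: float = 0.9,  # assumed threshold, not from the source
    max_improvement: float = 0.05,  # assumed threshold, not from the source
) -> Optional[int]:
    """Return how many practice episodes to discard before recording.

    Compares two adjacent windows of practice episodes. Recording is
    suggested once the recent window has a high success rate and its mean
    duration is at most `max_improvement` (relative) faster than the window
    before it, i.e. the operator has plateaued. Returns None if that never
    happens in the log.
    """
    if len(durations) != len(successes):
        raise ValueError("durations and successes must have the same length")

    for end in range(2 * window, len(durations) + 1):
        previous = durations[end - 2 * window : end - window]
        recent = durations[end - window : end]
        success_rate = sum(successes[end - window : end]) / window
        # Relative speed-up of the recent window over the previous one.
        # A negative value (operator got slower) also counts as "no longer improving".
        improvement = (mean(previous) - mean(recent)) / mean(previous)
        if success_rate >= min_success_rate and improvement <= max_improvement:
            return end
    return None
```

Calling it with one operator's practice log returns an episode count, or `None` if the operator is still improving. A stricter `min_success_rate` or a larger `window` delays the start and costs practice time, which is the same trade-off between polluting the dataset and wasting time described above.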