Refine data collection operator effort description
app/src/content/chapters/folding/04-data-collection.mdx
CHANGED
@@ -35,7 +35,7 @@ After weeks of collecting data across operators, these are the guidelines we fou
 
 ### What we ended up with
 
-After multiple weeks of collection across 8 setups, we had **5,688 episodes**, the **full dataset**. Those ~131 hours of recorded demonstrations represent
+After multiple weeks of collection across 8 setups, we had **5,688 episodes**, the **full dataset**. Those ~131 hours of recorded demonstrations represent a fraction of the total time operators spent on the project, which also included practicing teleoperation, setting up and repairing robots, and aligning on strategies between sessions. This shows that data collection is a lot more than just recording demonstrations, and being very efficient with your time is key. Not all episodes are equally useful either: some contain hesitations, inconsistent strategies, or poor fold quality. Later in the project, we built a smaller **high-quality dataset** of 1,200 episodes by selecting the best recordings from the full set and adding new demonstrations collected with a more unified strategy.
 
 | Metric | Full dataset | High-quality dataset |
 |:---|:---:|:---:|