robot-folding

pepijn223 (HF Staff) committed on Apr 7

Commit 699a084 (unverified) · 1 Parent(s): 18ab968

Refine data collection operator effort description

Files changed (1)

app/src/content/chapters/folding/04-data-collection.mdx

@@ -35,7 +35,7 @@ After weeks of collecting data across operators, these are the guidelines we fou
 ### What we ended up with
-After multiple weeks of collection across 8 setups, we had **5,688 episodes**, the **full dataset**. Those ~131 hours of recorded demonstrations represent far more than 131 hours of actual work. Setups broke and needed repair, operators had to practice teleoperation before producing useful data, and the repetitive motions are genuinely tiring. Realistically, productive recording filled less than a third of each operator's workday. Not all episodes are equally useful either: some contain hesitations, inconsistent strategies, or poor fold quality. Later in the project, we built a smaller **high-quality dataset** of 1,200 episodes by selecting the best recordings from the full set and adding new demonstrations collected with a more unified strategy.
+After multiple weeks of collection across 8 setups, we had **5,688 episodes**, the **full dataset**. Those ~131 hours of recorded demonstrations represent a fraction of the total time operators spent on the project, which also included practicing teleoperation, setting up and repairing robots, and aligning on strategies between sessions. This shows that data collection is a lot more than just recording demonstrations, and being very efficient with your time is key. Not all episodes are equally useful either: some contain hesitations, inconsistent strategies, or poor fold quality. Later in the project, we built a smaller **high-quality dataset** of 1,200 episodes by selecting the best recordings from the full set and adding new demonstrations collected with a more unified strategy.
 | Metric | Full dataset | High-quality dataset |
 |:---|:---:|:---:|