Commit 14a65ef (verified) · committed by narcolepticchicken · 1 parent: a6484ad

Collapse mechanism: seed=42,cond=judge_vote_3round

reports/debate_collapse_mechanism_results.json CHANGED
@@ -1322,6 +1322,18 @@
  "accuracy": 0.5,
  "correct": 15,
  "total": 30
+ },
+ "judge_vote_3round": {
+ "accuracy": 0.7333333333333333,
+ "correct": 22,
+ "total": 30,
+ "judge_samples_raw": [
+ "1. yes\n\nThe majority of honest agents agree that Python is faster than C for numerical computation. ",
+ "100\u00b0C is the boiling point of water at sea level, but it varies with altitude due to changes in atmo",
+ "1. yes 2. no 3. yes 4. no 5. yes 6. yes 7. no 8. yes ",
+ "1. Yes.\n\nThe Earth's core temperature is estimated to be around 5,000-6,000\u00b0C (9,00",
+ "1. yes 2. no 3. yes 4. yes\n\nThe final answer is: no.\n\nExplanation: The Moon has a very thin ex"
+ ]
  }
  }
  },
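
Review note: in the added `judge_vote_3round` entry, `accuracy` should equal `correct / total` (22 / 30). The following is a minimal sketch of that consistency check, assuming each condition entry is a plain dict with the three keys shown in the hunk. The helper name `check_accuracy` and the variable names are hypothetical and not part of the repository; the values are copied from the diff.

```python
import math


def check_accuracy(entry, tol=1e-9):
    """Return True if entry['accuracy'] matches entry['correct'] / entry['total']."""
    total = entry["total"]
    if total <= 0:
        # A non-positive total cannot yield a meaningful accuracy.
        return False
    expected = entry["correct"] / total
    return math.isclose(entry["accuracy"], expected, rel_tol=0.0, abs_tol=tol)


# Values copied from the hunk; the dict names are illustrative only.
judge_vote_3round = {"accuracy": 0.7333333333333333, "correct": 22, "total": 30}
preceding_entry = {"accuracy": 0.5, "correct": 15, "total": 30}

print(check_accuracy(judge_vote_3round))
print(check_accuracy(preceding_entry))
```

An absolute tolerance is used here because the stored value is a rounded float; an exact `==` comparison would also pass for these two entries but could be brittle for other totals.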