2026-04-09 04:54:01,214 [INFO] Intervention Benchmark — testing causal effect validity
2026-04-09 04:54:01,214 [INFO] --- Causal Structure Tests ---
2026-04-09 04:54:01,336 [INFO] Loading faiss with AVX512 support.
2026-04-09 04:54:01,399 [INFO] Successfully loaded faiss with AVX512 support.
2026-04-09 04:54:02,236 [INFO] confounded (seed 1/5)
2026-04-09 04:54:03,411 [INFO] RMSE=0.893 DirAcc=0.000 TrajCorr=0.000
2026-04-09 04:54:03,411 [INFO] confounded (seed 2/5)
2026-04-09 04:54:03,993 [INFO] RMSE=1.043 DirAcc=0.000 TrajCorr=0.000
2026-04-09 04:54:03,994 [INFO] confounded (seed 3/5)
2026-04-09 04:54:04,675 [INFO] RMSE=0.333 DirAcc=0.000 TrajCorr=0.000
2026-04-09 04:54:04,675 [INFO] confounded (seed 4/5)
2026-04-09 04:54:05,365 [INFO] RMSE=0.286 DirAcc=0.000 TrajCorr=0.000
2026-04-09 04:54:05,365 [INFO] confounded (seed 5/5)
2026-04-09 04:54:06,165 [INFO] RMSE=0.853 DirAcc=0.000 TrajCorr=0.000
2026-04-09 04:54:06,166 [INFO] mediated (seed 1/5)
2026-04-09 04:54:06,978 [INFO] RMSE=0.846 DirAcc=0.833 TrajCorr=0.263
2026-04-09 04:54:06,978 [INFO] mediated (seed 2/5)
2026-04-09 04:54:07,669 [INFO] RMSE=0.583 DirAcc=0.283 TrajCorr=-0.392
2026-04-09 04:54:07,669 [INFO] mediated (seed 3/5)
2026-04-09 04:54:08,584 [INFO] RMSE=1.298 DirAcc=0.683 TrajCorr=-0.318
2026-04-09 04:54:08,584 [INFO] mediated (seed 4/5)
2026-04-09 04:54:09,269 [INFO] RMSE=0.713 DirAcc=0.300 TrajCorr=-0.579
2026-04-09 04:54:09,269 [INFO] mediated (seed 5/5)
2026-04-09 04:54:10,274 [INFO] RMSE=1.021 DirAcc=0.233 TrajCorr=0.488
2026-04-09 04:54:10,274 [INFO] time_varying_confounded (seed 1/5)
2026-04-09 04:54:10,666 [INFO] RMSE=0.235 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:10,666 [INFO] time_varying_confounded (seed 2/5)
2026-04-09 04:54:11,291 [INFO] RMSE=0.506 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:11,291 [INFO] time_varying_confounded (seed 3/5)
2026-04-09 04:54:12,081 [INFO] RMSE=0.180 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:12,081 [INFO] time_varying_confounded (seed 4/5)
2026-04-09 04:54:12,865 [INFO] RMSE=0.448 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:12,865 [INFO] time_varying_confounded (seed 5/5)
2026-04-09 04:54:13,484 [INFO] RMSE=0.680 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:13,484 [INFO] feedback (seed 1/5)
2026-04-09 04:54:14,178 [INFO] RMSE=0.216 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:14,178 [INFO] feedback (seed 2/5)
2026-04-09 04:54:15,076 [INFO] RMSE=0.419 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:15,076 [INFO] feedback (seed 3/5)
2026-04-09 04:54:16,089 [INFO] RMSE=0.632 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:16,089 [INFO] feedback (seed 4/5)
2026-04-09 04:54:16,783 [INFO] RMSE=0.076 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:16,783 [INFO] feedback (seed 5/5)
2026-04-09 04:54:17,674 [INFO] RMSE=0.223 DirAcc=1.000 TrajCorr=0.000
2026-04-09 04:54:17,674 [INFO] instrumental_variable (seed 1/5)
2026-04-09 04:54:18,378 [INFO] RMSE=0.915 DirAcc=1.000 TrajCorr=0.876
2026-04-09 04:54:18,378 [INFO] instrumental_variable (seed 2/5)
2026-04-09 04:54:19,191 [INFO] RMSE=0.831 DirAcc=0.783 TrajCorr=-0.887
2026-04-09 04:54:19,191 [INFO] instrumental_variable (seed 3/5)
2026-04-09 04:54:19,577 [INFO] RMSE=0.708 DirAcc=1.000 TrajCorr=-0.649
2026-04-09 04:54:19,577 [INFO] instrumental_variable (seed 4/5)
2026-04-09 04:54:20,078 [INFO] RMSE=0.162 DirAcc=1.000 TrajCorr=-0.367
2026-04-09 04:54:20,078 [INFO] instrumental_variable (seed 5/5)
2026-04-09 04:54:20,480 [INFO] RMSE=0.919 DirAcc=1.000 TrajCorr=0.917
2026-04-09 04:54:20,480 [INFO] non_identifiable (seed 1/5)
2026-04-09 04:54:20,966 [INFO] RMSE=0.268 DirAcc=1.000 TrajCorr=-0.970
2026-04-09 04:54:20,966 [INFO] non_identifiable (seed 2/5)
2026-04-09 04:54:21,378 [INFO] RMSE=0.248 DirAcc=1.000 TrajCorr=0.490
2026-04-09 04:54:21,378 [INFO] non_identifiable (seed 3/5)
2026-04-09 04:54:21,736 [INFO] RMSE=0.121 DirAcc=1.000 TrajCorr=-0.703
2026-04-09 04:54:21,736 [INFO] non_identifiable (seed 4/5)
2026-04-09 04:54:22,167 [INFO] RMSE=0.246 DirAcc=1.000 TrajCorr=0.217
2026-04-09 04:54:22,167 [INFO] non_identifiable (seed 5/5)
2026-04-09 04:54:22,645 [INFO] RMSE=0.601 DirAcc=1.000 TrajCorr=-0.409
2026-04-09 04:54:22,645 [INFO] --- Temporal Intervention Scenarios ---
2026-04-09 04:54:22,645 [INFO] Scenario: Step Intervention
2026-04-09 04:54:22,645 [INFO] Seed 1/5 (seed=42)
2026-04-09 04:54:23,887 [INFO] RMSE=0.4248 ATE_err=0.0578 DirAcc=0.578
2026-04-09 04:54:23,887 [INFO] Seed 2/5 (seed=142)
2026-04-09 04:54:25,199 [INFO] RMSE=0.3083 ATE_err=0.0097 DirAcc=0.556
2026-04-09 04:54:25,199 [INFO] Seed 3/5 (seed=242)
2026-04-09 04:54:26,680 [INFO] RMSE=0.4416 ATE_err=0.0539 DirAcc=0.511
2026-04-09 04:54:26,680 [INFO] Seed 4/5 (seed=342)
2026-04-09 04:54:27,973 [INFO] RMSE=0.3954 ATE_err=0.1642 DirAcc=0.511
2026-04-09 04:54:27,973 [INFO] Seed 5/5 (seed=442)
2026-04-09 04:54:29,087 [INFO] RMSE=0.6695 ATE_err=0.3789 DirAcc=0.533
2026-04-09 04:54:29,164 [INFO] Scenario: Dose-Response Curve
2026-04-09 04:54:29,164 [INFO] Seed 1/5 (seed=42)
2026-04-09 04:55:42,566 [INFO] RMSE=0.2550 ATE_err=0.0132 DirAcc=0.578
2026-04-09 04:55:42,566 [INFO] Seed 2/5 (seed=142)
2026-04-09 04:56:44,799 [INFO] RMSE=0.1873 ATE_err=0.0254 DirAcc=0.589
2026-04-09 04:56:44,799 [INFO] Seed 3/5 (seed=242)
2026-04-09 04:57:48,565 [INFO] RMSE=0.2736 ATE_err=0.0589 DirAcc=0.511
2026-04-09 04:57:48,565 [INFO] Seed 4/5 (seed=342)
2026-04-09 04:58:52,675 [INFO] RMSE=0.2455 ATE_err=0.1624 DirAcc=0.511
2026-04-09 04:58:52,675 [INFO] Seed 5/5 (seed=442)
2026-04-09 04:59:57,885 [INFO] RMSE=0.4179 ATE_err=0.4102 DirAcc=0.533
2026-04-09 04:59:57,885 [INFO] Scenario: Policy Comparison
2026-04-09 04:59:57,885 [INFO] Seed 1/5 (seed=42)
2026-04-09 05:00:32,476 [INFO] RMSE=0.0230 ATE_err=0.0102 DirAcc=1.000
2026-04-09 05:00:32,476 [INFO] Seed 2/5 (seed=142)
2026-04-09 05:01:10,676 [INFO] RMSE=0.0214 ATE_err=0.0133 DirAcc=0.000
2026-04-09 05:01:10,676 [INFO] Seed 3/5 (seed=242)
2026-04-09 05:01:48,781 [INFO] RMSE=0.0536 ATE_err=0.0462 DirAcc=0.000
2026-04-09 05:01:48,781 [INFO] Seed 4/5 (seed=342)
2026-04-09 05:02:25,480 [INFO] RMSE=0.0844 ATE_err=0.0438 DirAcc=1.000
2026-04-09 05:02:25,481 [INFO] Seed 5/5 (seed=442)
2026-04-09 05:03:03,275 [INFO] RMSE=0.2676 ATE_err=0.1943 DirAcc=1.000
2026-04-09 05:03:03,276 [INFO] Scenario: Intervention Timing
2026-04-09 05:03:03,276 [INFO] Seed 1/5 (seed=42)
2026-04-09 05:03:03,792 [INFO] Timing t=50: RMSE=0.2253 ATE_err=0.0555
2026-04-09 05:03:04,591 [INFO] Timing t=100: RMSE=0.2310 ATE_err=0.1003
2026-04-09 05:03:05,464 [INFO] Timing t=200: RMSE=0.2434 ATE_err=0.1261
2026-04-09 05:03:06,264 [INFO] Timing t=500: RMSE=0.2513 ATE_err=0.1282
2026-04-09 05:03:06,265 [INFO] RMSE=0.2377 ATE_err=0.1025 DirAcc=0.517
2026-04-09 05:03:06,265 [INFO] Seed 2/5 (seed=142)
2026-04-09 05:03:06,864 [INFO] Timing t=50: RMSE=0.5513 ATE_err=0.3532
2026-04-09 05:03:07,664 [INFO] Timing t=100: RMSE=0.4284 ATE_err=0.0595
2026-04-09 05:03:08,464 [INFO] Timing t=200: RMSE=0.7875 ATE_err=0.6660
2026-04-09 05:03:09,278 [INFO] Timing t=500: RMSE=0.4397 ATE_err=0.0154
2026-04-09 05:03:09,278 [INFO] RMSE=0.5518 ATE_err=0.2735 DirAcc=0.438
2026-04-09 05:03:09,278 [INFO] Seed 3/5 (seed=242)
2026-04-09 05:03:09,864 [INFO] Timing t=50: RMSE=0.3754 ATE_err=0.2597
2026-04-09 05:03:10,566 [INFO] Timing t=100: RMSE=0.3403 ATE_err=0.2084
2026-04-09 05:03:11,491 [INFO] Timing t=200: RMSE=0.4809 ATE_err=0.3954
2026-04-09 05:03:12,264 [INFO] Timing t=500: RMSE=0.2722 ATE_err=0.0156
2026-04-09 05:03:12,265 [INFO] RMSE=0.3672 ATE_err=0.2198 DirAcc=0.508
2026-04-09 05:03:12,265 [INFO] Seed 4/5 (seed=342)
2026-04-09 05:03:12,864 [INFO] Timing t=50: RMSE=0.4183 ATE_err=0.3306
2026-04-09 05:03:13,664 [INFO] Timing t=100: RMSE=0.5142 ATE_err=0.4501
2026-04-09 05:03:14,483 [INFO] Timing t=200: RMSE=0.3095 ATE_err=0.1712
2026-04-09 05:03:15,264 [INFO] Timing t=500: RMSE=0.2533 ATE_err=0.0399
2026-04-09 05:03:15,265 [INFO] RMSE=0.3738 ATE_err=0.2480 DirAcc=0.575
2026-04-09 05:03:15,265 [INFO] Seed 5/5 (seed=442)
2026-04-09 05:03:15,864 [INFO] Timing t=50: RMSE=0.4634 ATE_err=0.1056
2026-04-09 05:03:16,579 [INFO] Timing t=100: RMSE=0.4908 ATE_err=0.1791
2026-04-09 05:03:17,264 [INFO] Timing t=200: RMSE=0.7385 ATE_err=0.5837
2026-04-09 05:03:17,987 [INFO] Timing t=500: RMSE=0.6197 ATE_err=0.4199
2026-04-09 05:03:17,988 [INFO] RMSE=0.5781 ATE_err=0.3221 DirAcc=0.554

================================================================================
INTERVENTION BENCHMARK — Causal Structure Tests
================================================================================
Structure                      RMSE   DirAcc TrajCorr  NullDet TrueMean PredMean
--------------------------------------------------------------------------------
confounded                    0.682    0.000    0.000    0.000    0.000    0.681
mediated                      0.892    0.467   -0.108      N/A    0.551   -0.090
time_varying_confounded       0.410    1.000    0.000      N/A    0.591    0.183
feedback                      0.313    1.000    0.000      N/A    0.515    0.544
instrumental_variable         0.707    0.957   -0.022      N/A    0.877    0.234
non_identifiable              0.297    1.000   -0.275      N/A    0.600    0.856

================================================================================
INTERVENTION BENCHMARK — Temporal Intervention Scenarios
================================================================================
Scenario        Seeds         Traj RMSE         ATE Error         Dir Acc
--------------------------------------------------------------------------------
step                5    0.4479+/-0.120    0.1329+/-0.133    0.538+/-0.03
dose_response       5    0.2759+/-0.077    0.1340+/-0.148    0.544+/-0.03
policy              5    0.0900+/-0.092    0.0616+/-0.068    0.600+/-0.49
timing              5    0.4217+/-0.127    0.2332+/-0.073    0.518+/-0.05

2026-04-09 05:03:17,991 [INFO] Results saved to outputs/benchmarks/intervention_results.json