scaler-openenv / pytest_output.txt

Hacktrix-121

error handling

7cd2458 about 2 months ago

	============================= test session starts =============================
	platform win32 -- Python 3.13.6, pytest-9.0.3, pluggy-1.6.0 -- C:\Users\Ryan\Desktop\scaler\.venv\Scripts\python.exe
	cachedir: .pytest_cache
	rootdir: C:\Users\Ryan\Desktop\scaler
	configfile: pyproject.toml
	plugins: anyio-4.13.0, cov-7.1.0
	collecting ... collected 71 items

	tests/test_tasks.py::TestEasyTaskGrader::test_critical_alert_correct PASSED [ 1%]
	tests/test_tasks.py::TestEasyTaskGrader::test_critical_alert_incorrect PASSED [ 2%]
	tests/test_tasks.py::TestEasyTaskGrader::test_false_positive_correct PASSED [ 4%]
	tests/test_tasks.py::TestEasyTaskGrader::test_episode_score_calculation PASSED [ 5%]
	tests/test_tasks.py::TestEasyTaskGrader::test_metrics_breakdown PASSED [ 7%]
	tests/test_tasks.py::TestMediumTaskGrader::test_productive_investigation FAILED [ 8%]
	tests/test_tasks.py::TestMediumTaskGrader::test_wasteful_investigation FAILED [ 9%]
	tests/test_tasks.py::TestMediumTaskGrader::test_resource_efficiency_calculation PASSED [ 11%]
	tests/test_tasks.py::TestMediumTaskGrader::test_episode_score_with_efficiency PASSED [ 12%]
	tests/test_tasks.py::TestMediumTaskGrader::test_critical_missed_penalty FAILED [ 14%]
	tests/test_tasks.py::TestHardTaskGrader::test_correlation_detection FAILED [ 15%]
	tests/test_tasks.py::TestHardTaskGrader::test_failure_prevention_bonus FAILED [ 16%]
	tests/test_tasks.py::TestHardTaskGrader::test_system_failure_penalty FAILED [ 18%]
	tests/test_tasks.py::TestHardTaskGrader::test_missed_correlated_alert_penalty FAILED [ 19%]
	tests/test_tasks.py::TestHardTaskGrader::test_correlation_detection_rate FAILED [ 21%]
	tests/test_tasks.py::TestHardTaskGrader::test_stability_score_perfect FAILED [ 22%]
	tests/test_tasks.py::TestHardTaskGrader::test_stability_score_degraded PASSED [ 23%]
	tests/test_tasks.py::test_grader_reset PASSED [ 25%]
	test_openenv_compliance.py::TestPydanticModels::test_observation_is_pydantic_model PASSED [ 26%]
	test_openenv_compliance.py::TestPydanticModels::test_action_is_pydantic_model PASSED [ 28%]
	test_openenv_compliance.py::TestPydanticModels::test_reward_is_pydantic_model PASSED [ 29%]
	test_openenv_compliance.py::TestPydanticModels::test_episode_state_is_pydantic_model PASSED [ 30%]
	test_openenv_compliance.py::TestPydanticModels::test_alert_is_pydantic_model PASSED [ 32%]
	test_openenv_compliance.py::TestPydanticModels::test_observation_has_required_fields PASSED [ 33%]
	test_openenv_compliance.py::TestPydanticModels::test_action_has_required_fields PASSED [ 35%]
	test_openenv_compliance.py::TestPydanticModels::test_reward_has_required_fields PASSED [ 36%]
	test_openenv_compliance.py::TestPydanticModels::test_action_type_is_literal PASSED [ 38%]
	test_openenv_compliance.py::TestPydanticModels::test_alert_type_is_literal PASSED [ 39%]
	test_openenv_compliance.py::TestPydanticModels::test_observation_serialization PASSED [ 40%]
	test_openenv_compliance.py::TestPydanticModels::test_action_serialization PASSED [ 42%]
	test_openenv_compliance.py::TestPydanticModels::test_reward_serialization PASSED [ 43%]
	test_openenv_compliance.py::TestStepInterface::test_step_exists PASSED [ 45%]
	test_openenv_compliance.py::TestStepInterface::test_step_accepts_action PASSED [ 46%]
	test_openenv_compliance.py::TestStepInterface::test_step_returns_tuple PASSED [ 47%]
	test_openenv_compliance.py::TestStepInterface::test_step_returns_observation PASSED [ 49%]
	test_openenv_compliance.py::TestStepInterface::test_step_returns_reward PASSED [ 50%]
	test_openenv_compliance.py::TestStepInterface::test_step_returns_done PASSED [ 52%]
	test_openenv_compliance.py::TestStepInterface::test_step_returns_info PASSED [ 53%]
	test_openenv_compliance.py::TestStepInterface::test_info_contains_processed_alerts PASSED [ 54%]
	test_openenv_compliance.py::TestStepInterface::test_info_contains_correlation_groups PASSED [ 56%]
	test_openenv_compliance.py::TestStepInterface::test_info_contains_system_failure PASSED [ 57%]
	test_openenv_compliance.py::TestStepInterface::test_reward_has_value PASSED [ 59%]
	test_openenv_compliance.py::TestStepInterface::test_observation_updated_after_step PASSED [ 60%]
	test_openenv_compliance.py::TestResetInterface::test_reset_exists PASSED [ 61%]
	test_openenv_compliance.py::TestResetInterface::test_reset_returns_observation PASSED [ 63%]
	test_openenv_compliance.py::TestResetInterface::test_reset_accepts_seed PASSED [ 64%]
	test_openenv_compliance.py::TestResetInterface::test_reset_accepts_options PASSED [ 66%]
	test_openenv_compliance.py::TestResetInterface::test_reset_reproducibility PASSED [ 67%]
	test_openenv_compliance.py::TestResetInterface::test_reset_clears_episode_state PASSED [ 69%]
	test_openenv_compliance.py::TestStateInterface::test_state_exists PASSED [ 70%]
	test_openenv_compliance.py::TestStateInterface::test_state_returns_episode_state PASSED [ 71%]
	test_openenv_compliance.py::TestStateInterface::test_episode_state_contains_observation PASSED [ 73%]
	test_openenv_compliance.py::TestStateInterface::test_episode_state_contains_hidden_state PASSED [ 74%]
	test_openenv_compliance.py::TestStateInterface::test_hidden_state_contains_true_severities PASSED [ 76%]
	test_openenv_compliance.py::TestStateInterface::test_hidden_state_contains_correlation_groups PASSED [ 77%]
	test_openenv_compliance.py::TestStateInterface::test_episode_state_contains_cumulative_reward PASSED [ 78%]
	test_openenv_compliance.py::TestStateInterface::test_episode_state_contains_failures_count PASSED [ 80%]
	test_openenv_compliance.py::TestStateInterface::test_episode_state_tracks_actions_taken PASSED [ 81%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_exists PASSED [ 83%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_is_valid_yaml PASSED [ 84%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_has_name PASSED [ 85%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_has_version PASSED [ 87%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_has_description PASSED [ 88%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_has_tasks PASSED [ 90%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_tasks_have_ids PASSED [ 91%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_has_config PASSED [ 92%]
	test_openenv_compliance.py::TestOpenEnvYAML::test_openenv_yaml_config_has_actions PASSED [ 94%]
	test_openenv_compliance.py::TestOpenEnvValidation::test_full_episode_workflow PASSED [ 95%]
	test_openenv_compliance.py::TestOpenEnvValidation::test_all_task_difficulties PASSED [ 97%]
	test_openenv_compliance.py::TestOpenEnvValidation::test_pydantic_validation PASSED [ 98%]
	test_openenv_compliance.py::TestOpenEnvValidation::test_serialization_round_trip PASSED [100%]

	================================== FAILURES ===================================
	_____________ TestMediumTaskGrader.test_productive_investigation ______________
	tests\test_tasks.py:142: in test_productive_investigation
	assert grader.investigations_used == 1
	^^^^^^^^^^^^^^^^^^^^^^^^^^
	E AttributeError: 'MediumTaskGrader' object has no attribute 'investigations_used'
	______________ TestMediumTaskGrader.test_wasteful_investigation _______________
	tests\test_tasks.py:161: in test_wasteful_investigation
	assert contribution < 0.0, "Wasteful investigation should be penalized"
	E AssertionError: Wasteful investigation should be penalized
	E assert 0.0 < 0.0
	______________ TestMediumTaskGrader.test_critical_missed_penalty ______________
	tests\test_tasks.py:217: in test_critical_missed_penalty
	assert grader.critical_missed == 1
	^^^^^^^^^^^^^^^^^^^^^^
	E AttributeError: 'MediumTaskGrader' object has no attribute 'critical_missed'
	________________ TestHardTaskGrader.test_correlation_detection ________________
	tests\test_tasks.py:246: in test_correlation_detection
	assert contribution > alert.true_severity, "Should get correlation bonus"
	E AssertionError: Should get correlation bonus
	E assert 0.85 > 0.85
	E + where 0.85 = Alert(id='alert_001', visible_severity=0.8, confidence=0.85, alert_type='CPU', age=1, true_severity=0.85, is_correlated=True, metadata={}).true_severity
	______________ TestHardTaskGrader.test_failure_prevention_bonus _______________
	tests\test_tasks.py:269: in test_failure_prevention_bonus
	assert grader.failures_prevented >= 1, "Should register failure prevention"
	^^^^^^^^^^^^^^^^^^^^^^^^^
	E AttributeError: 'HardTaskGrader' object has no attribute 'failures_prevented'
	_______________ TestHardTaskGrader.test_system_failure_penalty ________________
	tests\test_tasks.py:278: in test_system_failure_penalty
	assert grader.system_failures == 1
	^^^^^^^^^^^^^^^^^^^^^^
	E AttributeError: 'HardTaskGrader' object has no attribute 'system_failures'
	___________ TestHardTaskGrader.test_missed_correlated_alert_penalty ___________
	tests\test_tasks.py:305: in test_missed_correlated_alert_penalty
	assert contribution < -2.0, "Should have extra penalty for correlated miss"
	E AssertionError: Should have extra penalty for correlated miss
	E assert -0.255 < -2.0
	_____________ TestHardTaskGrader.test_correlation_detection_rate ______________
	tests\test_tasks.py:316: in test_correlation_detection_rate
	grader.chains_handled.add(0)
	^^^^^^^^^^^^^^^^^^^^^
	E AttributeError: 'HardTaskGrader' object has no attribute 'chains_handled'
	_______________ TestHardTaskGrader.test_stability_score_perfect _______________
	tests\test_tasks.py:326: in test_stability_score_perfect
	assert stability == 1.0, "Zero failures should give perfect stability"
	E AssertionError: Zero failures should give perfect stability
	E assert 0.99 == 1.0
	=============================== tests coverage ================================
	_______________ coverage: platform win32, python 3.13.6-final-0 _______________

	Name Stmts Miss Cover Missing
	---------------------------------------------------------------------
	src\adaptive_alert_triage\__init__.py 5 0 100%
	src\adaptive_alert_triage\env.py 197 48 76% 138, 269-275, 286-296, 463-464, 485-486, 513-520, 589, 652-684, 693-729, 733
	src\adaptive_alert_triage\models.py 49 1 98% 135
	src\adaptive_alert_triage\server.py 398 398 0% 19-723
	src\adaptive_alert_triage\utils.py 93 21 77% 142, 225, 243, 245, 438-458
	src\adaptive_alert_triage\validate.py 235 235 0% 23-417
	---------------------------------------------------------------------
	TOTAL 977 703 28%
	=========================== short test summary info ===========================
	FAILED tests/test_tasks.py::TestMediumTaskGrader::test_productive_investigation - AttributeError: 'MediumTaskGrader' object has no attribute 'investigations_used'
	FAILED tests/test_tasks.py::TestMediumTaskGrader::test_wasteful_investigation - AssertionError: Wasteful investigation should be penalized
	assert 0.0 < 0.0
	FAILED tests/test_tasks.py::TestMediumTaskGrader::test_critical_missed_penalty - AttributeError: 'MediumTaskGrader' object has no attribute 'critical_missed'
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_correlation_detection - AssertionError: Should get correlation bonus
	assert 0.85 > 0.85
	+ where 0.85 = Alert(id='alert_001', visible_severity=0.8, confidence=0.85, alert_type='CPU', age=1, true_severity=0.85, is_correlated=True, metadata={}).true_severity
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_failure_prevention_bonus - AttributeError: 'HardTaskGrader' object has no attribute 'failures_prevented'
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_system_failure_penalty - AttributeError: 'HardTaskGrader' object has no attribute 'system_failures'
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_missed_correlated_alert_penalty - AssertionError: Should have extra penalty for correlated miss
	assert -0.255 < -2.0
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_correlation_detection_rate - AttributeError: 'HardTaskGrader' object has no attribute 'chains_handled'
	FAILED tests/test_tasks.py::TestHardTaskGrader::test_stability_score_perfect - AssertionError: Zero failures should give perfect stability
	assert 0.99 == 1.0
	======================== 9 failed, 62 passed in 1.14s =========================
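
Five of the nine failures above are AttributeErrors with a common shape: the tests read counters (`investigations_used`, `critical_missed`, `failures_prevented`, `system_failures`, `chains_handled`) that the grader objects do not expose. One possible fix is sketched below, assuming the graders are plain Python classes with a `reset()` method (the passing `test_grader_reset` suggests one exists). The class name `HardTaskGraderSketch` and its whole body are hypothetical; only the attribute names come from the tracebacks, and the real `HardTaskGrader` may be structured differently.

```python
class HardTaskGraderSketch:
    """Hypothetical stand-in; not the project's real HardTaskGrader."""

    def __init__(self) -> None:
        # Route construction through reset() so both paths define the same state.
        self.reset()

    def reset(self) -> None:
        # Counters read by the failing tests; names taken from the tracebacks.
        self.failures_prevented: int = 0
        self.system_failures: int = 0
        # test_correlation_detection_rate calls .add(0), so a set fits here.
        self.chains_handled: set[int] = set()


grader = HardTaskGraderSketch()
grader.chains_handled.add(0)   # the call that currently raises AttributeError
grader.system_failures += 1
grader.reset()                 # all counters return to their initial values
```

Defining the attributes in `reset()` and calling it from `__init__` keeps a fresh grader and a reset grader in the same state. The four AssertionError failures (`0.0 < 0.0`, `0.85 > 0.85`, `-0.255 < -2.0`, `0.99 == 1.0`) concern scoring values rather than missing attributes and would need changes to the grading logic itself, which this log does not show.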