Spaces: ybchen928/oncall-guide-ai
main
oncall-guide-ai/evaluation
661 kB
6 contributors
History:
25 commits
YanBoChen
Merge branch 'Merged20250805' into Merged20250811
4ad2c7c
10 months ago
modules
Refactor evaluation modules and add hospital chart generation
10 months ago
old
Before Run the 1st Evalation: Add Precision & MRR Chart Generator and a sample test query
10 months ago
results
Merge pull request #14 from YanBoChen0928/Jeff
10 months ago
README_HOSPITAL_CUSTOMIZATION.md
10.2 kB
feat(evaluation): add comprehensive hospital customization evaluation system
10 months ago
TEMP_MRR_complexity_fix.md
4.84 kB
Enhance evaluation framework with comprehensive metrics and improved query complexity analysis, temp bug fixing about metric 7-8
10 months ago
direct_llm_evaluator.py
22.2 kB
Update query file references for full evaluation and improve user prompts in evaluation scripts (before optimized_general_pipeline)
10 months ago
fixed_judge_evaluator.py
17.7 kB
Enhance evaluation framework with comprehensive metrics and improved query complexity analysis, temp bug fixing about metric 7-8
10 months ago
generate_combined_comparison_chart.py
8.56 kB
feat(evaluation): add visualization generators for generating png files
10 months ago
generate_comparison_report.py
18.8 kB
feat(evaluation): add comprehensive hospital customization evaluation system
10 months ago
generate_execution_time_table.py
7.6 kB
feat(evaluation): add visualization generators for generating png files
10 months ago
generate_hospital_charts.py
7.84 kB
Refactor evaluation modules and add hospital chart generation
10 months ago
generate_individual_analysis_charts.py
17.4 kB
Refactor evaluation modules and add hospital chart generation
10 months ago
generate_individual_rag_vs_direct_charts.py
12.9 kB
feat(evaluation): add visualization generators for generating png files
10 months ago
hospital_customization_evaluator.py
26.5 kB
feat(evaluation): add comprehensive hospital customization evaluation system
10 months ago
latency_evaluator.py
41.5 kB
Update query file references for full evaluation and improve user prompts in evaluation scripts (before optimized_general_pipeline)
10 months ago
metric1_latency_chart_generator.py
13.6 kB
Before Run the 1st Evalation: Add Precision & MRR Chart Generator and a sample test query
10 months ago
metric2_extraction_chart_generator.py
8.63 kB
Before Run the 1st Evalation: Add Precision & MRR Chart Generator and a sample test query
10 months ago
metric3_relevance_chart_generator.py
9.93 kB
Update threshold values in latency evaluator and coverage chart generator; enhance precision and MRR analysis with corrected thresholds and new chart generator for detailed metrics visualization.
10 months ago
metric4_coverage_chart_generator.py
9.32 kB
Update threshold values in latency evaluator and coverage chart generator; enhance precision and MRR analysis with corrected thresholds and new chart generator for detailed metrics visualization.
10 months ago
metric5_6_judge_evaluator_manual.md
9.86 kB
Add multi-system evaluation support for clinical actionability and evidence quality metrics
10 months ago
metric5_6_llm_judge_chart_generator.py
19.9 kB
Enhance evaluation framework with comprehensive metrics and improved query complexity analysis, temp bug fixing about metric 7-8
10 months ago
metric5_6_llm_judge_evaluator.py
30.3 kB
Enhance Direct LLM Evaluator and Judge Evaluator:
10 months ago
metric7_8_precision_MRR.py
19 kB
Enhance evaluation framework with comprehensive metrics and improved query complexity analysis, temp bug fixing about metric 7-8
10 months ago
metric7_8_precision_mrr_chart_generator.py
23.8 kB
Update threshold values in latency evaluator and coverage chart generator; enhance precision and MRR analysis with corrected thresholds and new chart generator for detailed metrics visualization.
10 months ago
pre_user_query_evaluate.txt
330 Bytes
Update query file references for full evaluation and correct typo in pre_user_query_evaluate.txt for pre-test.
10 months ago
rag_vs_direct_latency_chart_generator.py
14.7 kB
Add RAG vs Direct Latency Comparison Chart Generator for performance analysis
10 months ago
run_hospital_evaluation.py
3.58 kB
feat(evaluation): add comprehensive hospital customization evaluation system
10 months ago
run_rag_vs_direct_comparison.py
17.4 kB
Refactor evaluation modules and add hospital chart generation
10 months ago
single_test_query.txt
127 Bytes
Add comprehensive evaluation reports and execution time breakdown for Hospital Customization System
10 months ago
user_query.txt
1.52 kB
Update query file references for full evaluation and improve user prompts in evaluation scripts (before optimized_general_pipeline)
10 months ago
validate_expected_results.py
9.24 kB
Refactor evaluation modules and add hospital chart generation
10 months ago