Varshith dharmaj committed
Commit a5b8adf · verified · 1 Parent(s): 498fabe

Upload gate_benchmark_results.txt with huggingface_hub

Files changed (1)
  1. gate_benchmark_results.txt +37 -0
gate_benchmark_results.txt ADDED
@@ -0,0 +1,37 @@
+ ============================================================
+ 🎓 MVM² ADVANCED COMPETITIVE EXAM BENCHMARK (GATE / JEE)
+ ============================================================
+ Total problems queued: 5
+
+ [EVALUATING] GATE (CS) - Linear Algebra
+ Problem: Let M be a 2x2 matrix such that M = [[4, 1], [2, 3]]. Find the sum of the eigenv...
+ [ATTENTION] System flagged errors in logic: []
+ -> Result: ❌ FLAGGED | Confidence: 57.4% | Latency: 4.747s
+
+ [EVALUATING] JEE Advanced - Calculus
+ Problem: Evaluate the definite integral of x * e^x from x=0 to x=1....
+ [ATTENTION] System flagged errors in logic: []
+ -> Result: ❌ FLAGGED | Confidence: 43.8% | Latency: 5.266s
+
+ [EVALUATING] GATE (EC) - Probability
+ Problem: A box contains 4 red balls and 6 black balls. Three balls are drawn at random wi...
+ [ATTENTION] System flagged errors in logic: []
+ -> Result: ❌ FLAGGED | Confidence: 58.3% | Latency: 1.392s
+
+ [EVALUATING] JEE Mains - Kinematics Paradox
+ Problem: A particle moves such that its velocity v is given by v = t^2 - 4t + 3. Find the...
+ [ATTENTION] System flagged errors in logic: []
+ -> Result: ❌ FLAGGED | Confidence: 58.3% | Latency: 1.385s
+
+ [EVALUATING] GATE (ME) - Differential Equations
+ Problem: Solve the initial value problem dy/dx = 2xy, y(0) = 1. Find y at x = 1....
+ [ATTENTION] System flagged errors in logic: []
+ -> Result: ❌ FLAGGED | Confidence: 58.3% | Latency: 1.504s
+
+ ============================================================
+ 🏆 FINAL COMPETITIVE BENCHMARK METRICS
+ ============================================================
+ Advanced Exam Accuracy: 0.0% (Expected > 85%)
+ Average Confidence: 55.2%
+ Average Latency: 2.859s
+ ============================================================
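
The summary metrics can be cross-checked against the per-problem lines. The sketch below is not the benchmark's own code: the `results` list is transcribed by hand from the log, `summarize` is a hypothetical helper, and treating "accuracy" as the share of problems that were not flagged is an assumption about how the benchmark scores a run.

```python
from statistics import mean

# (label, flagged, confidence in %, latency in seconds), copied from the log.
results = [
    ("GATE (CS) - Linear Algebra", True, 57.4, 4.747),
    ("JEE Advanced - Calculus", True, 43.8, 5.266),
    ("GATE (EC) - Probability", True, 58.3, 1.392),
    ("JEE Mains - Kinematics Paradox", True, 58.3, 1.385),
    ("GATE (ME) - Differential Equations", True, 58.3, 1.504),
]


def summarize(rows):
    """Recompute the three summary figures from per-problem rows.

    Assumption: accuracy is the percentage of rows that were NOT flagged.
    """
    passed = sum(1 for _, flagged, _, _ in rows if not flagged)
    return {
        "accuracy": round(100.0 * passed / len(rows), 1),
        "avg_confidence": round(mean(conf for _, _, conf, _ in rows), 1),
        "avg_latency": round(mean(lat for _, _, _, lat in rows), 3),
    }


summary = summarize(results)
print(f"Accuracy: {summary['accuracy']}%")
print(f"Average Confidence: {summary['avg_confidence']}%")
print(f"Average Latency: {summary['avg_latency']}s")
```

Under these assumptions the recomputed values agree with the reported 0.0%, 55.2%, and 2.859s, so the summary block is consistent with the five per-problem results.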