Commit 2530cc5 by Hexa09 (verified) · 1 Parent(s): 33a5c5e

Update README.md

Files changed (1): README.md (+19 -1)
@@ -58,6 +58,24 @@ NEF is a custom serialization framework built from scratch to replace the overhe
 
 ---
 
+## Benchmark Results
+
+Early checkpoint evaluation (step 40,000) on standard zero-shot benchmarks against GPT-2 124M baseline:
+
+![Benchmark Results](assets/benchmark.png)
+
+| Task | Hexa 2B (MoE) | GPT-2 124M | Delta |
+|---|---|---|---|
+| ARC Easy | 26.5% | 43.2% | -16.7% |
+| ARC Challenge | **27.0%** | 22.4% | **+4.6%** |
+| OpenBookQA | **25.0%** | 14.2% | **+10.8%** |
+| WinoGrande | 47.9% | 51.3% | -3.4% |
+| **Average** | **31.6%** | 32.8% | -1.2% |
+
+> Zero-shot evaluation using [EleutherAI lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) v0.4.2 at training step 40,000. 2 out of 4 tasks already exceed GPT-2 124M. Full evaluation pending production training run.
+
+---
+
 ## Prototype Scope
 
 This release validates the following:
@@ -83,5 +101,5 @@ Hexa-1B is the foundation. The production model is next.
 ## About Hexa Innovate
 
 Hexa Innovate is a student-led AI startup based in Bangladesh, focused on building efficient AI execution and serialization infrastructure for open-weight models at the edge.
-Here Founder & Developer Github Profile
+
 **GitHub:** [github.com/Hexa08](https://github.com/Hexa08)
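
The Delta and Average values in the added Benchmark Results table can be re-derived from the per-task scores. The sketch below is a sanity check only, not part of the commit: the `scores` dictionary is transcribed by hand from the table, the helper name `summarize` is hypothetical, and it assumes that Delta means Hexa minus GPT-2 in percentage points and that Average is the unweighted mean over the four tasks.

```python
# Per-task accuracy in percent, transcribed from the table in this diff:
# task -> (Hexa 2B (MoE), GPT-2 124M).
scores = {
    "ARC Easy": (26.5, 43.2),
    "ARC Challenge": (27.0, 22.4),
    "OpenBookQA": (25.0, 14.2),
    "WinoGrande": (47.9, 51.3),
}


def summarize(scores):
    """Recompute deltas (percentage points), unweighted averages, and wins."""
    # Assumed definition: Delta = Hexa score minus GPT-2 score.
    deltas = {task: round(hexa - gpt2, 1) for task, (hexa, gpt2) in scores.items()}
    n = len(scores)
    hexa_avg = round(sum(hexa for hexa, _ in scores.values()) / n, 1)
    gpt2_avg = round(sum(gpt2 for _, gpt2 in scores.values()) / n, 1)
    # Tasks where the Hexa checkpoint scores above the baseline.
    wins = [task for task, delta in deltas.items() if delta > 0]
    return deltas, hexa_avg, gpt2_avg, wins


deltas, hexa_avg, gpt2_avg, wins = summarize(scores)
for task, delta in deltas.items():
    print(f"{task}: {delta:+.1f} pp")
print(f"Average: {hexa_avg:.1f} vs {gpt2_avg:.1f} ({hexa_avg - gpt2_avg:+.1f} pp)")
print(f"Tasks above baseline: {len(wins)} of {len(scores)}")
```

Under those assumptions the recomputed figures agree with the table as committed, including the "2 out of 4 tasks" statement in the note.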