Update README.md
README.md
CHANGED
@@ -9,12 +9,12 @@ tags:
 - nef
 - solo-developer
 - bangladesh-ai
--
+- 2b-parameters
 pipeline_tag: text-generation
 library_name: pytorch
 ---
 
-# Hexa-
+# Hexa-2B — NEF Serialization Prototype
 
 **Founder:** Madhab — Engineering Student, Cox's Bazar, Bangladesh
 **Organization:** Hexa Innovate
@@ -25,7 +25,7 @@ library_name: pytorch
 
 ## What This Is
 
-Hexa-
+Hexa-2B is a 2-billion parameter language model built as a **technical proof-of-concept for the NEF serialization framework**. The goal of this release is singular: demonstrate that NEF can correctly serialize, store, and load a billion-scale model on accessible hardware without dependency on standard bloated AI libraries.
 
 This is not a general-purpose chat model. Inference quality is intentionally deferred to the production training run. What this prototype proves is the infrastructure layer — and that is the point.
 
@@ -51,7 +51,7 @@ NEF is a custom serialization framework built from scratch to replace the overhe
 | Property | Detail |
 |---|---|
 | Architecture | HexaDense (Transformer Decoder) |
-| Parameters |
+| Parameters | 2 Billion (0.27B active via MoE) |
 | Serialization | NEF (Neural Essence Format) |
 | Training hardware | Dual NVIDIA Tesla T4 (cloud compute credits) |
 | Languages | English, Bengali |
@@ -94,7 +94,7 @@ I am a Diploma in Engineering student from Cox's Bazar, Bangladesh. Every compon
 
 Most billion-parameter models come from large teams with large budgets. This one did not. The constraint was the design brief.
 
-Hexa-
+Hexa-2B is the foundation. The production model is next.
 
 ---
 
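As a rough sanity check on the new Parameters row, the sketch below works out the share of parameters that are active and the raw weight size at a few common precisions. The byte widths are the standard sizes of those number formats. The precision NEF actually stores is not stated in this change, so the dtype choices and all names here are assumptions for illustration only.

```python
# Back-of-the-envelope check of the figures in the Parameters row.
# Assumption: "2 Billion" and "0.27B" are taken as exact values.
# Assumption: the dtypes below are illustrative; the README change does
# not say which precision NEF uses on disk.

TOTAL_PARAMS = 2_000_000_000   # "2 Billion"
ACTIVE_PARAMS = 270_000_000    # "0.27B active via MoE"

# Standard storage width of one value in each format, in bytes.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def active_fraction(active: int, total: int) -> float:
    """Share of the parameters that take part in one forward pass."""
    return active / total


def weight_size_gb(num_params: int, dtype: str) -> float:
    """Raw size of the weights alone, in decimal gigabytes (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share: {share:.1%}")
    for dtype in BYTES_PER_PARAM:
        size = weight_size_gb(TOTAL_PARAMS, dtype)
        print(f"{dtype}: {size:.1f} GB of weights")
```

These numbers cover the weights only. Optimizer state, activations, and any container overhead from the file format come on top of them.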