Hexa09 committed on
Commit eff4789 · verified · 1 Parent(s): 790681e

Update README.md

Files changed (1)
  1. README.md +26 -34
README.md CHANGED
@@ -4,61 +4,53 @@ language:
  - en
  - bn
  tags:
+ - student-startup
+ - zero-to-one
  - nef
- - hexa
  - solo-developer
- - neural-essence-format
- - text-generation
  - bangladesh-ai
+ - 1b-parameters
  pipeline_tag: text-generation
  library_name: pytorch
  ---

- # Hexa-1B (Prototype)
+ # Hexa-1B (Student-Led Prototype)

- **Developed by:** Madhab (Founder, Hexa Innovate Org)
- **Architecture:** HexaDense (Transformer Decoder)
+ **Founder:** Madhab (Engineering Student)
+ **Organization:** Hexa Innovate (Early-Stage Startup)
  **Format:** [NEF (Neural Essence Format)](https://github.com/Hexa08/NEF)
- **Status:** Research Prototype (1.1 Billion Parameters)
+ **Capital:** $0 Budget Prototype

  ---

- ## Model Summary
- Hexa-1B is a 1.1-billion parameter large language model engineered as a proof-of-concept for the Neural Essence Format (NEF). This project demonstrates the feasibility of building and training billion-scale transformer architectures by a solo developer using an optimized, modular serialization framework.
+ ## The $0 to $B Vision
+ Hexa-1B is a 1.1-billion parameter language model built to prove that world-class AI infrastructure can be engineered by a single student with zero external funding. This project represents the transition from a localized student experiment to a scalable AI startup. It is built on the belief that the next billion-dollar intelligence layers will come from high-efficiency engineering, not just high-budget labs.

  ## Technical Framework: NEF
- This model utilizes the Neural Essence Format (NEF) for weight serialization and architectural definition. NEF is designed to provide a high-performance alternative to traditional model formats, focusing on:
- * **Binary Efficiency:** Optimized for rapid loading and minimal overhead.
- * **Modular Logic:** Tailored for seamless integration with custom inference engines.
- * **Streamlined Execution:** Reduced dependency footprint for deployment in resource-constrained environments.
+ This model is powered by the Neural Essence Format (NEF), a custom serialization framework developed to bypass the bloat of standard AI libraries.
+ * **Solo Engineering:** Built from scratch to allow large-scale models to run on accessible hardware.
+ * **Architecture:** HexaDense (Transformer Decoder).
+ * **Innovation:** NEF focuses on the "essence" of the weights, allowing for faster loading and execution in resource-constrained environments.

  Repository: [github.com/Hexa08/NEF](https://github.com/Hexa08/NEF)

- ## Model Specifications
- * **Parameters:** 1.1 Billion
- * **Hidden Size:** 1536
- * **Layers:** 16
- * **Attention Heads:** 16
- * **Context Window:** 2048 Tokens
- * **Training Hardware:** 2x NVIDIA Tesla T4
- * **Precision:** FP16 (Half Precision)
+ ## Student Achievement Metrics
+ * **Scale:** 1.1 Billion Parameters managed solo.
+ * **Execution:** Designed and trained by one student in Cox's Bazar, Bangladesh.
+ * **Efficiency:** Leveraging dual NVIDIA Tesla T4 GPUs to handle billion-scale logic.
+ * **Hardware:** Developed on a single laptop and trained via cloud-compute credits.

- ## Solo Developer Milestone
- The development of Hexa-1B and the NEF framework was conducted entirely by a single engineer based in Cox's Bazar, Bangladesh. The project scope included:
- * Designing the transformer architecture in PyTorch.
- * Developing the NEF binary serialization format.
- * Managing the 18-hour training execution on a dual-GPU cluster.
-
- This prototype validates that localized, high-capacity AI infrastructure can be established through efficient engineering rather than massive team overhead.
+ ## Founder's Narrative
+ I am a student currently pursuing a Diploma in Engineering. While most billion-parameter models are the product of large corporate teams, Hexa-1B is a solo effort. Every line of code in the HexaDense architecture and every byte in the NEF format was engineered to prove that a student from Bangladesh can compete at the architectural level of global AI.

- ## Current Limitations and Research Status
- This repository hosts a prototype version of Hexa-1B. During the training phase, the model reached a 0.0000 loss state, resulting in Mode Collapse (extreme overfitting).
- * **Observed Behavior:** The model currently produces repetitive outputs and high-frequency token loops.
- * **Objective:** This release is intended for architectural inspection and to showcase the performance of the NEF framework in handling billion-parameter weights.
+ ## Current Research Status
+ This is a prototype release. Due to the high-intensity 18-hour training run on a $0 budget, the model reached 0.0000 loss, leading to significant Mode Collapse (overfitting).
+ * **Purpose:** This repository serves as a technical demonstration of the NEF framework's ability to serialize and load 1.1B parameters efficiently.
+ * **Future:** This prototype is the foundation for our next-generation, high-diversity training run.

  ---

- ### About Hexa Innovate Org
- Hexa Innovate Org is dedicated to building efficient, high-speed AI infrastructure in Bangladesh. We focus on localized intelligence and hardware-optimized execution layers.
+ ### About Hexa Innovate
+ Hexa Innovate is a student-led startup based in Bangladesh. We are focused on building the most efficient AI execution layer in the world. We are starting from zero to build the future of localized intelligence.

  **GitHub:** [Hexa08](https://github.com/Hexa08)
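
The removed Model Specifications list gave 1.1 billion parameters at FP16 precision, and the new text still describes serializing and loading 1.1B parameters. A rough sanity check on what that implies for raw weight storage could look like the sketch below. It assumes exactly 1.1e9 parameters at 2 bytes each and ignores any NEF header or metadata overhead, which the README does not describe. The function name `fp16_weight_bytes` is hypothetical and is not part of NEF.

```python
def fp16_weight_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Estimate raw weight storage: one fixed-width value per parameter.

    FP16 stores each value in 2 bytes; container overhead is not modeled.
    """
    return num_params * bytes_per_param


# Assumption: "1.1 Billion" is taken as exactly 1.1e9 parameters.
params = 1_100_000_000
size_bytes = fp16_weight_bytes(params)

print(f"{size_bytes / 1e9:.2f} GB")      # decimal gigabytes -> 2.20 GB
print(f"{size_bytes / 2**30:.2f} GiB")   # binary gibibytes  -> 2.05 GiB
```

Under these assumptions the weights alone occupy about 2.2 GB, which fits within the 16 GB of memory on a single Tesla T4. Training needs considerably more than this, since gradients and optimizer state are held alongside the weights.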