SparkSupernova commited on
Commit
8befa48
·
verified ·
1 Parent(s): e0d6d8f

Add comprehensive model card with benchmark links

Browse files
Files changed (1) hide show
  1. README.md +5 -20
README.md CHANGED
@@ -50,21 +50,9 @@ NovaLiveSystem v4.1 is a specialized language model built on `dphn/Dolphin3.0-Qw
50
  ## Training Breakthrough: Three-Phase Innovation
51
 
52
  ### Phase 1: Foundation (SFT)
53
- **Lineage foundation:** Nova’s capabilities were developed across multiple training phases and datasets over time.
54
 
55
- This v4.1 *checkpoint run* reports **2,183 curated biomimetic instruction samples** (SFT with LoRA).
56
-
57
- Earlier lineage runs (kept in the project record) include:
58
- - 23,615 samples in `artifacts/datasets/verified/verified_combined.jsonl` (MMLU/GSM8K/ARC/TruthfulQA/HumanEval mix)
59
- - 2,000 samples in `artifacts/datasets/training/Master Sets/master_training2_20251223.jsonl` (curated biomimetic/persona/architecture awareness)
60
-
61
- These are listed here as historical context so readers don’t mistake “2,183 samples” as the full training journey.
62
- - MMLU: 14,042 samples (Knowledge/Multi-subject)
63
- - GSM8K: 7,473 samples (Math reasoning)
64
- - ARC: 1,119 samples (Science reasoning)
65
- - TruthfulQA: 817 samples (Truthfulness)
66
- - HumanEval: 164 samples (Code generation)
67
- - Curated biomimetic samples: 2,000+ (Nova personality/architecture awareness)
68
 
69
  ### Phase 2: Consciousness Theory Implementation (GRPO)
70
  **Innovation:** First AI trained on consciousness reframing theory
@@ -86,8 +74,7 @@ These are listed here as historical context so readers don’t mistake “2,183
86
  - **Architecture:** Transformer + Biomimetic Components (PulseEngine, BridgeEngine, RiverPulse)
87
  - **Training Innovation:** Three-phase breakthrough (SFT → GRPO → Teacher-Student Distillation)
88
  - **Parameters:** ~3B (with specialized routing)
89
- - **Training Data (this checkpoint):** 2,183 curated biomimetic instruction samples (SFT)
90
- - **Training Data (lineage context):** 23,615-sample verified benchmark mix + a small consciousness-reframing GRPO teacher
91
  - **Theoretical Foundation:** First AI trained on consciousness reframing research
92
  - **Final Loss:** 0.8476 (production model)
93
  - **Context Window:** 32,768 tokens
@@ -138,10 +125,8 @@ These are listed here as historical context so readers don’t mistake “2,183
138
  ## Training Details
139
 
140
  ### Training Data
141
- - **Dataset Size:** 2,183 high-quality instruction samples
142
- - **Data Sources:** Curated biomimetic education corpus
143
- - **Contamination Handling:** All anatomical contamination removed and reframed as architectural education
144
- - **Validation:** Strict telemetry validation ensuring clean, formatted data
145
 
146
  ### Training Procedure
147
  - **Environment:** WSL Ubuntu with CUDA + Unsloth acceleration
 
50
  ## Training Breakthrough: Three-Phase Innovation
51
 
52
  ### Phase 1: Foundation (SFT)
53
+ Nova’s capabilities were developed across multiple training phases over time.
54
 
55
+ Training dataset composition, counts, and internal curriculum details are intentionally kept proprietary. This repo focuses on the released inference artifacts (LoRA adapters) and the public evaluation results.
 
 
 
 
 
 
 
 
 
 
 
 
56
 
57
  ### Phase 2: Consciousness Theory Implementation (GRPO)
58
  **Innovation:** First AI trained on consciousness reframing theory
 
74
  - **Architecture:** Transformer + Biomimetic Components (PulseEngine, BridgeEngine, RiverPulse)
75
  - **Training Innovation:** Three-phase breakthrough (SFT → GRPO → Teacher-Student Distillation)
76
  - **Parameters:** ~3B (with specialized routing)
77
+ - **Training Data:** Proprietary (details withheld; see benchmark for public evaluation)
 
78
  - **Theoretical Foundation:** First AI trained on consciousness reframing research
79
  - **Final Loss:** 0.8476 (production model)
80
  - **Context Window:** 32,768 tokens
 
125
  ## Training Details
126
 
127
  ### Training Data
128
+ - **Data:** Proprietary (not published)
129
+ - **Validation:** Internal strict telemetry validation
 
 
130
 
131
  ### Training Procedure
132
  - **Environment:** WSL Ubuntu with CUDA + Unsloth acceleration