Austin207
/

Map-NEO

@@ -53,10 +53,10 @@ model-index:
 ## Key Features
--  **Efficient Training**: Trained on RTX 5070 (8GB VRAM) in ~4 hours
 -  **Extended Context**: 16,384 token context window (16x typical small models)
 -  **Memory Efficient**: Only 1.3GB VRAM for 1,800 tokens inference
--  **Fast Inference**: ~10 tokens/second on consumer GPU
 -  **High Quality Data**: Trained on curated RefinedWeb subset
 ## Architecture Details
@@ -96,7 +96,7 @@ model-index:
 ## Training Procedure
 ### Training Configuration
-- **Hardware**: NVIDIA RTX 5070 (8GB VRAM)
 - **Precision**: bfloat16 mixed precision
 - **Batch Size**: 1 per device
 - **Gradient Accumulation**: 32 steps
@@ -128,7 +128,7 @@ model-index:
 - **Convergence**: Smooth loss curve, no overfitting
 ### Inference Performance
-- **Speed**: ~10 tokens/second (RTX 5070)
 - **Memory Usage**: 1.3GB for 1,800 token context
 - **Context Limit**: 3,600 tokens practical limit
 - **Temperature**: Recommended 0.7-0.9 for creative tasks
@@ -221,26 +221,13 @@ python interactive_chat.py
   title={MAP-NEO Mini: An Efficient 253M Parameter Language Model},
   author={[Antony Austin]},
   year={2025},
-  howpublished={\url{https://huggingface.co/[Austin207]/map-neo-mini}},
-  note={Trained on NVIDIA RTX 5070 with RefinedWeb data}
 }
 ```
 ## Technical Details
-### Files Structure
-```
-map-neo-mini/
-├── config.json                 # Model configuration
-├── pytorch_model.bin           # Model weights
-├── tokenizer.json              # Tokenizer configuration
-├── tokenizer_config.json       # Tokenizer metadata
-├── special_tokens_map.json     # Special tokens
-├── vocab.json                  # Vocabulary
-├── merges.txt                  # BPE merges
-└── model_neo.py               # Model architecture code
-```
 ### Hardware Requirements
 - **Minimum**: 4GB VRAM for inference
 - **Recommended**: 8GB VRAM for extended context

 ## Key Features
+-  **Efficient Training**: Trained on RTX 5070 Laptop GPU (8GB VRAM) in ~4 hours
 -  **Extended Context**: 16,384 token context window (16x typical small models)
 -  **Memory Efficient**: Only 1.3GB VRAM for 1,800 tokens inference
+-  **Fast Inference**: ~150+ tokens/second on consumer GPU
 -  **High Quality Data**: Trained on curated RefinedWeb subset
 ## Architecture Details
 ## Training Procedure
 ### Training Configuration
+- **Hardware**: NVIDIA RTX 5070 Laptop GPU (8GB VRAM)
 - **Precision**: bfloat16 mixed precision
 - **Batch Size**: 1 per device
 - **Gradient Accumulation**: 32 steps
 - **Convergence**: Smooth loss curve, no overfitting
 ### Inference Performance
+- **Speed**: ~150+ tokens/second (RTX 5070)
 - **Memory Usage**: 1.3GB for 1,800 token context
 - **Context Limit**: 3,600 tokens practical limit
 - **Temperature**: Recommended 0.7-0.9 for creative tasks
   title={MAP-NEO Mini: An Efficient 253M Parameter Language Model},
   author={[Antony Austin]},
   year={2025},
+  howpublished={\url{https://huggingface.co/Austin207/Map-NEO}},
+  note={Trained on NVIDIA RTX 5070 Laptop GPU with RefinedWeb data}
 }
 ```
 ## Technical Details
 ### Hardware Requirements
 - **Minimum**: 4GB VRAM for inference
 - **Recommended**: 8GB VRAM for extended context