Update model - STAGE1 Epoch 1 | Loss: 5.6777
- README.md +11 -10
- pytorch_model.bin +2 -2
- training_info.json +7 -6
README.md
CHANGED
@@ -10,7 +10,7 @@ tags:
  - tiny-vlm
  - repvit
  - tinyllm
- -
+ - stage1
  base_model:
  - tinyllm
  library_name: transformers
@@ -21,23 +21,24 @@ pipeline_tag: image-text-to-text

  **Efficient Vision-Language Model for Edge Deployment & Robotic Applications**

- This model is currently in training - **
+ This model is currently in training - **STAGE1 (Epoch 1)**.

  ## Current Training Status

- - **Stage**:
+ - **Stage**: Visual-Language Alignment - Learning to ground vision and language
  - **Epoch**: 1
- - **Last Updated**: 2026-01
+ - **Last Updated**: 2026-02-01 16:00:11 UTC

  ### Latest Metrics

- - **
- - **
+ - **captioning_loss**: 8.5561
+ - **contrastive_loss**: 2.7994
+ - **loss**: 5.6777

  ## Model Architecture

  - **Size**: Tiny (~35M parameters)
- - **Total Parameters**:
- - **Trainable Parameters**:
+ - **Total Parameters**: 40,196,257
+ - **Trainable Parameters**: 26,212,929 (65.2%)
  - **Vision Encoder**: RepViT-M0.9 (~5M params)
  - **Language Model**: TinyLLM-30M (30M params)

@@ -50,7 +51,7 @@ EmberVLM follows a 4-stage training curriculum:
  3. ✅ **Stage 3: Robot Fleet Selection** - Task-robot matching
  4. ⏳ **Stage 4: Chain-of-Thought Reasoning** - Reasoning generation

- **Current Stage**:
+ **Current Stage**: STAGE1

  ## Usage

@@ -125,5 +126,5 @@ Apache 2.0

  ---

- **Note**: This is a checkpoint from
+ **Note**: This is a checkpoint from stage1 training (epoch 1).
  The model will be updated after each epoch with improved performance.
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:
- size
+ oid sha256:c6be11d39bd7c475a6e51883249a0a9ba175c11618e424678674cb2ef649fe66
+ size 100663623
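The `pytorch_model.bin` entry above is a Git LFS pointer, not the weights themselves: it records the SHA-256 digest and byte size of the real file. A downloaded checkpoint could be checked against such a pointer roughly as follows. This is a minimal sketch; the helper names `parse_lfs_pointer` and `matches_pointer` and the local path in the usage note are hypothetical, and only the `oid` and `size` values come from the diff.

```python
import hashlib
from pathlib import Path

# Pointer contents as shown on the new side of the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c6be11d39bd7c475a6e51883249a0a9ba175c11618e424678674cb2ef649fe66
size 100663623
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer_text: str) -> bool:
    """Return True if the file's byte size and SHA-256 match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    expected_oid = fields["oid"].split(":", 1)[1]  # drop the "sha256:" prefix
    expected_size = int(fields["size"])

    digest = hashlib.sha256()
    size = 0
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large checkpoints are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == expected_size and digest.hexdigest() == expected_oid
```

Assuming the checkpoint was saved locally as `pytorch_model.bin`, calling `matches_pointer(Path("pytorch_model.bin"), POINTER)` would return `True` only for a byte-identical download.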
training_info.json
CHANGED
@@ -1,14 +1,15 @@
  {
-   "stage": "
+   "stage": "stage1",
    "epoch": 1,
    "metrics": {
-     "loss": 5.
-     "
+     "loss": 5.6777140368586005,
+     "contrastive_loss": 2.7993588654891304,
+     "captioning_loss": 8.556068959443465
    },
    "carbon_emissions_kg": 0.0,
-   "timestamp": "2026-
+   "timestamp": "2026-02-01T16:00:11.852746",
    "vision_backbone": "repvit",
    "language_backbone": "tinyllm",
-   "total_parameters":
-   "trainable_parameters":
+   "total_parameters": 40196257,
+   "trainable_parameters": 26212929
  }
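The figures quoted in the README can be cross-checked against `training_info.json` with a few lines of arithmetic. The sketch below recomputes the trainable-parameter percentage and the four-decimal loss from the values in the diff. It also compares the total loss with the plain mean of the two component losses. That comparison is only an observation about these numbers, not a statement of how the training code combines them. The `summarize` helper is hypothetical.

```python
import json

# Subset of the new training_info.json, copied from the diff above.
TRAINING_INFO = """
{
  "stage": "stage1",
  "epoch": 1,
  "metrics": {
    "loss": 5.6777140368586005,
    "contrastive_loss": 2.7993588654891304,
    "captioning_loss": 8.556068959443465
  },
  "total_parameters": 40196257,
  "trainable_parameters": 26212929
}
"""


def summarize(info: dict) -> dict:
    """Derive the README-style figures from the raw training info."""
    metrics = info["metrics"]
    trainable_pct = 100.0 * info["trainable_parameters"] / info["total_parameters"]
    # Assumption for illustration: compare against an equal-weight mean.
    mean_of_parts = (metrics["contrastive_loss"] + metrics["captioning_loss"]) / 2
    return {
        "trainable_pct": round(trainable_pct, 1),
        "loss_4dp": round(metrics["loss"], 4),
        "mean_of_parts": mean_of_parts,
    }


summary = summarize(json.loads(TRAINING_INFO))
print(summary)
```

For this checkpoint the percentage rounds to the 65.2% shown in the README, and the reported loss agrees with the mean of the two components to within about 1e-6.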