Upload Jojo LLM model
- README.md +6 -6
- model.safetensors +1 -1
README.md
CHANGED
@@ -9,7 +9,7 @@ tags:
 - jojo-llm
 - pytorch
 datasets:
--
+- TinyStoriesV2
 metrics:
 - perplexity
 model-index:
@@ -19,8 +19,8 @@ model-index:
 type: text-generation
 name: Text Generation
 dataset:
-type:
-name:
+type: TinyStoriesV2
+name: Tinystoriesv2
 metrics:
 - type: perplexity
 value: N/A
@@ -31,7 +31,7 @@ model-index:
 
 ## Model Description
 
-jasonacox/jojo-124M is a GPT-style language model trained using the Jojo LLM training framework. This model was fine-tuned on the
+jasonacox/jojo-124M is a GPT-style language model trained using the Jojo LLM training framework. This model was fine-tuned on the TinyStoriesV2 dataset and is designed for text generation tasks.
 
 ## Model Details
 
@@ -46,14 +46,14 @@ jasonacox/jojo-124M is a GPT-style language model trained using the Jojo LLM tra
 - **Hidden Size**: 768
 - **Attention Heads**: 12
 - **Context Length**: 1024 tokens
-- **Vocabulary Size**: 50,304
+- **Vocabulary Size**: 50,304 tokens
 - **Total Parameters**: 219.6M
 
 ## Training Details
 
 ### Training Data
 
-The model was trained on the **
+The model was trained on the **TinyStoriesV2** dataset.
 
 ### Training Procedure
 
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:97659e5ce53bd703510b76936433f5901477f0ed2b43163d573d6e5f40b45650
 size 497918592
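The `model.safetensors` entry above is a Git LFS pointer: it records the SHA-256 digest and byte size of the real weights file, not the weights themselves. A minimal sketch of checking a downloaded copy against that pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, introduced here for illustration, and the local file path in the usage note is an assumption.

```python
import hashlib
import os

# Pointer contents as shown in the new side of the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:97659e5ce53bd703510b76936433f5901477f0ed2b43163d573d6e5f40b45650
size 497918592
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    # Cheap check first: a size mismatch avoids hashing the whole file.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so a large weights file is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Assuming the weights were downloaded to a local file named `model.safetensors`, the check would be `matches_pointer("model.safetensors", parse_lfs_pointer(POINTER_TEXT))`.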