v0.32.0
See https://github.com/quic/ai-hub-models/releases/v0.32.0 for changelog.
- .gitattributes +1 -0
- DEPLOYMENT_MODEL_LICENSE.pdf +3 -0
- LICENSE +2 -0
- README.md +0 -3
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+DEPLOYMENT_MODEL_LICENSE.pdf filter=lfs diff=lfs merge=lfs -text
DEPLOYMENT_MODEL_LICENSE.pdf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
+size 109629
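The three added lines form a Git LFS pointer: the repository tracks this small text stub, while the PDF itself lives in LFS storage. As a rough illustration only, the sketch below parses such a pointer and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer contents copied from the hunk above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
size 109629
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
```

With the real PDF read into `data`, `matches_pointer(data, pointer)` would confirm that the download corresponds to this commit's pointer.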
LICENSE
ADDED
@@ -0,0 +1,2 @@
+The license of the original trained model can be found at https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md.
+The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found in DEPLOYMENT_MODEL_LICENSE.pdf.
README.md
CHANGED
@@ -28,16 +28,13 @@ This model is an implementation of IBM-Granite-v3.1-8B-Instruct found [here](htt
 - **Model Stats:**
 - Input sequence length for Prompt Processor: 128
 - Context length: 4096
-- Number of parameters: 8B
 - Precision: w4a16 + w8a16 (few layers)
 - Num of key-value heads: 8
 - Information about the model parts: Prompt Processor and Token Generator are split into 5 parts each. Each corresponding Prompt Processor and Token Generator part share weights.
-- Prompt processor model size: 4.8 GB
 - Prompt processor input (part1): 128 tokens
 - Prompt processor output (part1): Embeddings output
 - Prompt processor input (other parts): 128 tokens + KVCache initialized with pad token
 - Prompt processor output (other parts): 128 output tokens + KVCache for token generator
-- Token generator model size: 4.8 GB
 - Token generator input (part1): 1 token
 - Token generator output (part1): Embeddings output
 - Token generator input (other parts): 1 input token + past KVCache
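The stats retained in the README (a 128-token prompt processor input and a 4096-token context) suggest some simple arithmetic. The sketch below assumes, without confirmation from the README, that a longer prompt is padded and fed to the prompt processor in consecutive 128-token chunks. The function names are hypothetical.

```python
import math

PROMPT_CHUNK = 128      # "Input sequence length for Prompt Processor" from the README
CONTEXT_LENGTH = 4096   # "Context length" from the README


def prompt_processor_passes(num_prompt_tokens: int) -> int:
    """Number of 128-token prompt processor passes, assuming chunked processing."""
    if not 0 < num_prompt_tokens <= CONTEXT_LENGTH:
        raise ValueError("prompt must contain between 1 and 4096 tokens")
    return math.ceil(num_prompt_tokens / PROMPT_CHUNK)


def remaining_generation_budget(num_prompt_tokens: int) -> int:
    """Tokens left in the context window for the token generator."""
    if not 0 < num_prompt_tokens <= CONTEXT_LENGTH:
        raise ValueError("prompt must contain between 1 and 4096 tokens")
    return CONTEXT_LENGTH - num_prompt_tokens
```

Under that assumption, a prompt that fills the whole context would take 4096 / 128 = 32 passes and leave no room for generated tokens.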