v0.32.0

See https://github.com/quic/ai-hub-models/releases/v0.32.0 for changelog.

Files changed (4)

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+DEPLOYMENT_MODEL_LICENSE.pdf filter=lfs diff=lfs merge=lfs -text

DEPLOYMENT_MODEL_LICENSE.pdf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
+size 109629
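
The three added lines above are a Git LFS pointer, not the PDF itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of how a downloaded copy could be checked against such a pointer follows; the names `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical helpers written for illustration and are not part of any Git LFS tooling.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str
    oid_sha256: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a pointer file into an LfsPointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


def matches_pointer(pointer: LfsPointer, data: bytes) -> bool:
    """Return True when data has the size and SHA-256 the pointer records."""
    return (
        len(data) == pointer.size
        and hashlib.sha256(data).hexdigest() == pointer.oid_sha256
    )


# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
size 109629
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer.size)  # 109629
```

To check a local copy, read the file in binary mode and pass the bytes to `matches_pointer(pointer, data)`.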

LICENSE ADDED


+The license of the original trained model can be found at https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md.
+The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found in DEPLOYMENT_MODEL_LICENSE.pdf.

README.md CHANGED

@@ -28,16 +28,13 @@ This model is an implementation of IBM-Granite-v3.1-8B-Instruct found [here](htt
 - **Model Stats:**
   - Input sequence length for Prompt Processor: 128
   - Context length: 4096
-  - Number of parameters: 8B
   - Precision: w4a16 + w8a16 (few layers)
   - Num of key-value heads: 8
   - Information about the model parts: Prompt Processor and Token Generator are split into 5 parts each. Each corresponding Prompt Processor and Token Generator part share weights.
-  - Prompt processor model size: 4.8 GB
   - Prompt processor input (part1): 128 tokens
   - Prompt processor output (part1): Embeddings output
   - Prompt processor input (other parts): 128 tokens + KVCache initialized with pad token
   - Prompt processor output (other parts): 128 output tokens + KVCache for token generator
-  - Token generator model size: 4.8 GB
   - Token generator input (part1): 1 token
   - Token generator output (part1): Embeddings output
   - Token generator input (other parts): 1 input token + past KVCache

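
The stats in this hunk imply that a prompt is fed to the Prompt Processor in fixed blocks of 128 tokens, with the 4096-token context shared between prompt and generated tokens. A rough sketch of that arithmetic follows. The function `chunk_prompt`, the `pad_id` value, and the choice to pad the final block on the right are all assumptions made for illustration; the README does not state how the actual runtime pads or schedules blocks.

```python
PROMPT_BLOCK = 128      # "Input sequence length for Prompt Processor" from the README
CONTEXT_LENGTH = 4096   # "Context length" from the README


def chunk_prompt(token_ids, pad_id=0):
    """Split token ids into 128-token blocks, padding the last block.

    pad_id and right-side padding are assumptions, not taken from the README.
    """
    if len(token_ids) > CONTEXT_LENGTH:
        raise ValueError("prompt exceeds the 4096-token context length")
    blocks = []
    for start in range(0, len(token_ids), PROMPT_BLOCK):
        block = list(token_ids[start:start + PROMPT_BLOCK])
        block += [pad_id] * (PROMPT_BLOCK - len(block))  # pad the final block
        blocks.append(block)
    return blocks


prompt = list(range(1, 301))          # a stand-in 300-token prompt
blocks = chunk_prompt(prompt)
print(len(blocks))                    # 3 Prompt Processor invocations
print(CONTEXT_LENGTH - len(prompt))   # 3796 tokens left for generation
```

After the prompt blocks are processed, the Token Generator would consume one token per step together with the accumulated KV cache, as the input/output lines above describe.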