---
language: en
license: apache-2.0
datasets: []
model-index:
- name: SmolLM2-135M-4-bit
  results: []
---

# SmolLM2-135M-4-bit

This repository contains a 4-bit quantized version of the HuggingFaceTB/SmolLM2-135M model, produced with the q4_0 quantization method from llama.cpp and stored in the GGUF file format. Quantization reduces the model's size and memory footprint while preserving its core capabilities, making it suitable for deployment in resource-constrained environments such as edge devices, mobile platforms, or lightweight inference tasks.

## Quantization Details

- **Base Model:** HuggingFaceTB/SmolLM2-135M
- **Quantization Method:** q4_0 (4-bit)
- **Framework Used:** llama.cpp
- **File Format:** GGUF
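
For intuition, the q4_0 scheme mentioned above can be sketched roughly as follows: weights are split into blocks of 32, each block shares one scale derived from its largest-magnitude element, and every weight is stored as a 4-bit integer in [0, 15] with 8 representing zero. This is a simplified illustration, not llama.cpp's exact implementation (the real format packs two 4-bit values per byte and stores the scale as fp16):

```python
def quantize_q4_0(block):
    """Quantize a block of 32 floats to 4-bit integers plus one scale (simplified q4_0 sketch)."""
    assert len(block) == 32
    # Pick the element with the largest magnitude (sign preserved) to set the scale.
    amax = max(block, key=abs)
    d = amax / -8 if amax != 0 else 1.0
    # Map each weight to an integer in [0, 15]; the offset 8 encodes zero.
    qs = [min(15, max(0, int(round(x / d)) + 8)) for x in block]
    return d, qs

def dequantize_q4_0(d, qs):
    """Recover approximate float weights from the scale and 4-bit integers."""
    return [(q - 8) * d for q in qs]

weights = [0.5, -1.0, 0.25, 0.0] * 8
d, qs = quantize_q4_0(weights)
restored = dequantize_q4_0(d, qs)
```

Each block thus costs 32 nibbles plus one scale instead of 32 full-precision floats, which is where the roughly 4x size reduction comes from.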