---
language: en
license: apache-2.0
datasets: []
model-index:
- name: SmolLM2-135M-4-bit
  results: []
---

# SmolLM2-135M-4-bit

This repository contains a 4-bit quantized version of the HuggingFaceTB/SmolLM2-135M model, produced with the q4_0 quantization method from llama.cpp and stored in the GGUF file format. Quantization reduces the model's size and memory footprint while preserving its core capabilities, making it suitable for deployment in resource-constrained environments such as edge devices, mobile platforms, or lightweight inference tasks.

## Quantization Details

- **Base Model:** HuggingFaceTB/SmolLM2-135M
- **Quantization Method:** q4_0 (4-bit)
- **Framework Used:** llama.cpp
- **File Format:** GGUF
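
For intuition, the q4_0 scheme mentioned above can be sketched roughly as follows: weights are split into blocks of 32, each block shares one scale derived from its largest-magnitude element, and every weight is stored as a 4-bit integer in [0, 15] with 8 representing zero. This is a simplified illustration, not llama.cpp's exact implementation (the real format packs two 4-bit values per byte and stores the scale as fp16):

```python
def quantize_q4_0(block):
    """Quantize a block of 32 floats to 4-bit integers plus one scale (simplified q4_0 sketch)."""
    assert len(block) == 32
    # Pick the element with the largest magnitude (sign preserved) to set the scale.
    amax = max(block, key=abs)
    d = amax / -8 if amax != 0 else 1.0
    # Map each weight to an integer in [0, 15]; the offset 8 encodes zero.
    qs = [min(15, max(0, int(round(x / d)) + 8)) for x in block]
    return d, qs

def dequantize_q4_0(d, qs):
    """Recover approximate float weights from the scale and 4-bit integers."""
    return [(q - 8) * d for q in qs]

weights = [0.5, -1.0, 0.25, 0.0] * 8
d, qs = quantize_q4_0(weights)
restored = dequantize_q4_0(d, qs)
```

Each block thus costs 32 nibbles plus one scale instead of 32 full-precision floats, which is where the roughly 4x size reduction comes from.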