---
license: mit
language:
  - en
base_model:
  - meta-llama/Llama-3.2-1B-Instruct
pipeline_tag: text-generation
---

# LlamaLite-1B-Q8

## Model Description

LlamaLite-1B-Q8 is an 8-bit quantized version of Meta's Llama 3.2-1B-Instruct model, optimized for efficient inference on edge devices and other resource-constrained environments. It retains most of the base model's accuracy while substantially reducing its memory footprint.

## Model Details

  • Base Model: Meta Llama 3.2-1B-Instruct
  • Quantization: 8-bit (GGUF format)
  • Size: 1.31 GB
  • Framework: Llama.cpp
  • Optimized for: Offline use, low-power devices
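As a rough back-of-envelope illustration (the parameter count below is an assumption, not a figure from this card), 8-bit quantization stores one byte per weight, so a ~1.24B-parameter model needs roughly 1.2 GB for weights versus roughly 2.5 GB at fp16; the published 1.31 GB file also includes metadata and any non-quantized tensors:

```python
# Back-of-envelope weight-memory estimate for Q8 vs fp16.
# Assumption (not from the model card): ~1.24e9 parameters for Llama 3.2 1B.
params = 1.24e9

bytes_q8 = params * 1    # 8-bit: 1 byte per weight
bytes_fp16 = params * 2  # fp16: 2 bytes per weight

print(f"Q8 weights:   {bytes_q8 / 1e9:.2f} GB")
print(f"fp16 weights: {bytes_fp16 / 1e9:.2f} GB")
```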

## Usage

This model is suitable for real-time applications such as:

  • Offline AI assistants
  • Embedded systems
  • Edge AI devices
  • Low-latency inference

### Example Usage in Llama.cpp

```bash
./main -m LlamaLite-1B-Q8.gguf -p "Tell me about quantum computing"
```
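For users who prefer Python, the same GGUF file can also be loaded with the llama-cpp-python bindings (`pip install llama-cpp-python`). This is a minimal sketch, not part of the official card; the local model path, context size, and sampling parameters are assumptions:

```python
import os

# Assumed local path to the downloaded quantized weights.
model_path = "LlamaLite-1B-Q8.gguf"

if os.path.exists(model_path):
    # llama-cpp-python wraps llama.cpp for use from Python.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048)
    out = llm("Tell me about quantum computing", max_tokens=128)
    print(out["choices"][0]["text"])
else:
    print(f"Model file not found: {model_path}")
```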