Update README.md
README.md
CHANGED
@@ -31,7 +31,28 @@ This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth fra
 
 ## Intended Use
 
-Conversation, brainstorming, and general instruction following
+Conversation, brainstorming, and general instruction following.
+
+## GGUF Quantized Versions
+
+Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
+
+- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
+- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
+- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
+- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
+- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
+- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
+
+### Using with llama.cpp
+
+```bash
+# Download a quantized version (q4_k_m recommended for most use cases)
+wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
+
+# Run with llama.cpp
+./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
+```
 
 ## Training Details
 
@@ -99,26 +120,7 @@ outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=
 response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
 print(response)
 ```
-## GGUF Quantized Versions
-
-Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
-
-- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
-- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
-- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
-- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
-- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
-- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
-
-### Using with llama.cpp
-
-```bash
-# Download a quantized version (q4_k_m recommended for most use cases)
-wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
 
-# Run with llama.cpp
-./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
-```
 ## Limitations
 
 May hallucinate or provide incorrect information.
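As a quick sanity check on the quantization tradeoff described in the moved section, the following sketch computes each GGUF file's size relative to the f16 original, using only the sizes listed in the diff above:

```python
# File sizes in MB, copied from the GGUF bullet list in the README change above.
sizes_mb = {
    "f16": 8688.3,
    "q3_k_m": 2276.3,
    "q4_k_m": 2734.6,
    "q5_k_m": 3138.7,
    "q6_k": 3568.1,
    "q8_0": 4619.2,
}

# Express each quantized file as a fraction of the full-precision f16 file,
# which makes the size/quality ladder in the bullet list easy to compare.
for name, mb in sizes_mb.items():
    ratio = mb / sizes_mb["f16"]
    print(f"{name}: {mb:.1f} MB ({ratio:.0%} of f16)")
```

For example, the recommended q4_k_m file is roughly a third the size of the f16 original, which is why the README suggests it for most use cases.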