Update README.md
README.md
CHANGED
@@ -31,7 +31,28 @@ This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth fra
 
 ## Intended Use
 
-Conversation, brainstorming, and general instruction following
+Conversation, brainstorming, and general instruction following.
+
+## GGUF Quantized Versions
+
+Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
+
+- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
+- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
+- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
+- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
+- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
+- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
+
+### Using with llama.cpp
+
+```bash
+# Download a quantized version (q4_k_m recommended for most use cases)
+wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
+
+# Run with llama.cpp
+./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
+```
 
 ## Training Details
 
@@ -99,26 +120,7 @@ outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=
 response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
 print(response)
 ```
-## GGUF Quantized Versions
-
-Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
-
-- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
-- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
-- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
-- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
-- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
-- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
-
-### Using with llama.cpp
-
-```bash
-# Download a quantized version (q4_k_m recommended for most use cases)
-wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
 
-# Run with llama.cpp
-./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
-```
 ## Limitations
 
 May hallucinate or provide incorrect information.
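As a quick sanity check on the quantization tradeoff described in the moved section, the following sketch computes each GGUF file's size relative to the f16 original, using only the sizes listed in the diff above:

```python
# File sizes in MB, copied from the GGUF bullet list in the README change above.
sizes_mb = {
    "f16": 8688.3,
    "q3_k_m": 2276.3,
    "q4_k_m": 2734.6,
    "q5_k_m": 3138.7,
    "q6_k": 3568.1,
    "q8_0": 4619.2,
}

# Express each quantized file as a fraction of the full-precision f16 file,
# which makes the size/quality ladder in the bullet list easy to compare.
for name, mb in sizes_mb.items():
    ratio = mb / sizes_mb["f16"]
    print(f"{name}: {mb:.1f} MB ({ratio:.0%} of f16)")
```

For example, the recommended q4_k_m file is roughly a third the size of the f16 original, which is why the README suggests it for most use cases.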