theprint committed on
Commit c32c02f · verified · 1 Parent(s): 63d58c6

Update README.md

Files changed (1):
  1. README.md +22 -20
README.md CHANGED
@@ -31,7 +31,28 @@ This model is a fine-tuned version of google/gemma-3-4b-it using the Unsloth fra
 
 ## Intended Use
 
-Conversation, brainstorming, and general instruction following
+Conversation, brainstorming, and general instruction following.
+
+## GGUF Quantized Versions
+
+Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
+
+- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
+- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
+- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
+- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
+- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
+- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
+
+### Using with llama.cpp
+
+```bash
+# Download a quantized version (q4_k_m recommended for most use cases)
+wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
+
+# Run with llama.cpp
+./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
+```
 
 ## Training Details
 
@@ -99,26 +120,7 @@ outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=
 response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
 print(response)
 ```
-## GGUF Quantized Versions
-
-Quantized GGUF versions are available at [theprint/Zeth-Gemma3-4B-GGUF](https://huggingface.co/theprint/Zeth-Gemma3-4B-GGUF):
-
-- `Zeth-Gemma3-4B-f16.gguf` (8688.3 MB) - 16-bit float (original precision, largest file)
-- `Zeth-Gemma3-4B-q3_k_m.gguf` (2276.3 MB) - 3-bit quantization (medium quality)
-- `Zeth-Gemma3-4B-q4_k_m.gguf` (2734.6 MB) - 4-bit quantization (medium, recommended for most use cases)
-- `Zeth-Gemma3-4B-q5_k_m.gguf` (3138.7 MB) - 5-bit quantization (medium, good quality)
-- `Zeth-Gemma3-4B-q6_k.gguf` (3568.1 MB) - 6-bit quantization (high quality)
-- `Zeth-Gemma3-4B-q8_0.gguf` (4619.2 MB) - 8-bit quantization (very high quality)
-
-### Using with llama.cpp
-
-```bash
-# Download a quantized version (q4_k_m recommended for most use cases)
-wget https://huggingface.co/theprint/Zeth-Gemma3-4B/resolve/main/gguf/Zeth-Gemma3-4B-q4_k_m.gguf
 
-# Run with llama.cpp
-./llama.cpp/main -m Zeth-Gemma3-4B-q4_k_m.gguf -p "Your prompt here" -n 256
-```
 ## Limitations
 
 May hallucinate or provide incorrect information.
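The quantization list the commit moves into the README trades file size against quality. As a quick way to compare the variants, here is a minimal Python sketch built only from the sizes stated in the diff; the `VARIANTS` table and `size_ratio` helper are illustrative names, not part of the repository:

```python
# File sizes (MB) of the GGUF variants, as listed in the README diff above.
VARIANTS = {
    "f16": 8688.3,
    "q3_k_m": 2276.3,
    "q4_k_m": 2734.6,
    "q5_k_m": 3138.7,
    "q6_k": 3568.1,
    "q8_0": 4619.2,
}


def size_ratio(quant: str, baseline: str = "f16") -> float:
    """Return a variant's file size as a fraction of the f16 baseline."""
    return round(VARIANTS[quant] / VARIANTS[baseline], 3)


# Print each variant with its size relative to the full-precision file.
for name, mb in VARIANTS.items():
    print(f"{name:7s} {mb:8.1f} MB  ({size_ratio(name):.1%} of f16)")
```

The recommended q4_k_m file comes out to roughly a third of the f16 baseline, which is the usual reason it is suggested as the default choice.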