GGUF · conversational

zenpeach committed on commit f7a6881 · verified · 1 Parent(s): 19b1709

Update README.md

README.md CHANGED
---
license: apache-2.0
---
# Granite 4 GGUF (4-bit Quantized)

This repository hosts GGUF-format quantized versions of **IBM Granite 4** models at multiple parameter sizes.

The models provided here are intended for **local inference** and are suitable for use with **SciTools’ Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).

---
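For llama.cpp-based runtimes, a typical invocation looks like the sketch below. The model file name is a placeholder: substitute whichever GGUF variant you download from this repository, and adjust flags for your build of llama.cpp.

```shell
# Placeholder file name; substitute the GGUF file you downloaded from this repo.
./llama-cli -m granite-4-1b-q4.gguf \
  -p "Explain what a GGUF file is in one sentence." \
  -n 128
```

`-m` selects the model file, `-p` supplies the prompt, and `-n` caps the number of generated tokens.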
## Model Details

- Base models: IBM Granite 4
- Variants provided: 1B and 3B
- Format: GGUF
- Quantization: 4-bit (model-specific; see the Quantization Process section below)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)

---
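Because the files above are in GGUF format, a quick way to sanity-check a download is to read its header: every GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian `uint32` version. The minimal sketch below is not tied to this repository's exact files; it demonstrates the check on a synthetic header.

```python
import os
import struct
import tempfile

def read_gguf_header(path):
    """Return the GGUF version, or raise if the file is not GGUF."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
    return version

# Demonstrate on a synthetic header; a real model file starts the same way.
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(b"GGUF" + struct.pack("<I", 3))
    path = tmp.name

version = read_gguf_header(path)
os.unlink(path)
print("GGUF version:", version)
```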
## Quantization Process

- The **1B Granite 4** model is provided directly from **IBM’s GGUF release**.
- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.

These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows.

---
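To give intuition for what 4-bit quantization does, here is a toy sketch of block-wise symmetric quantization in Python. It is illustrative only and does not reproduce the exact Q4 layouts produced by IBM's or Unsloth's tooling: each block of weights is mapped to signed 4-bit integers plus a single float scale.

```python
def quantize_4bit(values):
    """Quantize a block of floats to signed 4-bit ints plus one scale."""
    scale = max(abs(v) for v in values) / 7 or 1.0  # avoid a zero scale
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants

def dequantize_4bit(scale, quants):
    """Reconstruct approximate floats from the 4-bit ints and scale."""
    return [q * scale for q in quants]

# Toy "weights"; real models quantize blocks of 32+ values at a time.
weights = [0.12, -0.53, 0.98, -0.07, 0.33, -0.91, 0.44, 0.01]
scale, quants = quantize_4bit(weights)
restored = dequantize_4bit(scale, quants)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.4f}, max reconstruction error={max_err:.4f}")
```

The reconstruction error is bounded by half the scale, which is the precision loss the Limitations section below refers to.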
## What We Did Not Do

To be explicit:

- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery

Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.

---
## Intended Use

These models are suitable for:

- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF

They are not intended for:

- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation

---
## Limitations

- Because these are 4-bit quantized models, some reduction in reasoning depth and precision is expected compared with full-precision checkpoints.
- Output quality varies between the 1B and 3B variants.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs.

Evaluate carefully for your specific workload.

---
75
+
76
+ ## License & Attribution
77
+
78
+ - Original models: IBM (Granite 4)
79
+ - Quantization: IBM and Unsloth
80
+ - Format: GGUF (llama.cpp ecosystem)
81
+
82
+ Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.
83
+
84
+ ---
85
+
86
+ ## Acknowledgements
87
+
88
+ Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.