zenpeach committed
Commit 41b63fa · verified · 1 Parent(s): c1b0a0f

Update readme

Files changed (1)
  1. README.md +87 -3
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: llama3.2
- ---
+ ---
+ license: llama3.2
+ ---
+ # Llama 3.2 GGUF (Q4_K_M Quantized)
+
+ This repository hosts GGUF-format quantized versions of Llama 3.2 models at multiple parameter sizes.
+
+ These files are intended for use with SciTools’ Understand and Onboard, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).
+
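+ For local inference, a minimal sketch using the llama-cpp-python bindings is shown below. The file name is a placeholder, not the exact name of a file in this repository; point it at whichever variant you download.
+
+ ```python
+ # Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
+ # The model path is a placeholder; substitute the GGUF file you downloaded from this repo.
+ from llama_cpp import Llama
+
+ llm = Llama(
+     model_path="Llama-3.2-3B-Instruct-Q4_K_M.gguf",  # placeholder file name
+     n_ctx=4096,        # context window; adjust to your memory budget
+     n_gpu_layers=-1,   # offload all layers to GPU when available; otherwise runs on CPU
+ )
+
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Summarize what the GGUF format is in one sentence."}],
+     max_tokens=128,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```
+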
+ ---
+
+ ## Model Details
+
+ - Base models: Llama 3.2 (various parameter sizes)
+ - Format: GGUF
+ - Intended use: Local inference, code understanding, general-purpose chat
+ - Languages: Multilingual (as supported by Llama 3.2)
+
+ ### Available Variants
+
+ This repository includes Llama 3.2 models at multiple parameter sizes, each quantized independently. Refer to the file names for the exact parameter count of each variant.
+
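+ To see exactly which variants exist, or to fetch one programmatically, a sketch along the following lines should work with the huggingface_hub library. The repository id and file name are placeholders; use the values shown on this model page.
+
+ ```python
+ # Sketch: list the GGUF files in this repository and download one of them.
+ # Requires `pip install huggingface_hub`; repo_id and filename below are placeholders.
+ from huggingface_hub import hf_hub_download, list_repo_files
+
+ repo_id = "your-namespace/llama-3.2-gguf"  # placeholder: use this repository's actual id
+
+ # Inspect the available files to see which parameter sizes are present.
+ for name in list_repo_files(repo_id):
+     if name.endswith(".gguf"):
+         print(name)
+
+ # Download one variant; the filename must match one of the names printed above.
+ local_path = hf_hub_download(repo_id=repo_id, filename="Llama-3.2-3B-Instruct-Q4_K_M.gguf")
+ print("Downloaded to", local_path)
+ ```
+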
+ ---
+
+ ## Quantization Process
+
+ - Quantization was performed by **Unsloth** and **TensorBlock**.
+ - No further modifications, rebalancing, or fine-tuning were applied.
+ - The quantization parameters and defaults were not altered from the original sources.
+
+ The goal is to provide faithful, reproducible GGUF variants that behave as closely as possible to their upstream counterparts.
+
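+ For reference only: the exact tooling and commands used by Unsloth and TensorBlock are not documented here. A typical llama.cpp conversion-and-quantization pipeline looks roughly like the sketch below; the paths, script names, and model directory are assumptions about a local llama.cpp checkout, not a record of how these files were produced.
+
+ ```python
+ # Illustrative only: a typical llama.cpp convert + quantize pipeline, driven from Python.
+ # This is NOT necessarily how the files in this repository were produced.
+ import subprocess
+
+ hf_model_dir = "Llama-3.2-3B-Instruct"          # placeholder: local Hugging Face checkpoint
+ f16_gguf = "llama-3.2-3b-instruct-f16.gguf"     # intermediate full-precision GGUF
+ q4_gguf = "llama-3.2-3b-instruct-Q4_K_M.gguf"   # final quantized GGUF
+
+ # 1. Convert the Hugging Face checkpoint to GGUF (script ships with llama.cpp).
+ subprocess.run(
+     ["python", "convert_hf_to_gguf.py", hf_model_dir, "--outfile", f16_gguf, "--outtype", "f16"],
+     check=True,
+ )
+
+ # 2. Quantize the full-precision GGUF to Q4_K_M with llama.cpp's quantization tool.
+ subprocess.run(["./llama-quantize", f16_gguf, q4_gguf, "Q4_K_M"], check=True)
+ ```
+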
+ ---
+
+ ## What We Did Not Do
+
+ To be explicit:
+
+ - No additional fine-tuning
+ - No instruction rebalancing
+ - No safety, alignment, or prompt modifications
+ - No merging or model surgery
+
+ If a model behaves a certain way, that behavior comes from Llama 3.2 combined with quantization, not from any downstream changes here.
+
+ ---
+
+ ## Intended Use
+
+ These models are suitable for:
+
+ - SciTools Understand and SciTools Onboard
+ - Local AI workflows
+ - Code comprehension and exploration (see the sketch after this list)
+ - Interactive chat and analysis
+ - Integration into developer tools that support GGUF
+
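+ As an illustration of the code-comprehension item above, here is a hedged sketch using llama-cpp-python with a system prompt; the model path and the snippet are placeholders.
+
+ ```python
+ # Sketch: ask a quantized Llama 3.2 model to explain a code snippet.
+ # The model path is a placeholder; any GGUF variant from this repository should work.
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="Llama-3.2-3B-Instruct-Q4_K_M.gguf", n_ctx=4096)
+
+ snippet = """
+ def fib(n):
+     a, b = 0, 1
+     for _ in range(n):
+         a, b = b, a + b
+     return a
+ """
+
+ out = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "You are a code-comprehension assistant. Explain code clearly and concisely."},
+         {"role": "user", "content": f"Explain what this function does:\n{snippet}"},
+     ],
+     max_tokens=256,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```
+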
+ They are not intended for:
+
+ - Safety-critical or regulated decision-making
+ - Use cases requiring guaranteed factual accuracy
+ - Production deployment without independent evaluation
+
+ ---
+
+ ## Limitations
+
+ - Output quality varies by parameter size and task.
+ - Like all large language models, Llama 3.2 may produce hallucinations or incorrect information.
+
+ Evaluate carefully for your specific workload.
+
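+ One hedged way to start such an evaluation is a small smoke test over prompts with known answers, along the lines of the sketch below; the model path, prompts, and expected substrings are placeholders to be replaced with checks drawn from your own workload.
+
+ ```python
+ # Sketch: a tiny smoke test to run before relying on a quantized model for a workload.
+ # The model path, prompts, and expected substrings are illustrative placeholders.
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="Llama-3.2-3B-Instruct-Q4_K_M.gguf", n_ctx=2048)  # placeholder path
+
+ checks = [
+     ("What is 2 + 2? Answer with just the number.", "4"),
+     ("What is the capital of France? Answer with one word.", "paris"),
+ ]
+
+ passed = 0
+ for prompt, expected in checks:
+     out = llm.create_chat_completion(
+         messages=[{"role": "user", "content": prompt}],
+         max_tokens=32,
+     )
+     answer = out["choices"][0]["message"]["content"]
+     ok = expected.lower() in answer.lower()
+     passed += ok
+     print(f"{'PASS' if ok else 'FAIL'}: {prompt!r} -> {answer.strip()!r}")
+
+ print(f"{passed}/{len(checks)} checks passed")
+ ```
+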
+ ---
+
+ ## License & Attribution
+
+ - Original models: Meta (Llama 3.2)
+ - Quantization: Unsloth and TensorBlock
+ - Format: GGUF (llama.cpp ecosystem)
+
+ Please refer to the original Llama 3.2 license and usage terms. This repository redistributes quantized artifacts only and does not change the underlying licensing conditions.
+
+ ---
+
+ ## Acknowledgements
+
+ Thanks to Meta for releasing the Llama 3.2 models, and to Unsloth and TensorBlock for providing high-quality, reproducible quantization that enables efficient local inference across a wide range of tools.