NuisanceValue committed
Commit 4023aa9 · verified · 1 parent: 0074abb

Initial GGUF upload
.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+MetalGPT-1-32B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+MetalGPT-1-32B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+MetalGPT-1-32B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+MetalGPT-1-32B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
MetalGPT-1-32B-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a78e9906144ee95796ad41bde86718f7a2f5e18f25bb964d104bd2ecd1f47de2
+size 19761766592

MetalGPT-1-32B-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc6690989f8872601f7567e0fcec874db1c16175932bc4768c934bfd97d9e3d9
+size 18770862272

MetalGPT-1-32B-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98f489067eb0dffaef88646cfab7ee9330dfaaa5c1ca72fdff21d057548f5884
+size 26882597696

MetalGPT-1-32B-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e9fdb60bd3735f238ba99d385d203d983abdfdcc60ec38c56832cb000615a595
+size 34816397888
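Each `.gguf` entry in this commit is a Git LFS pointer rather than the binary itself: a three-line text stub recording the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of parsing such a pointer (the helper name `parse_lfs_pointer` is just for illustration), using the Q8_0 pointer from this commit:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", e.g. "size 34816397888".
        key, _, value = line.partition(" ")
        fields[key] = value
    # 'oid' stays as "sha256:<hex>"; 'size' is the byte count.
    fields["size"] = int(fields["size"])
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e9fdb60bd3735f238ba99d385d203d983abdfdcc60ec38c56832cb000615a595
size 34816397888
"""
info = parse_lfs_pointer(pointer)
```

The `size` field is what the table in the README below is derived from.
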
README.md ADDED
@@ -0,0 +1,109 @@
---
base_model: nn-tech/MetalGPT-1
model_type: qwen
tags:
- mining
- metallurgy
- gguf
- text-generation
license: apache-2.0
language:
- ru
pipeline_tag: text-generation
---

# MetalGPT-1 GGUF

This repository contains **unofficial GGUF conversions** of the [`nn-tech/MetalGPT-1`](https://huggingface.co/nn-tech/MetalGPT-1) model for use with GGUF-compatible runtimes.

MetalGPT-1 is a 32B chat model based on **Qwen/Qwen3-32B**, further trained with continual pre-training and supervised fine-tuning on domain-specific data from the mining and metallurgy industry.

> ⚠️ Disclaimer:
> This repository is **not** affiliated with the original authors of MetalGPT-1.
> These are pure quantizations of the original model weights; no additional training, fine-tuning, or modifications were applied.
> Quality, correctness, and safety of the quantized variants are not guaranteed.

See the original model card: https://huggingface.co/nn-tech/MetalGPT-1

---

## GGUF variants in this repository

The following GGUF quantized variants of MetalGPT-1 are provided:

| File name                    | Quantization | Size (GiB) | Notes                                          |
| :--------------------------- | :----------- | :--------- | :--------------------------------------------- |
| `MetalGPT-1-32B-Q8_0.gguf`   | Q8_0         | 32.43      | Near-F16 quality, highest VRAM use             |
| `MetalGPT-1-32B-Q6_K.gguf`   | Q6_K         | 25.04      | Higher quality, more VRAM                      |
| `MetalGPT-1-32B-Q4_K_M.gguf` | Q4_K_M       | 18.40      | Good quality, very memory-efficient            |
| `MetalGPT-1-32B-Q4_K_S.gguf` | Q4_K_S       | 17.48      | Smaller, slightly more aggressive quantization |

Choose a variant based on your hardware and quality requirements:

- **Q4_K_M / Q4_K_S**: best options for low-VRAM environments.
- **Q6_K / Q8_0**: better fidelity for demanding generation quality or professional use.

*Note: Try adding the `/think` tag to your prompts if you want to explicitly trigger reasoning capabilities.*
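The `/think` toggle comes from the Qwen3 base model, which also accepts `/no_think` to suppress reasoning. A tiny sketch of appending the tag to a user prompt (the helper name `with_think` is hypothetical, just for this example):

```python
def with_think(prompt: str, enable: bool = True) -> str:
    """Append the Qwen3-style /think (or /no_think) soft switch to a prompt.

    MetalGPT-1 inherits this behavior from its Qwen3-32B base; whether the
    tag changes output quality for the quantized variants is untested.
    """
    tag = "/think" if enable else "/no_think"
    return f"{prompt} {tag}"

# English gloss of the Russian prompt: "Name the pros and cons of the
# chloride technology for nickel production."
msg = with_think("Назови плюсы и минусы хлоридной технологии производства никеля.")
```
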

---

## Usage with `LM Studio`

1. Download LM Studio from [lmstudio.ai](https://lmstudio.ai/).
2. Search for "NuisanceValue/MetalGPT-1-GGUF" in the model hub within LM Studio.
3. Select a quantization variant and download it.
4. Once downloaded, load the model from the model menu.

## Usage with `llama.cpp`

Download one of the GGUF files (for example `MetalGPT-1-32B-Q4_K_M.gguf`) and run:

```bash
# The Russian prompt asks: "Name the pros and cons of the chloride and
# sulfate technologies for nickel production."
./llama-cli \
  -m MetalGPT-1-32B-Q4_K_M.gguf \
  -p "Назови плюсы и минусы хлоридной и сульфатной технологии производства никеля." \
  --temp 0.7 \
  --top-p 0.8 \
  --top-k 70 \
  --n-predict 512 \
  --ctx-size 8192
```

## Usage with `llama-cpp-python`

Install `llama-cpp-python` if you haven't already:

```bash
pip install llama-cpp-python
```

Then use the following snippet to load the model and generate text:

```python
from llama_cpp import Llama

# Path to your GGUF file
model_path = "MetalGPT-1-32B-Q4_K_M.gguf"

# Initialize the model
llm = Llama(
    model_path=model_path,
    n_gpu_layers=-1,  # Offload all layers to GPU
    n_ctx=8192,       # Context window (adjust based on VRAM)
    verbose=False,
)

# System: "You are a specialist in metallurgy."
# User: "Name the pros and cons of the chloride and sulfate technologies
# for nickel production."
messages = [
    {"role": "system", "content": "Ты специалист в области металлургии."},
    {"role": "user", "content": "Назови плюсы и минусы хлоридной и сульфатной технологии производства никеля."},
]

output = llm.create_chat_completion(
    messages=messages,
    max_tokens=2048,
    temperature=0.7,
    top_p=0.8,
)

print(output["choices"][0]["message"]["content"])
```