AdvRahul committed
Commit c64680a · verified · 1 Parent(s): 6c8d078

Update README.md

Files changed (1): README.md (+27 −37)
README.md CHANGED
@@ -1,54 +1,44 @@
- ---
  library_name: transformers
  license: apache-2.0
  license_link: https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE
  pipeline_tag: text-generation
  base_model: Qwen/Qwen3-4B-Thinking-2507
  tags:
- - llama-cpp
- - gguf-my-repo
- ---
-
- # AdvRahul/Qwen3-4B-Thinking-2507-Q4_K_M-GGUF
- This model was converted to GGUF format from [`Qwen/Qwen3-4B-Thinking-2507`](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) for more details on the model.
-
- ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
- ```bash
  brew install llama.cpp
- ```
  Invoke the llama.cpp server or the CLI.
- ### CLI:
- ```bash
- llama-cli --hf-repo AdvRahul/Qwen3-4B-Thinking-2507-Q4_K_M-GGUF --hf-file qwen3-4b-thinking-2507-q4_k_m.gguf -p "The meaning to life and the universe is"
- ```
- ### Server:
- ```bash
- llama-server --hf-repo AdvRahul/Qwen3-4B-Thinking-2507-Q4_K_M-GGUF --hf-file qwen3-4b-thinking-2507-q4_k_m.gguf -c 2048
- ```
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
  Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```
- Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```
  Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo AdvRahul/Qwen3-4B-Thinking-2507-Q4_K_M-GGUF --hf-file qwen3-4b-thinking-2507-q4_k_m.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo AdvRahul/Qwen3-4B-Thinking-2507-Q4_K_M-GGUF --hf-file qwen3-4b-thinking-2507-q4_k_m.gguf -c 2048
- ```
---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE
pipeline_tag: text-generation
base_model: Qwen/Qwen3-4B-Thinking-2507
tags:
- llama-cpp
- gguf-my-repo
---

# AdvRahul/Axion-Thinking-4B

This model is fine-tuned from [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) and made safer through red-team testing with advanced safety protocols.
## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
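If the install succeeded, the binaries should already be on your `PATH`; a quick sanity check (assuming a recent llama.cpp build, which ships `llama-cli`):

```shell
# Prints the installed llama.cpp build info and exits.
llama-cli --version
```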
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo AdvRahul/Axion-Thinking-4B-Q4_K_M-GGUF --hf-file axion-thinking-4b-q4_k_m.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo AdvRahul/Axion-Thinking-4B-Q4_K_M-GGUF --hf-file axion-thinking-4b-q4_k_m.gguf -c 2048
```

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
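Once `llama-server` is running, it exposes an OpenAI-compatible HTTP API (on `http://localhost:8080` by default). A minimal sketch of querying it with `curl` — the prompt and `max_tokens` value are illustrative, not part of this repo:

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "The meaning to life and the universe is"}],
        "max_tokens": 128
      }'
```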
 
 
 
 
 
Step 1: Clone llama.cpp from GitHub.

```bash
git clone https://github.com/ggerganov/llama.cpp
```
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).

```bash
cd llama.cpp && LLAMA_CURL=1 make
```
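Note that recent llama.cpp checkouts have replaced the Makefile with CMake as the primary build system; if `make` fails, a roughly equivalent CMake build (option names assume a recent tree) is:

```shell
# Configure with CURL enabled (needed for --hf-repo downloads), then build.
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release -j
# Binaries are placed under build/bin/, e.g. build/bin/llama-cli.
```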
Step 3: Run inference through the main binary.

```bash
./llama-cli --hf-repo AdvRahul/Axion-Thinking-4B-Q4_K_M-GGUF --hf-file axion-thinking-4b-q4_k_m.gguf -p "The meaning to life and the universe is"
```

or

```bash
./llama-server --hf-repo AdvRahul/Axion-Thinking-4B-Q4_K_M-GGUF --hf-file axion-thinking-4b-q4_k_m.gguf -c 2048
```
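If a download looks corrupted, the file header is easy to sanity-check: every valid GGUF file begins with the 4-byte ASCII magic `GGUF`. A small hypothetical helper (the filename matches the commands above; adjust to your local path):

```shell
# Any valid GGUF file begins with the ASCII magic "GGUF".
check_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ] \
    && echo "looks like GGUF" \
    || echo "not a GGUF file"
}
check_gguf axion-thinking-4b-q4_k_m.gguf
```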