Suparious committed · Commit 0bc787f · verified · 1 Parent(s): 554ab0c

Update README.md

Files changed (1): README.md (+72 −2)
---
base_model:
- ChaoticNeutrals/Bepis_9B
- jeiku/Synthetic_Soul_1k_Mistral_128
library_name: transformers
tags:
- 4-bit
- AWQ
- text-generation
- autotrain_compatible
- endpoints_compatible
- mergekit
- merge
pipeline_tag: text-generation
inference: false
license: other
datasets:
- ChaoticNeutrals/Synthetic_Soul_1k
language:
- en
quantized_by: Suparious
---
# jeiku/Soulful_Bepis_9B AWQ

- Model creator: [jeiku](https://huggingface.co/jeiku)
- Original model: [Soulful_Bepis_9B](https://huggingface.co/jeiku/Soulful_Bepis_9B)
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/x3qrhs8GG8nBfVSdlp0yB.jpeg)

## Model Summary

Bepis_9B fine-tuned on the Synthetic_Soul_1k dataset. The effect of the fine-tune has not been formally evaluated.
35
+
36
+ ### Install the necessary packages
37
+
38
+ ```bash
39
+ pip install --upgrade autoawq autoawq-kernels
40
+ ```
### Example Python code

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Soulful_Bepis_9B-AWQ"
system_message = "You are Soulful_Bepis_9B, incarnated as a powerful AI. You were created by jeiku."

# Load model
model = AutoAWQForCausalLM.from_quantized(model_path,
                                          fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path,
                                          trust_remote_code=True)
streamer = TextStreamer(tokenizer,
                        skip_prompt=True,
                        skip_special_tokens=True)

# Convert prompt to tokens
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = "You're standing on the surface of the Earth. "\
         "You walk one mile south, one mile west and one mile north. "\
         "You end up exactly where you started. Where are you?"

tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate output
generation_output = model.generate(tokens,
                                   streamer=streamer,
                                   max_new_tokens=512)
```
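The ChatML prompt assembly used above can also be checked on its own, without downloading the model (the system message and question here are placeholders):

```python
# Standalone check of the ChatML prompt format this model expects.
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

text = prompt_template.format(
    system_message="You are a helpful assistant.",  # placeholder
    prompt="Where are you?",                        # placeholder
)
print(text)
```

The formatted string should begin with the system block and end with the bare `<|im_start|>assistant` tag, after which the model continues generating.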
### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with equivalent or better quality than the most commonly used GPTQ settings.
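For reference, a quantization like this one is typically produced with AutoAWQ itself. Below is a minimal sketch, not the exact recipe used for this repo; the output directory is illustrative, and running `quantize` requires a CUDA GPU and `pip install autoawq`:

```python
# Typical 4-bit AWQ settings: zero-point quantization, group size 128, GEMM kernels.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize(model_id="jeiku/Soulful_Bepis_9B", out_dir="Soulful_Bepis_9B-AWQ"):
    # Imports are deferred so this sketch can be read without autoawq installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Load the full-precision model, calibrate and quantize, then save.
    model = AutoAWQForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(out_dir)
    tokenizer.save_pretrained(out_dir)
```

The saved directory can then be loaded with `AutoAWQForCausalLM.from_quantized` as shown in the example code above.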