Krish-05 committed
Commit 06fec79 · verified · 1 Parent(s): df2c110

Upload 2 files

Files changed (3)
  1. .gitattributes +1 -0
  2. Modelfile +49 -0
  3. unsloth.Q4_K_M.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ unsloth.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
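Note that the added rule names one exact file rather than a `*.gguf` glob. The sketch below approximates how the four patterns visible in this hunk would classify a path. It assumes basename-only matching via `fnmatchcase`, which is a simplification of Git's real attribute matching, and it ignores every rule outside the hunk; `is_lfs_tracked` is a hypothetical helper, not part of Git or Git LFS.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only the patterns visible in this hunk; the rest of .gitattributes is not shown.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "unsloth.Q4_K_M.gguf"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: match the file's basename against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for candidate in ["unsloth.Q4_K_M.gguf", "Modelfile", "other.Q8_0.gguf"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under these assumptions, a second GGUF file with a different name would not be covered by the rule added here.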
Modelfile ADDED
@@ -0,0 +1,49 @@
+ # Point to your fine-tuned GGUF model
+ FROM ./unsloth.Q4_K_M.gguf
+
+ # --- CORRECTED TEMPLATE FOR LLAMA 3 CHAT FORMAT ---
+ # This template defines how your input (prompt and system message) is fed to the model.
+ # It matches the instruction-following format Llama 3 models are typically trained on.
+ TEMPLATE """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
+
+ {{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+ """
+
+ # --- REFINED STOP PARAMETERS (More Aggressive) ---
+ # These tokens tell the model when to stop generating output.
+ # They are crucial for preventing the model from generating parts of the next turn or garbage.
+ # We're adding more specific stop conditions for Llama 3's turn structure and common over-generation.
+ PARAMETER stop "<|start_header_id|>"
+ PARAMETER stop "<|end_header_id|>"
+ PARAMETER stop "<|eot_id|>" # End Of Turn token
+ PARAMETER stop "<|end_of_text|>" # End Of Text token (main EOS token)
+ PARAMETER stop "<|reserved_special_token_250|>" # The pad_token, as it sometimes gets generated unexpectedly
+ PARAMETER stop "<|reserved_special_token_28|>"
+ PARAMETER stop "<|reserved_special_token_185|>"
+
+ # Crucially, stop on newlines followed by the start of the next role markers
+ # This often catches the model trying to start the next 'user' or 'assistant' turn.
+ PARAMETER stop "\n<|start_header_id|>user"
+ PARAMETER stop "\n<|start_header_id|>assistant"
+
+ # Disabled: stopping on the bare words "user" or "assistant" truncates any reply that contains them; the header-token stops above already catch role markers.
+ # PARAMETER stop "user"
+ # PARAMETER stop "assistant"
+
+ # You can also add other commonly appearing tokens if they signify the end of a response
+ # PARAMETER stop "\n\n" # This can be aggressive, but sometimes useful
+ # PARAMETER stop "###" # If you see any triple hash marks signifying a new section
+
+ # Optional: Adjust parameters for inference if needed (keep these as they were if they worked for you)
+ # PARAMETER temperature 0.7
+ # PARAMETER top_k 40
+ # PARAMETER top_p 0.9
+ # PARAMETER num_gpu 1 # Number of model layers to offload to the GPU; 0 runs on CPU only
+
+ SYSTEM """You are a GUVI chatbot. You will behave like a GUVI customer care agent who resolves customer queries in a polite and helpful tone.
+ You are available 24/7 to assist.
+ If the customer provides any numbers (e.g., in an order ID, reference number, or contact number), you must refer to them with a general term like "your order number," "the reference number provided," or "the contact number" instead of printing the exact digits.
+ """
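To make the template and stop settings concrete, here is a minimal Python sketch of how a prompt laid out like the TEMPLATE above could be assembled and how a list of stop strings cuts a generated reply. It is an approximation written for illustration, not Ollama's actual implementation: `render_prompt`, `truncate_at_stop`, and the sample reply are hypothetical. It also shows why a bare-word stop such as "user" is risky, since it matches ordinary prose as well as role markers.

```python
# Illustration only: mirrors the layout of the Modelfile TEMPLATE, not Ollama internals.
TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)


def render_prompt(system: str, prompt: str) -> str:
    """Fill the system message and user prompt into the Llama 3 chat layout."""
    return TEMPLATE.format(system=system, prompt=prompt)


def truncate_at_stop(text: str, stops: list[str]) -> str:
    """Cut the text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Hypothetical raw model output that ends with the end-of-turn token.
raw_reply = "Sure, I can help. Your user account is active.<|eot_id|>"
token_stops = ["<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>", "<|end_of_text|>"]

print(repr(truncate_at_stop(raw_reply, token_stops)))
print(repr(truncate_at_stop(raw_reply, token_stops + ["user", "assistant"])))
```

With only the special-token stops, the whole sentence survives. Once "user" is added as a stop, the reply is cut just before that word.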
unsloth.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f378e6498297a931e8d42c742553c8c5e22f3275dc45e47b4f585b3f8a3d928c
+ size 4920733824
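The three lines above are a Git LFS pointer, not the model weights themselves: the repository stores only the hash and byte size, and the actual GGUF file lives in LFS storage. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and the pointer text is copied from the diff.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:f378e6498297a931e8d42c742553c8c5e22f3275dc45e47b4f585b3f8a3d928c
size 4920733824
"""

info = parse_lfs_pointer(pointer_text)
algorithm, _, digest = info["oid"].partition(":")
size_gib = int(info["size"]) / 2**30  # bytes to GiB

print(algorithm, len(digest), f"{size_gib:.2f} GiB")
```

The size field works out to roughly 4.58 GiB, and the digest can be compared against a SHA-256 hash of the downloaded file to confirm the download is intact.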