Documenting the new prompting

Files changed (1)

README.md CHANGED

@@ -13,6 +13,22 @@ license: apache-2.0
 pipeline_tag: conversational
 ---
+# jais-13b-chat-hf
+I made a couple of changes. The model is loaded in 8 bits with LLM.int8() rather than in full precision, which lowers the GPU VRAM requirements by 3x.
+You can also set the whole prompt yourself, like this:
+```python
+import requests
+API_URL = 'your API url'       # You get this from your deployed Inference Endpoint
+BEARER = 'your bearer token'   # You get this from your deployed Inference Endpoint
+headers = {
+    "Authorization": f"Bearer {BEARER}",
+    "Content-Type": "application/json",
+}
+prompt = "Your clever prompt to drive value here..."
+payload = {'inputs': '', 'prompt': prompt}  # 'inputs' is a required key...
+response = requests.post(API_URL, headers=headers, json=payload)
+```
 # Jais-13b-chat
 <!-- Provide a quick summary of what the model is/does. -->