Spaces: limitless235/llm-pushback

Sahil Seemant committed on Mar 10

Commit

e10f558

1 Parent(s): 68a7bfa

Add offload_folder to support low-memory loading on Hugging Face

Files changed (1)

chat_gui.py CHANGED

@@ -317,7 +317,8 @@ if st.session_state.messages and st.session_state.messages[-1]["role"] == "user"
                     load_kwargs = {
                         "device_map": "auto",
                         "token": hf_token,
-                        "trust_remote_code": True
+                        "trust_remote_code": True,
+                        "offload_folder": "/tmp/offload"
                     }
                     # Only apply 4-bit quantization if NOT natively quantized (Mistral is FP8)
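
For context, a minimal sketch of how the changed dictionary might be assembled so that the offload directory is guaranteed to exist before loading. The helper name `build_load_kwargs` and its `offload_dir` parameter are hypothetical and do not appear in `chat_gui.py`; the keys and values mirror the diff above. It assumes the dictionary is later unpacked into a `transformers` `from_pretrained` call, which the diff does not show.

```python
import os
import tempfile


def build_load_kwargs(hf_token, offload_dir=None):
    """Hypothetical helper: returns kwargs shaped like the diff's load_kwargs."""
    # Default to a folder under the system temp directory
    # (this resolves to /tmp/offload on a typical Linux host).
    if offload_dir is None:
        offload_dir = os.path.join(tempfile.gettempdir(), "offload")

    # Create the folder up front so weights placed on "disk" by
    # device_map="auto" have somewhere to be written.
    os.makedirs(offload_dir, exist_ok=True)

    return {
        "device_map": "auto",
        "token": hf_token,
        "trust_remote_code": True,
        "offload_folder": offload_dir,
    }


if __name__ == "__main__":
    # Placeholder token, for illustration only.
    kwargs = build_load_kwargs("hf_xxx")
    print(kwargs["offload_folder"])
```

If the loader is `AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)` (an assumption), `offload_folder` only takes effect when the automatic device map assigns some weights to disk because GPU and CPU memory are insufficient.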