girish00 committed on
Commit 6be41ff · verified · 1 Parent(s): dc14a91

add dedicated endpoint cloud mode

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -153,6 +153,21 @@ python infer_cloud.py --repo-id your-username/your-model-name --prompt "Fix this
 
  If you already ran `hf auth login` or `huggingface-cli login`, you can omit `HF_TOKEN`; the saved token will be used automatically.
 
+ For true cloud execution, deploy the model as a Hugging Face Dedicated Inference Endpoint and pass the endpoint URL:
+
+ ```powershell
+ $env:HF_TOKEN="your_huggingface_token"
+ python infer_cloud.py --endpoint-url "https://your-endpoint-url.endpoints.huggingface.cloud" --prompt "Fix this code: def add(a,b) return a+b" --no-local-fallback
+ ```
+
+ You can also use environment variables:
+
+ ```powershell
+ $env:HF_TOKEN="your_huggingface_token"
+ $env:HF_ENDPOINT_URL="https://your-endpoint-url.endpoints.huggingface.cloud"
+ python infer_cloud.py --prompt "Fix this code: def add(a,b) return a+b" --no-local-fallback
+ ```
+
  `infer_cloud.py` applies the same JSON parsing, Python syntax check, relevancy score, hallucination flag, and auto-repair fallback as `infer_local.py`. If Hugging Face cannot serve your custom model repo through an inference provider, the script automatically falls back to the local `model/` folder so the command still returns the local-style JSON. Use `--no-local-fallback` if you want cloud-only failure behavior.
 
  Hosted Hugging Face API calls usually do not return token logits, so `important_tokens` may be empty and `confidence` may be `0.0` unless your endpoint returns token-level details. When the local fallback runs, those fields are computed the same way as `infer_local.py`.
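
The README text in this commit describes two behaviors: the endpoint URL can come from `--endpoint-url` or from `HF_ENDPOINT_URL`, and a failed cloud call falls back to the local model unless `--no-local-fallback` is set. The sketch below illustrates one way such logic could be structured. It is not taken from `infer_cloud.py`: the function names `resolve_endpoint_url` and `run_with_fallback` are hypothetical, and the rule that the flag takes precedence over the environment variable is an assumption that the commit does not confirm.

```python
import argparse
from typing import Callable, Mapping, Optional, Sequence


def resolve_endpoint_url(argv: Sequence[str], env: Mapping[str, str]) -> Optional[str]:
    """Pick the endpoint URL from the CLI flag, else from the environment.

    Assumption: --endpoint-url wins over HF_ENDPOINT_URL when both are set.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--endpoint-url", default=None)
    # parse_known_args ignores other flags such as --prompt.
    args, _unknown = parser.parse_known_args(list(argv))
    return args.endpoint_url or env.get("HF_ENDPOINT_URL") or None


def run_with_fallback(
    cloud_call: Callable[[], str],
    local_call: Callable[[], str],
    allow_local_fallback: bool = True,
) -> str:
    """Try the cloud call first; use the local call only if fallback is allowed."""
    try:
        return cloud_call()
    except Exception:
        if not allow_local_fallback:
            # Mirrors the cloud-only failure behavior described for --no-local-fallback.
            raise
        return local_call()
```

In a real script, `resolve_endpoint_url(sys.argv[1:], os.environ)` would supply the URL, and `cloud_call` would wrap the actual request to the endpoint.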