File size: 14,335 Bytes
c446951
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
# `InferenceHTTPClient`

`InferenceHTTPClient` was created to make it easy for users to consume HTTP API exposed by `inference` server. You
can think of it, as a friendly wrapper over `requests` that you can use, instead of creating calling logic on
your own.

## 🔥 quickstart

```python
from inference_sdk import InferenceHTTPClient

image_url = "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
predictions = CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")

print(predictions)
```

## What are the client capabilities?
* Executing inference for models hosted at Roboflow platform (use client version `v0`)
* Executing inference for models hosted in local (or on-prem) docker images with `inference` HTTP API
* Works against single image (given as a local path, URL, `np.ndarray` or `PIL.Image`)
* Minimalistic batch inference implemented (you can pass multiple images)
* Implemented inference from video file and directory with images

## Why client has two modes - `v0` and `v1`?
We are constantly improving our `infrence` package - initial version (`v0`) is compatible with
models deployed at Roboflow platform (task types: `classification`, `object-detection`, `instance-segmentation` and
`keypoints-detection`)
are supported. Version `v1` is available in locally hosted Docker images with HTTP API. 

Locally hosted `inference` server exposes endpoints for model manipulations, but those endpoints are not available
at the moment for models deployed at Roboflow platform.

`api_url` parameter passed to `InferenceHTTPClient` will decide on default client mode - URLs with `*.roboflow.com`
will be defaulted to version `v0`.

Usage of model registry control methods with `v0` clients will raise `WrongClientModeError`.

## How I can adjust `InferenceHTTPClient` to work in my use-case?
There are few ways on how configuration can be altered:

### configuring with context managers
Methods `use_configuration(...)`, `use_api_v0(...)`, `use_api_v1(...)`, `use_model(...)` are designed to
work in context managers. **Once context manager is left - old config values are restored.**

```python
from inference_sdk import InferenceHTTPClient, InferenceConfiguration

image_url = "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"

custom_configuration = InferenceConfiguration(confidence_threshold=0.8)
# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
with CLIENT.use_api_v0():
    _ = CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")

with CLIENT.use_configuration(custom_configuration):
    _ = CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")

with CLIENT.use_model("soccer-players-5fuqs/1"):
    _ = CLIENT.infer(image_url)

# after leaving context manager - changes are reverted and `model_id` is still required
_ = CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")
```

As you can see - `model_id` is required to be given for prediction method only when default model is not configured.

### Setting the configuration once and using till next change
Methods `configure(...)`, `select_api_v0(...)`, `select_api_v1(...)`, `select_model(...)` are designed alter the client
state and will be preserved until next change.

```python
from inference_sdk import InferenceHTTPClient, InferenceConfiguration

image_url = "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"

custom_configuration = InferenceConfiguration(confidence_threshold=0.8)
# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.select_api_v0()
_ = CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")

# API v0 still holds
CLIENT.configure(custom_configuration)
CLIENT.infer(image_url, model_id="soccer-players-5fuqs/1")

# API v0 and custom configuration still holds
CLIENT.select_model(model_id="soccer-players-5fuqs/1")
_ = CLIENT.infer(image_url)

# API v0, custom configuration and selected model - still holds
_ = CLIENT.infer(image_url)
```

One may also initialise in `chain` mode:

```python
from inference_sdk import InferenceHTTPClient, InferenceConfiguration

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(api_url="http://localhost:9001", api_key="ROBOFLOW_API_KEY") \
    .select_api_v0() \
    .select_model("soccer-players-5fuqs/1")
```

### Overriding `model_id` for specific call
`model_id` can be overriden for specific call

```python
from inference_sdk import InferenceHTTPClient

image_url = "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(api_url="http://localhost:9001", api_key="ROBOFLOW_API_KEY") \
    .select_model("soccer-players-5fuqs/1")

_ = CLIENT.infer(image_url, model_id="another-model/1")
```

## Batch inference
You may want to predict against multiple images at single call. It is possible, but so far - client-side
batching is implemented in naive way (sequential requests to API) - stay tuned for future improvements.

```python
from inference_sdk import InferenceHTTPClient

image_url = "https://source.roboflow.com/pwYAXv9BTpqLyFfgQoPZ/u48G0UpWfk8giSw7wrU8/original.jpg"

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
predictions = CLIENT.infer([image_url] * 5, model_id="soccer-players-5fuqs/1")

print(predictions)
```

## Inference against stream
One may want to infer against video or directory of images - and that modes are supported in `inference-client`

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
for frame_id, frame, prediction in CLIENT.infer_on_stream("video.mp4", model_id="soccer-players-5fuqs/1"):
    # frame_id is the number of frame
    # frame - np.ndarray with video frame
    # prediction - prediction from the model
    pass

for file_path, image, prediction in CLIENT.infer_on_stream("local/dir/", model_id="soccer-players-5fuqs/1"):
    # file_path - path to the image
    # frame - np.ndarray with video frame
    # prediction - prediction from the model
    pass
```

## What is actually returned as prediction?
`inference_client` returns plain Python dictionaries that are responses from model serving API. Modification
is done only in context of `visualization` key that keep server-generated prediction visualisation (it
can be transcoded to the format of choice) and in terms of client-side re-scaling.

## Methods to control `inference` server (in `v1` mode only)

### Getting server info

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.get_server_info()
```

### Listing loaded models

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.list_loaded_models()
```

### Getting specific model description

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.get_model_description(model_id="some/1", allow_loading=True)
```

If `allow_loading` is set to `True` - model will be loaded as side-effect if it is not already loaded.
Default: `True`.


### Loading model

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.load_model(model_id="some/1", set_as_default=True)
```

The pointed model will be loaded. If `set_as_default` is set to `True` - after successful load, model
will be used as default model for the client. Default value: `False`.


### Unloading model

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.unload_model(model_id="some/1")
```

Sometimes (to avoid OOM at server side) - unloading model will be required.
[test_postprocessing.py](..%2F..%2Ftests%2Finference_client%2Funit_tests%2Fhttp%2Futils%2Ftest_postprocessing.py)

### Unloading all models

```python
from inference_sdk import InferenceHTTPClient

# Replace ROBOFLOW_API_KEY with your Roboflow API Key
CLIENT = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="ROBOFLOW_API_KEY"
)
CLIENT.unload_all_models()
```


## Details about client configuration

`inference-client` provides `InferenceConfiguration` dataclass to hold whole configuration.

```python
from inference_sdk import InferenceConfiguration
```

Overriding fields in this config changes the behaviour of client (and API serving model). Specific fields are
used in specific contexts. In particular:

### Inference in `v0` mode
The following fields are passed to API
* `confidence_threshold` (as `confidence`) - to alter model thresholding
* `keypoint_confidence_threshold` as (`keypoint_confidence`) - to filter out detected keypoints
based on model confidence
* `format` - to visualise on server side - use `image` (but then you loose prediction details from response)
* `visualize_labels` (as `labels`) - used in visualisation to show / hide labels for classes
* `mask_decode_mode`
* `tradeoff_factor`
* `max_detections` - max detections to return from model
* `iou_threshold` (as `overlap`) - to dictate NMS IoU threshold
* `stroke_width` - width of stroke in visualisation
* `count_inference` as `countinference`
* `service_secret`
* `disable_preproc_auto_orientation`, `disable_preproc_contrast`, `disable_preproc_grayscale`, 
`disable_preproc_static_crop` to alter server-side pre-processing


### Classification model in `v1` mode:
* `visualize_predictions` - flag to enable / disable visualisation
* `confidence_threshold` as `confidence`
* `stroke_width` - width of stroke in visualisation
* `disable_preproc_auto_orientation`, `disable_preproc_contrast`, `disable_preproc_grayscale`, 
`disable_preproc_static_crop` to alter server-side pre-processing


### Object detection model in `v1` mode:
* `visualize_predictions` - flag to enable / disable visualisation
* `visualize_labels` - flag to enable / disable labels visualisation if visualisation is enabled
* `confidence_threshold` as `confidence`
* `class_filter` to filter out list of classes
* `class_agnostic_nms` - flag to control whether NMS is class-agnostic
* `fix_batch_size`
* `iou_threshold` - to dictate NMS IoU threshold
* `stroke_width` - width of stroke in visualisation
* `max_detections` - max detections to return from model
* `max_candidates` - max candidates to post-processing from model
* `disable_preproc_auto_orientation`, `disable_preproc_contrast`, `disable_preproc_grayscale`, 
`disable_preproc_static_crop` to alter server-side pre-processing


### Keypoints detection model in `v1` mode:
* `visualize_predictions` - flag to enable / disable visualisation
* `visualize_labels` - flag to enable / disable labels visualisation if visualisation is enabled
* `confidence_threshold` as `confidence`
* `keypoint_confidence_threshold` as (`keypoint_confidence`) - to filter out detected keypoints
based on model confidence
* `class_filter` to filter out list of object classes
* `class_agnostic_nms` - flag to control whether NMS is class-agnostic
* `fix_batch_size`
* `iou_threshold` - to dictate NMS IoU threshold
* `stroke_width` - width of stroke in visualisation
* `max_detections` - max detections to return from model
* `max_candidates` - max candidates to post-processing from model
* `disable_preproc_auto_orientation`, `disable_preproc_contrast`, `disable_preproc_grayscale`, 
`disable_preproc_static_crop` to alter server-side pre-processing


### Instance segmentation model in `v1` mode:
* `visualize_predictions` - flag to enable / disable visualisation
* `visualize_labels` - flag to enable / disable labels visualisation if visualisation is enabled
* `confidence_threshold` as `confidence`
* `class_filter` to filter out list of classes
* `class_agnostic_nms` - flag to control whether NMS is class-agnostic
* `fix_batch_size`
* `iou_threshold` - to dictate NMS IoU threshold
* `stroke_width` - width of stroke in visualisation
* `max_detections` - max detections to return from model
* `max_candidates` - max candidates to post-processing from model
* `disable_preproc_auto_orientation`, `disable_preproc_contrast`, `disable_preproc_grayscale`, 
`disable_preproc_static_crop` to alter server-side pre-processing
* `mask_decode_mode`
* `tradeoff_factor`

### Configuration of client
* `output_visualisation_format` - one of (`VisualisationResponseFormat.BASE64`, `VisualisationResponseFormat.NUMPY`, 
`VisualisationResponseFormat.PILLOW`) - given that server-side visualisation is enabled - one may choose what
format should be used in output
* `image_extensions_for_directory_scan` - while using `CLIENT.infer_on_stream(...)` with local directory
this parameter controls type of files (extensions) allowed to be processed - 
default: `["jpg", "jpeg", "JPG", "JPEG", "png", "PNG"]`
* `client_downsizing_disabled` - set to `True` if you want to avoid client-side downsizing - default `False`.
Client-side scaling is only supposed to down-scale (keeping aspect-ratio) the input for inference -
to utilise internet connection more efficiently (but for the price of images manipulation / transcoding).
If model registry endpoint is available (mode `v1`) - model input size information will be used, if not:
`default_max_input_size` will be in use.