# HTTP Inference

The Roboflow Inference Server provides a standard API through which to run inference on computer vision models.

In this guide, we show how to run inference on object detection, classification, and segmentation models using the Inference Server.

Currently, the server is compatible with models trained on Roboflow, but stay tuned as we actively develop support for bringing your own models.

To run inference with the server, we will:

1. Install the server
2. Run inference (the model weights are downloaded on the first request)

## Step #1: Install the Inference Server

_You can skip this step if you already have Inference installed and running._

The Inference Server runs in Docker. Before we begin, make sure you have installed Docker on your system. To learn how to install Docker, refer to the [official Docker installation guide](https://docs.docker.com/get-docker/).

Once you have Docker installed, you are ready to download Roboflow Inference. The command you need to run depends on what device you are using.

Start the server using `inference server start`. Once the Docker container is up, the server is available at `localhost:9001`.
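
Before sending requests, it can help to confirm that something is listening on port 9001. The sketch below is a generic TCP check, not part of Inference itself; the helper name `server_is_listening` is hypothetical, and a successful connection only shows that the port is open, not that the process behind it is the Inference Server.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 9001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `server_is_listening()` with no arguments checks `localhost:9001`, the address used throughout this guide.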

## Step #2: Run Inference

You can send an image to the Inference Server as a URL, a NumPy array, or a base64-encoded string. The server returns a JSON response containing the predictions.

Choose an option below:

!!! Run Inference

    === "URL"
    
        Create a new Python file (for example, `app.py`) and add the following code:

        ```python
        import requests

        project_id = ""
        model_version = ""
        image_url = ""
        confidence = 0.75
        iou_thresh = 0.5  # sent as "iou_threshold" below; 0.5 is an example value
        api_key = ""
        task = "object_detection"

        infer_payload = {
            "model_id": f"{project_id}/{model_version}",
            "image": {
                "type": "url",
                "value": image_url,
            },
            "confidence": confidence,
            "iou_threshold": iou_thresh,
            "api_key": api_key,
        }
        res = requests.post(
            f"http://localhost:9001/infer/{task}",
            json=infer_payload,
        )

        predictions = res.json()
        ```

        Above, specify:

        1. `project_id`, `model_version`: Your project ID and model version number. [Learn how to retrieve your project ID and model version number](https://docs.roboflow.com/api-reference/workspace-and-project-ids).
        2. `image_url`: The URL of the image you want to run inference on.
        3. `confidence`: The confidence threshold for predictions. Predictions with a confidence score below this threshold will be filtered out.
        4. `api_key`: Your Roboflow API key. [Learn how to retrieve your Roboflow API key](https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key).
        5. `task`: The type of task you want to run. Choose from `object_detection`, `classification`, or `segmentation`.

        Then, run the Python script:

        ```
        python app.py
        ```

    === "NumPy Array"
    
        Create a new Python file (for example, `app.py`) and add the following code:

        ```python
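        import requests
        from PIL import Image

        project_id = ""
        model_version = ""
        confidence = 0.75
        iou_thresh = 0.5  # sent as "iou_threshold" below; 0.5 is an example value
        api_key = ""
        task = "object_detection"
        file_name = ""

        # NOTE: a PIL Image is not JSON-serializable. Convert it to the NumPy
        # encoding your server version expects before sending it; the request
        # schema is listed at localhost:9001/docs.
        image = Image.open(file_name)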

        infer_payload = {
            "model_id": f"{project_id}/{model_version}",
            "image": {
                "type": "numpy",
                "value": image,
            },
            "confidence": confidence,
            "iou_threshold": iou_thresh,
            "api_key": api_key,
        }

        res = requests.post(
            f"http://localhost:9001/infer/{task}",
            json=infer_payload,
        )

        predictions = res.json()
        ```

        Above, specify:

        1. `project_id`, `model_version`: Your project ID and model version number. [Learn how to retrieve your project ID and model version number](https://docs.roboflow.com/api-reference/workspace-and-project-ids).
        2. `confidence`: The confidence threshold for predictions. Predictions with a confidence score below this threshold will be filtered out.
        3. `api_key`: Your Roboflow API key. [Learn how to retrieve your Roboflow API key](https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key).
        4. `task`: The type of task you want to run. Choose from `object_detection`, `classification`, or `segmentation`.
        5. `file_name`: The path to the image you want to run inference on.

        Then, run the Python script:

        ```
        python app.py
        ```

    === "Base64 Image"
    
        Create a new Python file (for example, `app.py`) and add the following code:

        ```python
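        import base64
        from io import BytesIO

        import requests
        from PIL import Image

        project_id = ""
        model_version = ""
        confidence = 0.75
        iou_thresh = 0.5  # sent as "iou_threshold" below; 0.5 is an example value
        api_key = ""
        task = "object_detection"
        file_name = ""

        # JPEG has no alpha channel, so convert to RGB before saving
        image = Image.open(file_name).convert("RGB")

        buffered = BytesIO()
        image.save(buffered, quality=100, format="JPEG")

        # b64encode returns bytes; decode to str so the payload is JSON-serializable
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")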

        infer_payload = {
            "model_id": f"{project_id}/{model_version}",
            "image": {
                "type": "base64",
                "value": img_str,
            },
            "confidence": confidence,
            "iou_threshold": iou_thresh,
            "api_key": api_key,
        }

        res = requests.post(
            f"http://localhost:9001/infer/{task}",
            json=infer_payload,
        )

        predictions = res.json()
        ```

        Above, specify:

        1. `project_id`, `model_version`: Your project ID and model version number. [Learn how to retrieve your project ID and model version number](https://docs.roboflow.com/api-reference/workspace-and-project-ids).
        2. `confidence`: The confidence threshold for predictions. Predictions with a confidence score below this threshold will be filtered out.
        3. `api_key`: Your Roboflow API key. [Learn how to retrieve your Roboflow API key](https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key).
        4. `task`: The type of task you want to run. Choose from `object_detection`, `classification`, or `segmentation`.
        5. `file_name`: The path to the image you want to run inference on.

        Then, run the Python script:

        ```
        python app.py
        ```

The code snippets above will run inference on a computer vision model. On the first request, the model weights will be downloaded and set up with your local inference server. This request may take some time depending on your network connection and the size of the model. Once your model has downloaded, subsequent requests will be much faster.
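
Once `predictions` holds the parsed JSON, you will usually want to turn it into something you can draw or filter. The sketch below assumes an object detection response with a top-level `predictions` list whose items carry center-based `x`, `y`, `width`, `height`, plus `class` and `confidence`. That layout is an assumption and is not confirmed by this guide, so check the response schema on your server before relying on it. The helper name `to_corner_boxes` is hypothetical.

```python
from typing import Any, Dict, List, Tuple

# (class name, confidence, x_min, y_min, x_max, y_max)
Box = Tuple[str, float, float, float, float, float]


def to_corner_boxes(response: Dict[str, Any], min_confidence: float = 0.0) -> List[Box]:
    """Convert center-based predictions to corner coordinates.

    Assumes each prediction has "x", "y", "width", "height",
    "class", and "confidence" keys (an assumed response layout).
    """
    boxes: List[Box] = []
    for pred in response.get("predictions", []):
        if pred["confidence"] < min_confidence:
            continue  # drop low-confidence detections on the client side
        half_w = pred["width"] / 2
        half_h = pred["height"] / 2
        boxes.append((
            pred["class"],
            pred["confidence"],
            pred["x"] - half_w,
            pred["y"] - half_h,
            pred["x"] + half_w,
            pred["y"] + half_h,
        ))
    return boxes
```

With the snippets above, `to_corner_boxes(predictions)` would return one tuple per detection, provided the response matches the assumed layout.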

The Inference Server serves OpenAPI-powered documentation at `localhost:9001/docs` and `localhost:9001/redoc`. Use it to look up the available routes and the configuration options for each route.
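
If you prefer to inspect the routes from a script, the sketch below lists them from an OpenAPI schema. It assumes the raw schema is served at `/openapi.json`, which is a common default for servers that expose `/docs` and `/redoc` but is not confirmed by this guide. The helper names `fetch_schema` and `list_routes` are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_schema(base_url: str = "http://localhost:9001") -> Dict[str, Any]:
    """Download the OpenAPI schema (assumed to live at /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as res:
        return json.load(res)


def list_routes(schema: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI schema, sorted by path."""
    routes: List[Tuple[str, str]] = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items may also hold non-method keys such as "parameters"
            if method in HTTP_METHODS:
                routes.append((method.upper(), path))
    return routes
```

With the server running, `list_routes(fetch_schema())` would return the routes your server version exposes.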

## Auto Batching Requests

Object detection models trained with Roboflow support batching, which allows you to send multiple images of any supported type in a single request:

```python
infer_payload = {
    "model_id": f"{project_id}/{model_version}",
    "image": [
        {
            "type": "url",
            "value": image_url_1,
        },
        {
            "type": "url",
            "value": image_url_2,
        },
        {
            "type": "url",
            "value": image_url_3,
        },
    ],
    "confidence": confidence,
    "iou_threshold": iou_thresh,
    "api_key": api_key,
}
```
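
When the number of images varies, the `image` list can be built programmatically rather than written out by hand. The following is a minimal sketch that produces the same payload shape as above from a list of URLs. The helper name `build_batch_payload` and the default threshold values are illustrative assumptions.

```python
from typing import Any, Dict, List


def build_batch_payload(
    model_id: str,
    image_urls: List[str],
    api_key: str,
    confidence: float = 0.75,
    iou_threshold: float = 0.5,
) -> Dict[str, Any]:
    """Build a batched inference payload with one "url" entry per image."""
    if not image_urls:
        raise ValueError("image_urls must contain at least one URL")
    return {
        "model_id": model_id,
        # One {"type", "value"} entry per image, in the order given
        "image": [{"type": "url", "value": url} for url in image_urls],
        "confidence": confidence,
        "iou_threshold": iou_threshold,
        "api_key": api_key,
    }
```

The returned dictionary can be passed as `json=` to the same `requests.post` call used in Step #2.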