## Setup

Before you begin, ensure that you have Docker installed on your machine. Docker provides a containerized environment, 
allowing the Roboflow Inference Server to run in a consistent and isolated manner, regardless of the host system. If 
you haven't installed Docker yet, you can get it from [Docker's official website](https://www.docker.com/get-started).

## Set up a Docker Inference Server via `inference server start`

An easy way to run the Roboflow Inference Server with Docker is via the command line.

First, [Install the CLI](../index.md#cli).

Running the Inference Server is as simple as running the following command:

```bash
inference server start
```

This will pull the appropriate Docker image for your machine and start the Inference Server on port 9001. You can then send requests to the server to get predictions from your model, as described in [HTTP Inference](http_inference.md).
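For example, once the server is running you can request predictions with `curl`. The model ID (`your-project/1`) and API key below are placeholders for your own values, and the exact endpoint shape may vary with your server version; see [HTTP Inference](http_inference.md) for the full API:

```bash
# Run inference on a hosted image against a local server (placeholders: model ID, API key)
curl -X POST \
  "http://localhost:9001/your-project/1?api_key=YOUR_API_KEY&image=https://example.com/image.jpg"
```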

Once you have your inference server running, you can check its status with the following command:

```bash
inference server status
```

Roboflow Inference CLI currently supports the following device targets:

- x86 CPU
- ARM64 CPU
- NVIDIA GPU

For Jetson or TensorRT Runtime inference server images, pull the images directly following the [instructions below](#step-1-pull-from-docker-hub).

## Manually Set Up a Docker Container

### Step #1: Pull from Docker Hub

If you don't wish to build the Docker image locally or prefer to use the official releases, you can directly pull the 
pre-built images from the Docker Hub. These images are maintained by the Roboflow team and are optimized for various 
hardware configurations.

!!! example "docker pull"

    === "x86 CPU"
        Official Roboflow Inference Server Docker Image for x86 CPU Targets.
    
        ```
        docker pull roboflow/roboflow-inference-server-cpu
        ```
    
    === "arm64 CPU"
        Official Roboflow Inference Server Docker Image for ARM CPU Targets.
    
        ```
        docker pull roboflow/roboflow-inference-server-cpu
        ```
    
    === "GPU"
        Official Roboflow Inference Server Docker Image for Nvidia GPU Targets.
    
        ```
        docker pull roboflow/roboflow-inference-server-gpu
        ```

    === "GPU + TensorRT"
        Official Roboflow Inference Server Docker Image for Nvidia GPU with TensorRT Runtime Targets.
    
        ```
        docker pull roboflow/roboflow-inference-server-trt
        ```

    === "Jetson 4.5.x"
        Official Roboflow Inference Server Docker Image for Nvidia Jetson JetPack 4.5.x Targets.

        ```
        docker pull roboflow/roboflow-inference-server-jetson-4.5.0
        ```

    === "Jetson 4.6.x"
        Official Roboflow Inference Server Docker Image for Nvidia Jetson JetPack 4.6.x Targets.

        ```
        docker pull roboflow/roboflow-inference-server-jetson-4.6.1
        ```

    === "Jetson 5.x"
        Official Roboflow Inference Server Docker Image for Nvidia Jetson JetPack 5.x Targets.

        ```
        docker pull roboflow/roboflow-inference-server-jetson-5.1.1
        ```

### Step #2: Run the Docker Container

Once you have a Docker image (either built locally or pulled from Docker Hub), you can run the Roboflow Inference 
Server in a container. 

!!! example "docker run"

    === "x86 CPU"
        ```
        docker run --net=host \
        roboflow/roboflow-inference-server-cpu:latest
        ```

    === "arm64 CPU"
        ```
        docker run -p 9001:9001 \
        roboflow/roboflow-inference-server-cpu:latest
        ```

    === "GPU"
        ```
        docker run --network=host --gpus=all \
        roboflow/roboflow-inference-server-gpu:latest
        ```

    === "GPU + TensorRT"
        ```
        docker run --network=host --gpus=all \
        roboflow/roboflow-inference-server-trt:latest
        ```

    === "Jetson 4.5.x"
        ```
        docker run --privileged --net=host --runtime=nvidia \
        roboflow/roboflow-inference-server-jetson-4.5.0:latest
        ```

    === "Jetson 4.6.x"
        ```
        docker run --privileged --net=host --runtime=nvidia \
        roboflow/roboflow-inference-server-jetson-4.6.1:latest
        ```

    === "Jetson 5.x"
        ```
        docker run --privileged --net=host --runtime=nvidia \
        roboflow/roboflow-inference-server-jetson-5.1.1:latest
        ```

    **_Note:_** The Jetson images ship with TensorRT dependencies. To use TensorRT acceleration with your model, pass an additional environment variable at runtime: `-e ONNXRUNTIME_EXECUTION_PROVIDERS=TensorrtExecutionProvider`. This can improve inference speed; however, it also incurs a significant startup cost when the model is first loaded.
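
For example, a Jetson 5.x run with TensorRT acceleration enabled would combine the flags from the note above like this (a sketch; adjust the image tag for your JetPack version):

```bash
docker run --privileged --net=host --runtime=nvidia \
  -e ONNXRUNTIME_EXECUTION_PROVIDERS=TensorrtExecutionProvider \
  roboflow/roboflow-inference-server-jetson-5.1.1:latest
```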

You may add the flag `-e API_KEY=<YOUR API KEY>` to your `docker run` command so that you do not need to provide a Roboflow API key in your requests. Substitute `<YOUR API KEY>` with your Roboflow API key. Learn how to retrieve your [Roboflow API key here](https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key).

You may add the flag `-v $(pwd)/cache:/cache` to mount a cache folder from your current directory so that you do not need to redownload or recompile model artifacts when the inference container restarts. You can also (preferably) store artifacts in a [Docker volume](https://docs.docker.com/storage/volumes/) named `inference-cache` by adding the flag `-v inference-cache:/cache`.
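
Putting these flags together, a typical CPU invocation might look like the following (the API key is a placeholder for your own):

```bash
docker run -p 9001:9001 \
  -e API_KEY=YOUR_API_KEY \
  -v inference-cache:/cache \
  roboflow/roboflow-inference-server-cpu:latest
```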

### Advanced: Build a Docker Container from Scratch

To build a Docker image locally, first clone the Inference Server repository.

```bash
git clone https://github.com/roboflow/inference
```

Choose a Dockerfile from the following options, depending on the hardware you want to run Inference Server on.

!!! example "docker build"

    === "x86 CPU"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.cpu \
        -t roboflow/roboflow-inference-server-cpu .
        ```
    
    === "arm64 CPU"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.cpu \
        -t roboflow/roboflow-inference-server-cpu .
        ```
    
    === "GPU"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.gpu \
        -t roboflow/roboflow-inference-server-gpu .
        ```

    === "GPU + TensorRT"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.trt \
        -t roboflow/roboflow-inference-server-trt .
        ```

    === "Jetson 4.5.x"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.jetson \
        -t roboflow/roboflow-inference-server-jetson-4.5.0 .
        ```

    === "Jetson 4.6.x"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.jetson \
        -t roboflow/roboflow-inference-server-jetson-4.6.1 .
        ```

    === "Jetson 5.x"
        ```
        docker build \
        -f dockerfiles/Dockerfile.onnx.jetson.5.1.1 \
        -t roboflow/roboflow-inference-server-jetson-5.1.1 .
        ```