File size: 5,748 Bytes
0f07ba7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
+++
disableToc = false
title = "πŸ” Object detection"
weight = 13
url = "/features/object-detection/"
+++

LocalAI supports object detection through various backends. This feature allows you to identify and locate objects within images with high accuracy and real-time performance. Currently, [RF-DETR](https://github.com/roboflow/rf-detr) is available as an implementation.

## Overview

Object detection in LocalAI is implemented through dedicated backends that can identify and locate objects within images. Each backend provides different capabilities and model architectures.

**Key Features:**
- Real-time object detection
- High accuracy detection with bounding boxes
- Support for multiple hardware accelerators (CPU, NVIDIA GPU, Intel GPU, AMD GPU)
- Structured detection results with confidence scores
- Easy integration through the `/v1/detection` endpoint

## Usage

### Detection Endpoint

LocalAI provides a dedicated `/v1/detection` endpoint for object detection tasks. This endpoint is specifically designed for object detection and returns structured detection results with bounding boxes and confidence scores.

### API Reference

To perform object detection, send a POST request to the `/v1/detection` endpoint:

```bash
curl -X POST http://localhost:8080/v1/detection \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rfdetr-base",
    "image": "https://media.roboflow.com/dog.jpeg"
  }'
```

### Request Format

The request body should contain:

- `model`: The name of the object detection model (e.g., "rfdetr-base")
- `image`: The image to analyze, which can be:
  - A URL to an image
  - A base64-encoded image

### Response Format

The API returns a JSON response with detected objects:

```json
{
  "detections": [
    {
      "x": 100.5,
      "y": 150.2,
      "width": 200.0,
      "height": 300.0,
      "confidence": 0.95,
      "class_name": "dog"
    },
    {
      "x": 400.0,
      "y": 200.0,
      "width": 150.0,
      "height": 250.0,
      "confidence": 0.87,
      "class_name": "person"
    }
  ]
}
```

Each detection includes:
- `x`, `y`: Coordinates of the bounding box top-left corner
- `width`, `height`: Dimensions of the bounding box
- `confidence`: Detection confidence score (0.0 to 1.0)
- `class_name`: The detected object class

## Backends

### RF-DETR Backend

The RF-DETR backend is implemented as a Python-based gRPC service that integrates seamlessly with LocalAI. It provides object detection capabilities using the RF-DETR model architecture and supports multiple hardware configurations:

- **CPU**: Optimized for CPU inference
- **NVIDIA GPU**: CUDA acceleration for NVIDIA GPUs
- **Intel GPU**: Intel oneAPI optimization
- **AMD GPU**: ROCm acceleration for AMD GPUs
- **NVIDIA Jetson**: Optimized for ARM64 NVIDIA Jetson devices

#### Setup

1. **Using the Model Gallery (Recommended)**

   The easiest way to get started is using the model gallery. The `rfdetr-base` model is available in the official LocalAI gallery:

   ```bash
   # Install and run the rfdetr-base model
   local-ai run rfdetr-base
   ```

   You can also install it through the web interface by navigating to the Models section and searching for "rfdetr-base".

2. **Manual Configuration**

   Create a model configuration file in your `models` directory:

   ```yaml
   name: rfdetr
   backend: rfdetr
   parameters:
     model: rfdetr-base
   ```

#### Available Models

Currently, the following model is available in the [Model Gallery]({{%relref "features/model-gallery" %}}):

- **rfdetr-base**: Base model with balanced performance and accuracy

You can browse and install this model through the LocalAI web interface or using the command line.

## Examples

### Basic Object Detection

```bash
curl -X POST http://localhost:8080/v1/detection \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rfdetr-base",
    "image": "https://example.com/image.jpg"
  }'
```

### Base64 Image Detection

```bash
base64_image=$(base64 -w 0 image.jpg)
curl -X POST http://localhost:8080/v1/detection \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"rfdetr-base\",
    \"image\": \"data:image/jpeg;base64,$base64_image\"
  }"
```

## Troubleshooting

### Common Issues

1. **Model Loading Errors**
   - Ensure the model file is properly downloaded
   - Check available disk space
   - Verify model compatibility with your backend version

2. **Low Detection Accuracy**
   - Ensure good image quality and lighting
   - Check if objects are clearly visible
   - Consider using a larger model for better accuracy

3. **Slow Performance**
   - Enable GPU acceleration if available
   - Use a smaller model for faster inference
   - Optimize image resolution

### Debug Mode

Enable debug logging for troubleshooting:

```bash
local-ai run --debug rfdetr-base
```

## Object Detection Category

LocalAI includes a dedicated **object-detection** category for models and backends that specialize in identifying and locating objects within images. This category currently includes:

- **RF-DETR**: Real-time transformer-based object detection

Additional object detection models and backends will be added to this category in the future. You can filter models by the `object-detection` tag in the model gallery to find all available object detection models.

## Related Features

- [🎨 Image generation]({{%relref "features/image-generation" %}}): Generate images with AI
- [πŸ“– Text generation]({{%relref "features/text-generation" %}}): Generate text with language models
- [πŸ” GPT Vision]({{%relref "features/gpt-vision" %}}): Analyze images with language models
- [πŸš€ GPU acceleration]({{%relref "features/GPU-acceleration" %}}): Optimize performance with GPU acceleration