# Dual Deployment Guide - HuggingFace & Azure AI Foundry

This repository supports deployment to both **HuggingFace Inference Endpoints** and **Azure AI Foundry** using the same codebase and Docker image.

## 📋 Deployment Overview

| Platform | Status | Container Registry | Endpoint |
|----------|--------|-------------------|----------|
| **HuggingFace** | ✅ Running | `sam3acr4hf.azurecr.io` | https://p6irm2x7y9mwp4l4.us-east-1.aws.endpoints.huggingface.cloud |
| **Azure AI Foundry** | ⏳ Pending GPU Quota | `sam3acr.azurecr.io` | To be deployed |

Both deployments use the **same Docker image** with SAM3Model for static image segmentation.

---

## 🚀 HuggingFace Deployment (Current)

### Status
✅ **DEPLOYED AND RUNNING**

### Registry
```bash
sam3acr4hf.azurecr.io/sam3-hf:latest
```

### Quick Deploy
```bash
# Build and push
docker build -t sam3acr4hf.azurecr.io/sam3-hf:latest .
az acr login --name sam3acr4hf
docker push sam3acr4hf.azurecr.io/sam3-hf:latest

# Restart endpoint
python3 << 'EOF'
from huggingface_hub import HfApi
api = HfApi()
endpoint = api.get_inference_endpoint('sam3-segmentation', namespace='Logiroad')
endpoint.pause()
endpoint.resume()
EOF
```

### Configuration
- **Hardware**: NVIDIA A10G (24GB VRAM)
- **Organization**: Logiroad
- **Access**: Public
- **Auto-scaling**: Enabled (0-5 replicas)

---

## 🔷 Azure AI Foundry Deployment (Future)

### Status
⏳ **WAITING FOR GPU QUOTA**

Once GPU quota is approved, deploy using the same Docker image:

### Registry
```bash
sam3acr.azurecr.io/sam3-foundry:latest
```

### Deployment Steps

#### 1. Build and Push to Azure ACR

```bash
# Login to Azure AI Foundry ACR
az acr login --name sam3acr

# Build with Azure AI Foundry tag
docker build -t sam3acr.azurecr.io/sam3-foundry:latest .

# Push to Azure ACR
docker push sam3acr.azurecr.io/sam3-foundry:latest
```

#### 2. Deploy to Azure AI Foundry

Using Azure CLI:

```bash
# Create the Azure ML endpoint
az ml online-endpoint create \
  --name sam3-foundry \
  --resource-group productionline-test \
  --workspace-name <your-workspace>

# Custom-container deployments are described in a YAML file.
# The routes and port below assume the container serves on port 80
# with GET /health; adjust to match the Dockerfile.
cat > deployment.yml << 'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: sam3-deployment
endpoint_name: sam3-foundry
environment:
  image: sam3acr.azurecr.io/sam3-foundry:latest
  inference_config:
    liveness_route: {port: 80, path: /health}
    readiness_route: {port: 80, path: /health}
    scoring_route: {port: 80, path: /}
instance_type: Standard_NC6s_v3
instance_count: 1
EOF

# Create the deployment and route all traffic to it
az ml online-deployment create \
  --file deployment.yml \
  --resource-group productionline-test \
  --workspace-name <your-workspace> \
  --all-traffic
```

Or using Azure Portal:
1. Navigate to Azure AI Foundry workspace
2. Go to **Endpoints** → **Real-time endpoints**
3. Click **Create**
4. Select **Custom container**
5. Image: `sam3acr.azurecr.io/sam3-foundry:latest`
6. Instance type: **Standard_NC6s_v3** (Tesla V100)
7. Deploy

#### 3. Test Azure AI Foundry Endpoint

```python
import requests
import base64

# Get endpoint URL and key from Azure Portal
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<your-api-key>"

with open("test.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    ENDPOINT_URL,
    json={
        "inputs": image_b64,
        "parameters": {"classes": ["pothole", "asphalt"]}
    },
    headers={"Authorization": f"Bearer {API_KEY}"}
)

print(response.json())
```

---

## 🔄 Unified Deployment Workflow

Since both platforms use the **same Docker image**, you can deploy to both simultaneously:

### Option 1: Separate Tags (Recommended)

```bash
# Build once
docker build -t sam3-base:latest .

# Tag for HuggingFace
docker tag sam3-base:latest sam3acr4hf.azurecr.io/sam3-hf:latest

# Tag for Azure AI Foundry
docker tag sam3-base:latest sam3acr.azurecr.io/sam3-foundry:latest

# Push to both registries
az acr login --name sam3acr4hf
docker push sam3acr4hf.azurecr.io/sam3-hf:latest

az acr login --name sam3acr
docker push sam3acr.azurecr.io/sam3-foundry:latest
```

### Option 2: Deploy Script

Create `deploy_all.sh`:

```bash
#!/bin/bash
set -e

echo "Building Docker image..."
docker build -t sam3:latest .

echo "Pushing to HuggingFace ACR..."
docker tag sam3:latest sam3acr4hf.azurecr.io/sam3-hf:latest
az acr login --name sam3acr4hf
docker push sam3acr4hf.azurecr.io/sam3-hf:latest

echo "Pushing to Azure AI Foundry ACR..."
docker tag sam3:latest sam3acr.azurecr.io/sam3-foundry:latest
az acr login --name sam3acr
docker push sam3acr.azurecr.io/sam3-foundry:latest

echo "✅ Deployed to both registries!"
```

---

## 📊 Platform Comparison

| Feature | HuggingFace | Azure AI Foundry |
|---------|-------------|------------------|
| **GPU** | NVIDIA A10G (24GB) | Tesla V100 (16GB) or A100 |
| **Auto-scaling** | ✅ Yes (0-5 replicas) | ✅ Yes (configurable) |
| **Authentication** | Public or Token | API Key required |
| **Pricing** | Per-second billing | Per-hour billing |
| **Scale to Zero** | ✅ Yes | ⚠️ Limited support |
| **Integration** | HuggingFace ecosystem | Azure ML ecosystem |
| **Monitoring** | HF Dashboard | Azure Monitor |

---

## 🔧 Configuration Differences

### API Authentication

**HuggingFace** (current - public):
```python
response = requests.post(endpoint_url, json=payload)
```

**Azure AI Foundry** (requires key):
```python
response = requests.post(
    endpoint_url,
    json=payload,
    headers={"Authorization": f"Bearer {api_key}"}
)
```

### Environment Variables

For Azure AI Foundry, you may need to add environment variables:

```dockerfile
# Add to Dockerfile if needed for Azure
ENV AZURE_AI_FOUNDRY=true
ENV MLFLOW_TRACKING_URI=<your-mlflow-uri>
```

### Health Check Endpoints

Both platforms expect:
- `GET /health` - Health check
- `POST /` - Inference endpoint

Our current `app.py` already supports both! ✅

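For reference, the route contract both platforms expect can be illustrated with the standard library alone. The handler below is a sketch of that contract, not our actual `app.py`; the response shape for `POST /` is a placeholder where real segmentation would run:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class InferenceHandler(BaseHTTPRequestHandler):
    """Illustrative handler for the /health + / contract; not the real app.py."""

    def do_GET(self):
        if self.path == "/health":
            self._reply(200, {"status": "healthy"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path == "/":
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            # A real handler would decode payload["inputs"] (base64 image)
            # and run segmentation; here we just echo the requested classes.
            classes = payload.get("parameters", {}).get("classes", [])
            self._reply(200, {"masks": [], "classes": classes})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, code, body):
        data = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *_):  # silence per-request logging
        pass

# To serve: HTTPServer(("0.0.0.0", 80), InferenceHandler).serve_forever()
```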
---

## 🧪 Testing Both Deployments

Create `test_both_platforms.py`:

```python
import requests
import base64

def test_endpoint(name, url, api_key=None):
    """Test an endpoint"""
    print(f"\n{'='*60}")
    print(f"Testing {name}")
    print(f"{'='*60}")

    # Health check
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    health = requests.get(f"{url}/health", headers=headers)
    print(f"Health: {health.status_code}")

    # Inference
    with open("test.jpg", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = requests.post(
        url,
        json={
            "inputs": image_b64,
            "parameters": {"classes": ["pothole", "asphalt"]}
        },
        headers=headers
    )

    print(f"Inference: {response.status_code}")
    if response.status_code == 200:
        results = response.json()
        print(f"✅ Generated {len(results)} masks")
    else:
        print(f"❌ Error: {response.text}")

# Test HuggingFace
test_endpoint(
    "HuggingFace",
    "https://p6irm2x7y9mwp4l4.us-east-1.aws.endpoints.huggingface.cloud"
)

# Test Azure AI Foundry (when deployed)
# test_endpoint(
#     "Azure AI Foundry",
#     "https://<your-endpoint>.<region>.inference.ml.azure.com/score",
#     api_key="<your-key>"
# )
```

---

## πŸ“ Deployment Checklist

### HuggingFace (Complete) ✅
- [x] Azure Container Registry created (`sam3acr4hf`)
- [x] Docker image built and pushed
- [x] HuggingFace endpoint created
- [x] Model validated with test image
- [x] Documentation complete

### Azure AI Foundry (Pending GPU Quota) ⏳
- [x] Azure Container Registry exists (`sam3acr`)
- [ ] GPU quota approved
- [ ] Azure AI Foundry workspace created
- [ ] Docker image pushed to `sam3acr`
- [ ] Endpoint deployed
- [ ] API key obtained
- [ ] Endpoint validated

---

## 🆘 Troubleshooting

### HuggingFace Issues
See main README.md troubleshooting section.

### Azure AI Foundry Issues

**Issue**: GPU quota not available
- **Solution**: Request a quota increase in Azure Portal → Quotas → Machine learning

**Issue**: Container registry authentication failed
```bash
az acr login --name sam3acr --expose-token
```

**Issue**: Endpoint deployment fails
- Check Azure Activity Log for detailed error
- Verify image is accessible: `az acr repository show --name sam3acr --image sam3-foundry:latest`

**Issue**: Model loading timeout
- Increase deployment timeout in Azure ML Studio
- Consider using smaller instance for testing

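For the model-loading-timeout case, it helps to poll `/health` until the container reports ready before sending inference traffic. A small helper (illustrative; the probe is injected as a callable so the same function works against either platform, with or without an API key):

```python
import time
from typing import Callable

def wait_for_healthy(probe: Callable[[], bool],
                     timeout_s: float = 600.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` until it returns True or `timeout_s` elapses.

    `probe` should return True when GET /health answers 200, e.g.
    lambda: requests.get(f"{url}/health").status_code == 200.
    Exceptions from the probe (connection refused while the container
    starts) are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if probe():
                return True
        except Exception:
            pass  # endpoint not up yet; keep waiting
        sleep(interval_s)
    return False
```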
---

## 💡 Best Practices

1. **Use same Docker image** for both platforms to ensure consistency
2. **Tag images with versions** (e.g., `v1.0.0`) for rollback capability
3. **Test locally first** before pushing to registries
4. **Monitor costs** on both platforms (HF per-second, Azure per-hour)
5. **Set up alerts** for endpoint health on both platforms
6. **Keep API keys secure** (use Azure Key Vault for Azure AI Foundry)

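Practice 2 can be scripted so every push gets both a version tag and `latest`. A sketch (registry names mirror those above; `DRY_RUN` defaults to 1 here so the script prints the docker commands instead of executing them, set `DRY_RUN=0` to run them for real):

```shell
#!/bin/bash
set -e

# Sketch: push a version tag alongside `latest` to both registries.
VERSION="${1:-v1.0.0}"          # e.g. ./tag_and_push.sh v1.2.3
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only
REGISTRIES="sam3acr4hf.azurecr.io/sam3-hf sam3acr.azurecr.io/sam3-foundry"

run() { if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi; }

run docker build -t "sam3:${VERSION}" .
for repo in $REGISTRIES; do
  run docker tag "sam3:${VERSION}" "${repo}:${VERSION}"
  run docker tag "sam3:${VERSION}" "${repo}:latest"
  run docker push "${repo}:${VERSION}"
  run docker push "${repo}:latest"
done
```

With a version tag in each registry, rolling back is a matter of pointing the endpoint at the previous tag instead of rebuilding.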
---

## 📚 Resources

### HuggingFace
- [Inference Endpoints Docs](https://huggingface.co/docs/inference-endpoints)
- [Custom Docker Images](https://huggingface.co/docs/inference-endpoints/guides/custom_container)

### Azure AI Foundry
- [Azure ML Endpoints](https://learn.microsoft.com/azure/machine-learning/concept-endpoints)
- [Deploy Custom Containers](https://learn.microsoft.com/azure/machine-learning/how-to-deploy-custom-container)
- [GPU Quota Requests](https://learn.microsoft.com/azure/machine-learning/how-to-manage-quotas)

---

**Last Updated**: 2025-11-22
**Next Step**: Deploy to Azure AI Foundry once GPU quota is approved