# Deploying LearningStudio Wrapper to Hugging Face

This guide explains how to deploy the LearningStudio callout detection wrapper to a HuggingFace Inference Endpoint.

## Prerequisites

1. **HuggingFace Account**: Create an account at [huggingface.co](https://huggingface.co)
2. **HuggingFace CLI**: Install the CLI tool
3. **AWS Infrastructure**: The callout detection Lambda stack must be deployed

### Install HuggingFace CLI

```bash
pip install huggingface_hub
```

### Login to HuggingFace

```bash
huggingface-cli login
```

Follow the prompts to enter your HuggingFace token.

## Step 1: Get AWS API Gateway Info

After deploying the callout detection Lambda stack, get the API Gateway URL and key:

```bash
cd callout-detection-lambda

# Get the API Gateway endpoint URL
aws cloudformation describe-stacks \
    --stack-name callout-detection-dev \
    --query "Stacks[0].Outputs[?OutputKey=='ServiceEndpoint'].OutputValue" \
    --output text

# Get the API key
aws apigateway get-api-keys \
    --name-query "learningstudio-key-dev" \
    --include-values \
    --query "items[0].value" \
    --output text
```

Save both values; you'll need them when configuring the HF endpoint secrets in Step 5.

## Step 2: Create HuggingFace Model Repository

Create the model repository (first time only):

```bash
huggingface-cli repo create YOUR_USERNAME/learningstudio-callout-wrapper --type model
```

Or create via the HuggingFace web interface at https://huggingface.co/new

## Step 3: Upload Wrapper Files

Navigate to the wrapper directory and upload files:

```bash
cd callout-detection-lambda/hf_inference/learningstudio_wrapper

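# Upload the whole directory to the repository root.
# `upload` takes one local path and one path in the repo, not a list of files,
# so make sure the directory contains only the wrapper files.
huggingface-cli upload YOUR_USERNAME/learningstudio-callout-wrapper . . \
    --repo-type model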
```

## Step 4: Create Inference Endpoint

1. Go to https://ui.endpoints.huggingface.co/
2. Click "New endpoint"
3. Select your model repository (`YOUR_USERNAME/learningstudio-callout-wrapper`)
4. Configure the endpoint:
   - **Instance type**: CPU (this wrapper doesn't need GPU)
   - **Region**: Choose a region close to your API Gateway
   - **Scaling**: Start with 1 replica

## Step 5: Configure Secrets

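In the HuggingFace Inference Endpoint settings, add the API Gateway URL and key as secrets so the wrapper can read them as environment variables:

1. Go to your endpoint settings
2. Open the section for environment variables and secrets (the label may be "Settings" or "Environment Variables")
3. Add the following secrets: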

| Name | Value |
|------|-------|
| `API_GATEWAY_URL` | `https://xxx.execute-api.us-east-1.amazonaws.com/dev` |
| `API_KEY` | Your API key from Step 1 |
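
Inside the endpoint container, these values are read from the environment. The sketch below shows one way a handler could load them at startup and fail fast when they are missing. `load_gateway_config` is a hypothetical helper, and the actual `handler.py` may do this differently; the error text is the one listed under Troubleshooting.

```python
import os
from typing import Mapping, Tuple


def load_gateway_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the API Gateway URL and key from the environment, failing fast if missing."""
    url = env.get("API_GATEWAY_URL", "").strip()
    key = env.get("API_KEY", "").strip()
    if not url or not key:
        raise ValueError("API_GATEWAY_URL and API_KEY must be set")
    # Drop a trailing slash so request paths can be appended without doubling it.
    return url.rstrip("/"), key


# Example with an explicit mapping instead of the real environment:
url, key = load_gateway_config(
    {
        "API_GATEWAY_URL": "https://xxx.execute-api.us-east-1.amazonaws.com/dev/",
        "API_KEY": "example-key",
    }
)
print(url)
```

Passing the mapping as a parameter keeps the check easy to exercise locally without touching real secrets.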

## Step 6: Test the Endpoint

Once the endpoint is running, test it:

```bash
# Set your HuggingFace token
export HF_TOKEN="your-hf-token"

# Test with a URL
curl -X POST https://YOUR_ENDPOINT.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "https://example.com/test-drawing.png"}'
```
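
The same request can be sent from Python using only the standard library. This is a sketch rather than an official client: `build_request` and `detect_callouts` are hypothetical names, and the 180-second default timeout is an assumption chosen to exceed the 30-120 second pipeline time mentioned under Troubleshooting.

```python
import json
import urllib.request


def build_request(endpoint_url: str, hf_token: str, image_url: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"inputs": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
    )


def detect_callouts(
    endpoint_url: str, hf_token: str, image_url: str, timeout_s: float = 180.0
) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(endpoint_url, hf_token, image_url)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

To use it, call `detect_callouts("https://YOUR_ENDPOINT.endpoints.huggingface.cloud", os.environ["HF_TOKEN"], "https://example.com/test-drawing.png")` with your own endpoint URL and token.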

Expected response:

```json
{
  "predictions": [
    {
      "id": 1,
      "label": "callout",
      "class_id": 0,
      "confidence": 0.95,
      "bbox": {"x1": 100, "y1": 200, "x2": 300, "y2": 400}
    }
  ],
  "total_detections": 1,
  "image": "...",
  "image_width": 1920,
  "image_height": 1080
}
```
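
A response in this shape can be turned into typed objects for downstream use. The following is a minimal sketch based only on the fields shown above; the `Detection` class, `parse_detections`, and the 0.5 default confidence threshold are illustrative assumptions, not part of the wrapper.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str
    confidence: float
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def area(self) -> int:
        """Bounding-box area in pixels (zero for degenerate boxes)."""
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def parse_detections(response: dict, min_confidence: float = 0.5) -> List[Detection]:
    """Convert the "predictions" list into Detection objects at or above a threshold."""
    detections = []
    for item in response.get("predictions", []):
        if item["confidence"] < min_confidence:
            continue
        box = item["bbox"]
        detections.append(
            Detection(item["label"], item["confidence"], box["x1"], box["y1"], box["x2"], box["y2"])
        )
    return detections


sample = json.loads("""
{
  "predictions": [
    {"id": 1, "label": "callout", "class_id": 0, "confidence": 0.95,
     "bbox": {"x1": 100, "y1": 200, "x2": 300, "y2": 400}}
  ],
  "total_detections": 1,
  "image_width": 1920,
  "image_height": 1080
}
""")
for det in parse_detections(sample):
    print(det.label, det.confidence, det.area)  # callout 0.95 40000
```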

## Updating the Wrapper

To update the wrapper code:

```bash
cd callout-detection-lambda/hf_inference/learningstudio_wrapper

# Upload the updated directory to the repository root.
# `upload` takes one local path and one path in the repo, not a list of files.
huggingface-cli upload YOUR_USERNAME/learningstudio-callout-wrapper . . \
    --repo-type model
```

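Running replicas keep serving the code they loaded at startup. Whether a new commit is picked up on the next restart or only after you update the endpoint's revision depends on how the endpoint is configured, so check the endpoint settings if the change does not appear.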

## Rotating API Keys

To rotate the API key without re-uploading the wrapper code:

1. Create a new API key in AWS API Gateway and associate it with the usage plan
2. Update the `API_KEY` secret in HF endpoint settings
3. Delete the old API key in AWS once requests succeed with the new one

## Troubleshooting

### "API_GATEWAY_URL and API_KEY must be set"

The environment variables are not configured. Go to your endpoint settings and add the secrets.

### Timeout errors

The callout detection pipeline typically takes 30-120 seconds, so client timeouts should be longer than that. If you still get timeouts:
- Check that the Lambda stack is deployed and working
- Verify the API Gateway URL is correct
- Check CloudWatch logs for the Lambda functions
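
On the client side, one option is to retry a timed-out call a small number of times with increasing delays. The sketch below assumes timeouts are transient; `call_with_retries`, the attempt count, and the delay values are illustrative and not part of the wrapper. Depending on the HTTP client, a timeout may surface as a different exception type, so adjust the `except` clause accordingly. Each retry triggers a full pipeline run, so keep the attempt count low.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last timeout.
            sleep(base_delay_s * 2 ** attempt)  # 5 s, 10 s, 20 s, ...
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the backoff can be replaced in tests without waiting.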

### Authentication errors

- Verify the API key is correct
- Check that the key hasn't been deleted or rotated
- Ensure the key is associated with the usage plan

### Connection refused

- Verify the API Gateway URL is correct
- Check that the endpoint is in the right region
- Ensure the Lambda stack is deployed

## Monitoring

- **HuggingFace**: Check endpoint logs in the HF dashboard
- **AWS CloudWatch**: Monitor Lambda function logs and metrics
- **API Gateway**: View API Gateway metrics for request counts and errors