# Audio Flamingo 3 Caption Endpoint Template
Use this as a custom `handler.py` runtime for a Hugging Face Dedicated Endpoint.
## Request contract
```json
{
  "inputs": {
    "prompt": "Analyze this full song and summarize arrangement changes.",
    "audio_base64": "<base64-encoded WAV bytes>",
    "max_new_tokens": 1200,
    "temperature": 0.1
  }
}
```
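As a minimal sketch of how a client might assemble this request body, the snippet below base64-encodes WAV bytes and fills in the four fields of the contract. The helper name `build_payload` is hypothetical, and the placeholder bytes stand in for a real WAV file; sending the body over HTTP is left out because the endpoint URL and token depend on your deployment.

```python
import base64
import json


def build_payload(wav_bytes, prompt, max_new_tokens=1200, temperature=0.1):
    """Return a request body (as a dict) shaped like the contract above.

    Hypothetical helper for illustration; not part of this repo.
    """
    return {
        "inputs": {
            "prompt": prompt,
            # The contract expects base64 text, so encode the raw bytes
            # and decode the result to a plain ASCII string.
            "audio_base64": base64.b64encode(wav_bytes).decode("ascii"),
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        }
    }


# Placeholder bytes; in practice read them with open(path, "rb").read().
payload = build_payload(b"RIFF....WAVE", "Summarize arrangement changes.")
body = json.dumps(payload)  # JSON string ready to send as the POST body
```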
## Response contract
```json
{
  "generated_text": "..."
}
```
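On the client side, the response can be validated before use. The following is a small sketch, assuming the body arrives as a JSON string in exactly the shape shown above; the function name `extract_text` is hypothetical.

```python
import json


def extract_text(response_body):
    """Parse a response body and return its "generated_text" field.

    Raises ValueError if the body does not match the response contract.
    """
    data = json.loads(response_body)
    if not isinstance(data, dict) or "generated_text" not in data:
        raise ValueError("response does not match the expected contract")
    return data["generated_text"]


text = extract_text('{"generated_text": "Verse one is sparse."}')
```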
## Setup
Fastest path from this repo:
```bash
python scripts/hf_clone.py af3-endpoint --repo-id YOUR_USERNAME/YOUR_AF3_ENDPOINT_REPO
```
Then deploy a Dedicated Endpoint from that model repo.
Important: make sure your endpoint repo contains these files at the top level:
- `handler.py`
- `requirements.txt`
- `README.md`
Use endpoint task `custom` so the runtime loads `handler.py` instead of a default Transformers pipeline.
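For orientation, a custom handler is a module that exposes an `EndpointHandler` class with `__init__` and `__call__`. The sketch below shows only that general shape against the request and response contracts in this README. It is not the `handler.py` shipped in this repo: model loading and inference are replaced by a stub so the example runs without a GPU or any AF3 dependencies.

```python
import base64
from typing import Any, Dict


class EndpointHandler:
    """Illustrative skeleton only; the real handler loads and runs AF3."""

    def __init__(self, path: str = ""):
        # A real handler would load the model and processor here.
        # This sketch just remembers the repo path it was given.
        self.path = path

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        inputs = data.get("inputs", {})
        prompt = inputs.get("prompt", "")
        # Decode the base64 field back into raw WAV bytes.
        audio = base64.b64decode(inputs.get("audio_base64", ""))
        # Stub in place of model inference (assumption for illustration).
        text = f"[stub] prompt={prompt!r}, audio_bytes={len(audio)}"
        return {"generated_text": text}


handler = EndpointHandler()
request = {
    "inputs": {
        "prompt": "Describe the intro.",
        "audio_base64": base64.b64encode(b"abc").decode("ascii"),
    }
}
result = handler(request)
```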
## Endpoint env vars
Required:
- `AF3_MODEL_ID=nvidia/audio-flamingo-3-hf`
Optional runtime bootstrap (defaults shown):
- `AF3_BOOTSTRAP_RUNTIME=1`
- `AF3_TRANSFORMERS_SPEC=transformers==5.1.0`
- `AF3_RUNTIME_DIR=/tmp/af3_runtime`
- `AF3_STUB_TORCHVISION=1`
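The variables above could be read in Python roughly as follows. This is a sketch under the assumption that a missing `AF3_MODEL_ID` should fail fast; the function name `load_settings` and the error behavior are illustrative and may differ from what the shipped handler does.

```python
import os

# Defaults copied from the list above.
OPTIONAL_DEFAULTS = {
    "AF3_BOOTSTRAP_RUNTIME": "1",
    "AF3_TRANSFORMERS_SPEC": "transformers==5.1.0",
    "AF3_RUNTIME_DIR": "/tmp/af3_runtime",
    "AF3_STUB_TORCHVISION": "1",
}


def load_settings(env=None):
    """Return all AF3_* settings as strings, applying the optional defaults.

    `env` defaults to os.environ; pass a dict to test without touching
    the real environment.
    """
    env = os.environ if env is None else env
    model_id = env.get("AF3_MODEL_ID")
    if not model_id:
        # Assumption: treat the required variable as a hard requirement.
        raise RuntimeError("AF3_MODEL_ID is required")
    settings = {"AF3_MODEL_ID": model_id}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings


settings = load_settings({"AF3_MODEL_ID": "nvidia/audio-flamingo-3-hf"})
```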
## Notes
- Audio Flamingo 3 is large; use a GPU endpoint.
- First boot can take longer because the handler installs AF3-compatible runtime dependencies.
- This handler returns raw prose analysis. Use the local AF3+ChatGPT pipeline to normalize to LoRA sidecar JSON.