from pathlib import Path

content = """# ACE-Step 1.5 Deployment Context (Hugging Face Endpoint) — Handoff Notes

## Objective
Set up `ACE-Step/Ace-Step1.5` on Hugging Face so we can:
1. Serve music generations through a private endpoint (token-protected),
2. Run on GPU (A100 preferred),
3. Control costs aggressively (scale-to-zero / pause when idle),
4. Call the endpoint from local scripts/app backend,
5. Transition from a sine-wave smoke test to real ACE-Step generations.

---

## Current State (What’s Already Done)

- Hugging Face CLI auth works in the terminal (`hf auth` commands are usable).
- A private dedicated endpoint exists:
  - `https://xr81s77sis7hoggq.us-east-1.aws.endpoints.huggingface.cloud`
- Endpoint is connected to a custom repo containing `handler.py`.
- Smoke-test `handler.py` was deployed and tested successfully:
  - Returns a base64-encoded WAV containing a synthetic sine wave plus noise.
- The local `.bat` + `.ps1` test flow successfully calls the endpoint and saves the returned `.wav`.

---

## Key Constraint Discovered

`ACE-Step/Ace-Step1.5` is **not** a one-click “Model Catalog verified” endpoint deployment.
When deploying the model repo directly, HF shows a warning that:
- the model has no verified config,
- the repo has no `handler.py`.

### Implication
Use a **custom endpoint repo** (our own repo) with:
- `handler.py`
- `requirements.txt`
Then load `ACE-Step/Ace-Step1.5` from code at runtime.
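
A minimal sketch of the `handler.py` layout, assuming the custom-handler convention for Inference Endpoints (a class named `EndpointHandler` exposing `__init__(path)` and `__call__(data)`). The `_load_model` helper is hypothetical and deliberately left as a stub; the actual loading code should come from the ACE-Step repo.

```python
from typing import Any, Dict

MODEL_ID = "ACE-Step/Ace-Step1.5"  # model source named in these notes


class EndpointHandler:
    def __init__(self, path: str = "") -> None:
        # `path` is the local checkout of the custom endpoint repo.
        self.repo_path = path
        # Load once at container startup so requests do not pay the load cost.
        self.model = self._load_model(MODEL_ID)

    def _load_model(self, model_id: str) -> Any:
        # Hypothetical stub: replace with the loading code from the ACE-Step repo.
        return None

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Requests arrive as {"inputs": {...}}.
        inputs = data.get("inputs", {})
        if self.model is None:
            return {"error": "model not loaded", "received_keys": sorted(inputs)}
        raise NotImplementedError("wire in ACE-Step generation here")
```

Until the stub is replaced, the handler only echoes which input keys it received, which is enough to confirm the repo wiring.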

---

## Important Product/Infra Notes

### ZeroGPU vs Dedicated Endpoints
- **ZeroGPU** applies to **Spaces** (good for demos/prototypes).
- For production-like API serving, use **Dedicated Inference Endpoints**.

### Idle Scaling
- In the Dedicated Endpoints UI, the shortest scale-to-zero idle window observed is **15 minutes**; the current UI offers nothing shorter.
- To stop all billing immediately, **Pause** the endpoint. A paused endpoint stays down until it is resumed manually.
- Scale-to-zero (min replicas 0) suits auto-wake behavior: the endpoint restarts on the next request, at the cost of a cold start.
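
Both cost controls can also be scripted. A minimal sketch, assuming the `huggingface_hub` client's `get_inference_endpoint` function and the `pause()` / `scale_to_zero()` methods on the endpoint object it returns. The endpoint name `ace-step-endpoint` is a placeholder, and `stop_billing` is a helper invented for this example.

```python
def stop_billing(endpoint, hard: bool = True) -> str:
    # hard=True  -> pause: stays down until resumed manually.
    # hard=False -> scale to zero: wakes on the next request (cold start).
    if hard:
        endpoint.pause()
        return "paused"
    endpoint.scale_to_zero()
    return "scaled_to_zero"


def main() -> None:
    # Imported lazily so the helper above can be exercised without the package.
    from huggingface_hub import get_inference_endpoint

    endpoint = get_inference_endpoint("ace-step-endpoint")  # placeholder name
    print(stop_billing(endpoint, hard=True))
```

Call `main()` after replacing the placeholder with the real endpoint name.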

---

## Files in Custom Endpoint Repo (Expected)

- `handler.py`  -> custom inference logic
- `requirements.txt` -> runtime dependencies
- `README.md` -> optional docs/config context

---

## Smoke-Test Handler Behavior (Current)

Current handler:
- Does NOT load the ACE-Step model.
- Generates synthetic audio with numpy (sine wave + noise).
- Returns:
  - `audio_base64_wav`
  - `sample_rate`
  - `duration_sec`

This validated the endpoint wiring, auth, request/response format, and the client decode pipeline.
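
For reference, the smoke-test behavior can be sketched as below. This is not the deployed handler: it swaps numpy and `soundfile` for the standard library (`math`, `random`, `wave`) so it runs anywhere, and the tone frequency, amplitude, and noise level are assumed values. The response keys match the ones listed above.

```python
import base64
import io
import math
import random
import sys
import wave
from array import array


def synth_wav_b64(duration_sec: float = 1.0, sample_rate: int = 44100,
                  freq_hz: float = 440.0, seed: int = 0) -> dict:
    rng = random.Random(seed)  # seeded so repeated calls give identical audio
    n = int(duration_sec * sample_rate)
    samples = array("h")  # signed 16-bit PCM
    for i in range(n):
        tone = 0.3 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        noise = 0.01 * rng.uniform(-1.0, 1.0)
        samples.append(int((tone + noise) * 32767))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV stores samples little-endian
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 2 bytes per sample
        w.setframerate(sample_rate)
        w.writeframes(samples.tobytes())
    return {
        "audio_base64_wav": base64.b64encode(buf.getvalue()).decode("ascii"),
        "sample_rate": sample_rate,
        "duration_sec": duration_sec,
    }
```

The same WAV-to-base64 step applies once real ACE-Step output replaces the synthetic samples.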

---

## What Needs to Happen Next (Critical Path)

### 1) Replace fallback generation with real ACE-Step inference

In `handler.py`:

- `__init__`:
  - Load ACE-Step pipeline/model once at container startup.
  - Use model source `ACE-Step/Ace-Step1.5`.
  - Move model to CUDA when available.

- `__call__`:
  - Parse request inputs:
    - `prompt`
    - `lyrics`
    - `duration_sec`
    - `sample_rate`
    - `seed`
    - optional: `guidance_scale`, `steps`, `use_lm`
  - Execute ACE-Step generation.
  - Convert output waveform to WAV bytes.
  - Return base64 WAV in JSON.
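
The input-parsing step in `__call__` could look like the sketch below. The field names come from the list above; the default values, the duration cap, and the `GenRequest` / `parse_inputs` names are assumptions for illustration, not values confirmed by ACE-Step.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class GenRequest:
    prompt: str
    lyrics: str = ""
    duration_sec: float = 12.0   # assumed default
    sample_rate: int = 44100     # assumed default
    seed: Optional[int] = None   # None means the client sent no seed
    guidance_scale: Optional[float] = None
    steps: Optional[int] = None
    use_lm: Optional[bool] = None


def parse_inputs(data: Dict[str, Any]) -> GenRequest:
    inputs = data.get("inputs")
    if not isinstance(inputs, dict):
        raise ValueError("'inputs' must be a JSON object")
    prompt = str(inputs.get("prompt", "")).strip()
    if not prompt:
        raise ValueError("'prompt' is required")
    duration = float(inputs.get("duration_sec", 12.0))
    if not 0 < duration <= 300:  # assumed cap to bound GPU time per call
        raise ValueError("'duration_sec' out of range")

    def opt(key, cast):
        # Cast optional fields only when the client actually sent them.
        value = inputs.get(key)
        return None if value is None else cast(value)

    return GenRequest(
        prompt=prompt,
        lyrics=str(inputs.get("lyrics", "")),
        duration_sec=duration,
        sample_rate=int(inputs.get("sample_rate", 44100)),
        seed=opt("seed", int),
        guidance_scale=opt("guidance_scale", float),
        steps=opt("steps", int),
        use_lm=opt("use_lm", bool),
    )
```

Rejecting bad input before generation starts keeps malformed requests from consuming GPU time.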

### 2) Ensure dependencies are correct in `requirements.txt`
At minimum for current scaffold:
- `numpy`
- `soundfile`

Likely needed for ACE runtime:
- `torch`
- `torchaudio`
- `transformers`
- `accelerate`
- `huggingface_hub`
- any ACE-Step-specific packages listed in the ACE-Step docs/repo.

### 3) Push repo changes and redeploy/rebuild endpoint
- `git add .`
- `git commit -m "..."`
- `git push`
- Wait until the endpoint finishes rebuilding and reports a healthy status.

### 4) Run generation call with real payload
Use the same client script and include `prompt` and `lyrics` in the payload.
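
A Python equivalent of the client call, as a minimal sketch. It assumes the endpoint accepts a JSON POST with a bearer token and returns a JSON object containing `audio_base64_wav`, as the smoke-test handler does. The `HF_TOKEN` environment variable name, the output filename, the timeout, and the helper names are choices made for this example.

```python
import base64
import json
import os
import urllib.request

ENDPOINT_URL = "https://xr81s77sis7hoggq.us-east-1.aws.endpoints.huggingface.cloud"


def build_request(url: str, token: str, inputs: dict) -> urllib.request.Request:
    # Wrap the fields in {"inputs": ...} and attach the access token.
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def save_wav(response_json: dict, out_path: str) -> int:
    # Decode the base64 payload and write it to disk; returns bytes written.
    audio = base64.b64decode(response_json["audio_base64_wav"])
    with open(out_path, "wb") as f:
        return f.write(audio)


def generate(inputs: dict, out_path: str = "output.wav") -> int:
    req = build_request(ENDPOINT_URL, os.environ["HF_TOKEN"], inputs)
    # Generous timeout: a scaled-to-zero endpoint needs time to cold start.
    with urllib.request.urlopen(req, timeout=600) as resp:
        return save_wav(json.load(resp), out_path)
```

For example, `generate({"prompt": "upbeat pop rap with emotional guitar", "lyrics": "[Verse] city lights and midnight rain"})` would write `output.wav` to the working directory.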

---

## Request Payload Contract (Target)

```json
{
  "inputs": {
    "prompt": "upbeat pop rap with emotional guitar",
    "lyrics": "[Verse] city lights and midnight rain",
    "duration_sec": 12,
    "sample_rate": 44100,
    "seed": 42,
    "guidance_scale": 7.0,
    "steps": 50,
    "use_lm": true
  }
}