Spaces:
Running
Running
| # API Documentation | |
| ## Overview | |
| This document describes the new API endpoints added to the AFS backend for face recognition and audio streaming. | |
| ## Face Recognition APIs | |
| ### 1. Upload 360-Degree Reference Video | |
| **Endpoint:** `POST /api/face/upload-video` | |
| **Description:** Upload a 360-degree reference video for face recognition training. The video will be processed to extract high-quality face embeddings. | |
| **Authentication:** Required (JWT token) | |
| **Request:** | |
| - Content-Type: `multipart/form-data` | |
| - Body: `file` (video file - .mp4, .avi, .mov, .mkv) | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "message": "Video processed successfully", | |
| "frames_used": 10, | |
| "embeddings_count": 1 | |
| } | |
| ``` | |
| ### 2. Upload Reference Image | |
| **Endpoint:** `POST /api/face/upload-image` | |
| **Description:** Upload a single reference image for face recognition. | |
| **Authentication:** Required (JWT token) | |
| **Request:** | |
| - Content-Type: `multipart/form-data` | |
| - Body: `file` (image file - .jpg, .jpeg, .png) | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "message": "Image processed successfully", | |
| "embeddings_count": 1, | |
| "saved_path": "/path/to/Model/ref_image.jpg" | |
| } | |
| ``` | |
| ### 3. Get Cache Status | |
| **Endpoint:** `GET /api/face/cache-status` | |
| **Description:** Check if face recognition embeddings are cached and ready to use. | |
| **Authentication:** Required (JWT token) | |
| **Response (Cached):** | |
| ```json | |
| { | |
| "ok": true, | |
| "cached": true, | |
| "video_path": "my_scan.mp4", | |
| "model_name": "ArcFace", | |
| "num_frames_used": 10, | |
| "version": 2 | |
| } | |
| ``` | |
| **Response (Not Cached):** | |
| ```json | |
| { | |
| "ok": true, | |
| "cached": false, | |
| "message": "No cache found. Please upload a reference video or image." | |
| } | |
| ``` | |
| ## Audio Streaming APIs | |
| ### 1. Start Audio Stream | |
| **Endpoint:** `POST /api/audio/start-stream` | |
| **Description:** Start a new audio recording session. Returns a session ID for streaming. | |
| **Authentication:** Required (JWT token) | |
| **Request:** | |
| - Content-Type: `multipart/form-data` | |
| - Body: | |
| - `sample_rate` (optional, default: 16000) | |
| - `channels` (optional, default: 1 for mono, 2 for stereo) | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "session_id": "uuid-here", | |
| "filename": "/path/to/Model/audio_recordings/audio_uuid_timestamp.wav", | |
| "sample_rate": 16000, | |
| "channels": 1 | |
| } | |
| ``` | |
| ### 2. Audio WebSocket Stream | |
| **Endpoint:** `WebSocket /ws/audio/{session_id}` | |
| **Description:** WebSocket endpoint for streaming audio data with optional angle information. | |
| **Authentication:** Not required at WebSocket level (use session_id from start-stream) | |
| **Send (Binary Audio Data):** | |
| ``` | |
| WebSocket Binary Message: raw audio bytes (16-bit PCM) | |
| ``` | |
| **Send (JSON with Angle):** | |
| ```json | |
| { | |
| "audio_data": "base64-encoded-audio-bytes", | |
| "angle": 45.5 | |
| } | |
| ``` | |
| **Send (Stop Command):** | |
| ```json | |
| { | |
| "command": "stop" | |
| } | |
| ``` | |
| **Receive:** | |
| ```json | |
| { | |
| "status": "received", | |
| "bytes": 1024 | |
| } | |
| ``` | |
| or | |
| ```json | |
| { | |
| "status": "received", | |
| "angle": 45.5 | |
| } | |
| ``` | |
| ### 3. Stop Audio Stream | |
| **Endpoint:** `POST /api/audio/stop-stream/{session_id}` | |
| **Description:** Stop an active audio recording stream. | |
| **Authentication:** Required (JWT token) | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "message": "Audio stream stopped successfully" | |
| } | |
| ``` | |
| ### 4. List Audio Recordings | |
| **Endpoint:** `GET /api/audio/recordings` | |
| **Description:** Get a list of all audio recordings. | |
| **Authentication:** Required (JWT token) | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "recordings": [ | |
| "/path/to/Model/audio_recordings/audio_uuid1_timestamp1.wav", | |
| "/path/to/Model/audio_recordings/audio_uuid2_timestamp2.wav" | |
| ], | |
| "count": 2 | |
| } | |
| ``` | |
| ### 5. Get Angle Metadata for Session | |
| **Endpoint:** `GET /api/audio/angles/{session_id}` | |
| **Description:** Retrieve angle data collected during an audio streaming session. | |
| **Authentication:** Required (JWT token) | |
| **Parameters:** | |
| - `session_id` (path parameter): The UUID of the audio session | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "session_id": "uuid-here", | |
| "angles": [ | |
| {"timestamp": 0.000, "angle": 45.50}, | |
| {"timestamp": 0.064, "angle": 46.20}, | |
| {"timestamp": 0.128, "angle": 47.00} | |
| ], | |
| "count": 3 | |
| } | |
| ``` | |
| ### 6. Download Audio File | |
| **Endpoint:** `GET /api/audio/download/{session_id}` | |
| **Description:** Download the recorded audio file (.wav) for a specific session. | |
| **Authentication:** Required (JWT token) | |
| **Parameters:** | |
| - `session_id` (path parameter): The UUID of the audio session | |
| **Response:** | |
| - Binary WAV file with `Content-Type: audio/wav` | |
| - File download with appropriate filename header | |
| ### 7. Set Desired Angle | |
| **Endpoint:** `POST /api/audio/set-angle/{session_id}` | |
| **Description:** Send a desired/target angle to the audio processing backend for a session. | |
| **Authentication:** Required (JWT token) | |
| **Parameters:** | |
| - `session_id` (path parameter): The UUID of the audio session | |
| - `angle` (form parameter, required): Desired angle in degrees (0-360) | |
| **Request:** | |
| ``` | |
| POST /api/audio/set-angle/session-uuid-here | |
| Content-Type: multipart/form-data | |
| angle=45.5 | |
| ``` | |
| **Response:** | |
| ```json | |
| { | |
| "ok": true, | |
| "message": "Desired angle set to 45.5°", | |
| "session_id": "uuid-here", | |
| "angle": 45.5 | |
| } | |
| ``` | |
| **Error Response (Invalid Angle):** | |
| ```json | |
| { | |
| "detail": "Angle must be between 0 and 360 degrees" | |
| } | |
| ``` | |
| ## File Storage | |
| All uploaded files and processed data are stored in the `/Model/` directory: | |
| - **Reference Videos:** `/Model/my_scan.mp4` (overwritten on each upload) | |
| - **Reference Images:** `/Model/ref_{filename}` | |
| - **Embeddings Cache:** `/Model/embeddings_cache.pkl` | |
| - **Audio Recordings:** `/Model/audio_recordings/audio_{session_id}_{timestamp}.wav` | |
| - **Audio Metadata:** `/Model/audio_recordings/audio_{session_id}_{timestamp}_metadata.txt` | |
| ## Metadata Format | |
| Audio metadata files contain timestamp and angle data in CSV format: | |
| ``` | |
| timestamp,angle | |
| 0.000,45.50 | |
| 0.064,46.20 | |
| 0.128,47.00 | |
| ``` | |
| ## Usage Example (Python) | |
| ```python | |
| import requests | |
| import websockets | |
| import asyncio | |
| # 1. Upload reference video | |
| with open("my_360_scan.mp4", "rb") as f: | |
| response = requests.post( | |
| "http://localhost:8000/api/face/upload-video", | |
| files={"file": f}, | |
| headers={"Authorization": f"Bearer {token}"} | |
| ) | |
| print(response.json()) | |
| # 2. Start audio stream | |
| response = requests.post( | |
| "http://localhost:8000/api/audio/start-stream", | |
| data={"sample_rate": 16000, "channels": 1}, | |
| headers={"Authorization": f"Bearer {token}"} | |
| ) | |
| session_id = response.json()["session_id"] | |
| # 3. Stream audio via WebSocket | |
| async def stream_audio(): | |
| uri = f"ws://localhost:8000/ws/audio/{session_id}" | |
| async with websockets.connect(uri) as websocket: | |
| # Send audio chunk with angle | |
| await websocket.send(json.dumps({ | |
| "audio_data": base64.b64encode(audio_bytes).decode(), | |
| "angle": 45.5 | |
| })) | |
| # Or send raw binary | |
| await websocket.send(audio_bytes) | |
| # Stop when done | |
| await websocket.send(json.dumps({"command": "stop"})) | |
| asyncio.run(stream_audio()) | |
| # 4. Get angle data for a session | |
| response = requests.get( | |
| f"http://localhost:8000/api/audio/angles/{session_id}", | |
| headers={"Authorization": f"Bearer {token}"} | |
| ) | |
| angles = response.json()["angles"] | |
| print(f"Recorded {len(angles)} angle measurements") | |
| # 5. Download recorded audio | |
| response = requests.get( | |
| f"http://localhost:8000/api/audio/download/{session_id}", | |
| headers={"Authorization": f"Bearer {token}"} | |
| ) | |
| with open("downloaded_audio.wav", "wb") as f: | |
| f.write(response.content) | |
| # 6. Send desired angle to backend | |
| response = requests.post( | |
| f"http://localhost:8000/api/audio/set-angle/{session_id}", | |
| data={"angle": 90.0}, | |
| headers={"Authorization": f"Bearer {token}"} | |
| ) | |
| print(response.json()) | |
| ``` | |