	# API Documentation

	## Overview
	This document describes the new API endpoints added to the AFS backend for face recognition and audio streaming.

	## Face Recognition APIs

	### 1. Upload 360-Degree Reference Video
	Endpoint: `POST /api/face/upload-video`

	Description: Upload a 360-degree reference video for face recognition training. The video will be processed to extract high-quality face embeddings.

	Authentication: Required (JWT token)

	Request:
	- Content-Type: `multipart/form-data`
	- Body: `file` (video file - .mp4, .avi, .mov, .mkv)

	Response:
	```json
	{
	  "ok": true,
	  "message": "Video processed successfully",
	  "frames_used": 10,
	  "embeddings_count": 1
	}
	```

	### 2. Upload Reference Image
	Endpoint: `POST /api/face/upload-image`

	Description: Upload a single reference image for face recognition.

	Authentication: Required (JWT token)

	Request:
	- Content-Type: `multipart/form-data`
	- Body: `file` (image file - .jpg, .jpeg, .png)

	Response:
	```json
	{
	  "ok": true,
	  "message": "Image processed successfully",
	  "embeddings_count": 1,
	  "saved_path": "/path/to/Model/ref_image.jpg"
	}
	```
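Because the two upload endpoints accept different file types, a client may want to choose the endpoint from the file extension before sending anything. The following is a minimal sketch: `pick_upload_endpoint` is a hypothetical helper, not part of the backend, and case-insensitive matching is an assumption about how the server treats extensions.

```python
from pathlib import Path

# Extensions listed in the request descriptions above.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def pick_upload_endpoint(filename: str) -> str:
    """Return the upload endpoint path that matches the file's extension."""
    suffix = Path(filename).suffix.lower()  # assumption: case does not matter
    if suffix in VIDEO_EXTENSIONS:
        return "/api/face/upload-video"
    if suffix in IMAGE_EXTENSIONS:
        return "/api/face/upload-image"
    raise ValueError(f"Unsupported reference file type: {filename!r}")
```

Rejecting unsupported files on the client avoids a wasted upload, but the server's own validation remains authoritative.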

	### 3. Get Cache Status
	Endpoint: `GET /api/face/cache-status`

	Description: Check if face recognition embeddings are cached and ready to use.

	Authentication: Required (JWT token)

	Response (Cached):
	```json
	{
	  "ok": true,
	  "cached": true,
	  "video_path": "my_scan.mp4",
	  "model_name": "ArcFace",
	  "num_frames_used": 10,
	  "version": 2
	}
	```

	Response (Not Cached):
	```json
	{
	  "ok": true,
	  "cached": false,
	  "message": "No cache found. Please upload a reference video or image."
	}
	```
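A client can branch on the `cached` flag to decide whether a reference upload is still needed. The sketch below assumes the two response shapes shown above; `describe_cache_status` is a hypothetical helper name.

```python
def describe_cache_status(payload: dict) -> str:
    """Summarise a cache-status response as a single human-readable line."""
    if not payload.get("ok"):
        return "cache-status request failed"
    if payload.get("cached"):
        return (
            f"ready: {payload['model_name']} embeddings from "
            f"{payload['num_frames_used']} frames of {payload['video_path']}"
        )
    # Not cached: fall back to the server's own message when present.
    return payload.get("message", "no cache")
```

In practice the payload would come from `response.json()` on a `GET /api/face/cache-status` call.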

	## Audio Streaming APIs

	### 1. Start Audio Stream
	Endpoint: `POST /api/audio/start-stream`

	Description: Start a new audio recording session. Returns a session ID for streaming.

	Authentication: Required (JWT token)

	Request:
	- Content-Type: `multipart/form-data`
	- Body:
	  - `sample_rate` (optional, default: 16000)
	  - `channels` (optional, default: 1; 1 = mono, 2 = stereo)

	Response:
	```json
	{
	  "ok": true,
	  "session_id": "uuid-here",
	  "filename": "/path/to/Model/audio_recordings/audio_uuid_timestamp.wav",
	  "sample_rate": 16000,
	  "channels": 1
	}
	```

	### 2. Audio WebSocket Stream
	Endpoint: `WebSocket /ws/audio/{session_id}`

	Description: WebSocket endpoint for streaming audio data with optional angle information.

	Authentication: Not required at the WebSocket level. Connect with the `session_id` returned by `start-stream`.

	Send (Binary Audio Data):
	```
	WebSocket Binary Message: raw audio bytes (16-bit PCM)
	```
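Since binary messages carry raw 16-bit PCM, the duration of a chunk follows from its byte length, the sample rate, and the channel count. The sketch below shows that arithmetic. `chunk_duration_seconds` is a hypothetical helper, and interleaved channels are an assumption, since the document does not state the layout of stereo data.

```python
BYTES_PER_SAMPLE = 2  # 16-bit PCM, as stated for the binary message format


def chunk_duration_seconds(num_bytes: int, sample_rate: int = 16000, channels: int = 1) -> float:
    """Return how many seconds of audio a raw PCM chunk represents."""
    frame_size = BYTES_PER_SAMPLE * channels  # one sample per channel
    if num_bytes % frame_size:
        raise ValueError("chunk does not contain a whole number of frames")
    return num_bytes / frame_size / sample_rate
```

With the default 16 kHz mono settings, a 2048-byte chunk holds 1024 samples, or 0.064 seconds of audio.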

	Send (JSON with Angle):
	```json
	{
	  "audio_data": "base64-encoded-audio-bytes",
	  "angle": 45.5
	}
	```

	Send (Stop Command):
	```json
	{
	  "command": "stop"
	}
	```
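The two JSON message shapes above can be built with the standard library alone. This is a minimal sketch; `angle_message` and `stop_message` are hypothetical helper names, and the resulting strings would be passed to the WebSocket client's send call as text frames.

```python
import base64
import json


def angle_message(audio_bytes: bytes, angle: float) -> str:
    """Build the JSON text message carrying base64 audio plus an angle."""
    return json.dumps({
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "angle": angle,
    })


def stop_message() -> str:
    """Build the JSON text message that asks the server to stop the stream."""
    return json.dumps({"command": "stop"})
```

Base64 increases the payload by roughly a third, so the raw binary form is the leaner choice when no angle needs to accompany the chunk.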

	Receive:
	```json
	{
	  "status": "received",
	  "bytes": 1024
	}
	```

	or

	```json
	{
	  "status": "received",
	  "angle": 45.5
	}
	```

	### 3. Stop Audio Stream
	Endpoint: `POST /api/audio/stop-stream/{session_id}`

	Description: Stop an active audio recording stream.

	Authentication: Required (JWT token)

	Response:
	```json
	{
	  "ok": true,
	  "message": "Audio stream stopped successfully"
	}
	```

	### 4. List Audio Recordings
	Endpoint: `GET /api/audio/recordings`

	Description: Get a list of all audio recordings.

	Authentication: Required (JWT token)

	Response:
	```json
	{
	  "ok": true,
	  "recordings": [
	    "/path/to/Model/audio_recordings/audio_uuid1_timestamp1.wav",
	    "/path/to/Model/audio_recordings/audio_uuid2_timestamp2.wav"
	  ],
	  "count": 2
	}
	```

	### 5. Get Angle Metadata for Session
	Endpoint: `GET /api/audio/angles/{session_id}`

	Description: Retrieve angle data collected during an audio streaming session.

	Authentication: Required (JWT token)

	Parameters:
	- `session_id` (path parameter): The UUID of the audio session

	Response:
	```json
	{
	  "ok": true,
	  "session_id": "uuid-here",
	  "angles": [
	    {"timestamp": 0.000, "angle": 45.50},
	    {"timestamp": 0.064, "angle": 46.20},
	    {"timestamp": 0.128, "angle": 47.00}
	  ],
	  "count": 3
	}
	```
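One common use of the `angles` list is looking up which angle was in effect at a given moment of the recording. The sketch below assumes the entries are sorted by `timestamp`, as in the sample response; `angle_at` is a hypothetical helper.

```python
from bisect import bisect_right


def angle_at(angles: list, t: float):
    """Return the most recent angle recorded at or before time t, else None."""
    timestamps = [entry["timestamp"] for entry in angles]  # assumed ascending
    index = bisect_right(timestamps, t) - 1
    return angles[index]["angle"] if index >= 0 else None
```

A binary search keeps lookups cheap for long sessions. For repeated queries, the `timestamps` list could be built once and reused.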

	### 6. Download Audio File
	Endpoint: `GET /api/audio/download/{session_id}`

	Description: Download the recorded audio file (.wav) for a specific session.

	Authentication: Required (JWT token)

	Parameters:
	- `session_id` (path parameter): The UUID of the audio session

	Response:
	- Binary WAV file with `Content-Type: audio/wav`
	- File download with appropriate filename header
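After downloading, the WAV header can be checked against the session's `sample_rate` and `channels` using Python's standard `wave` module. This is a sketch under the assumption that the server returns a standard PCM WAV file; `describe_wav` is a hypothetical helper that works on the raw bytes of the response body.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read basic format information from WAV bytes held in memory."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }
```

With `requests`, the input would be `response.content` from the download call.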

	### 7. Set Desired Angle
	Endpoint: `POST /api/audio/set-angle/{session_id}`

	Description: Send a desired/target angle to the audio processing backend for a session.

	Authentication: Required (JWT token)

	Parameters:
	- `session_id` (path parameter): The UUID of the audio session
	- `angle` (form parameter, required): Desired angle in degrees (0-360)

	Request:
	```
	POST /api/audio/set-angle/session-uuid-here
	Content-Type: application/x-www-form-urlencoded

	angle=45.5
	```

	Response:
	```json
	{
	  "ok": true,
	  "message": "Desired angle set to 45.5°",
	  "session_id": "uuid-here",
	  "angle": 45.5
	}
	```

	Error Response (Invalid Angle):
	```json
	{
	  "detail": "Angle must be between 0 and 360 degrees"
	}
	```
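Validating the angle on the client avoids a round trip that would only produce the error above. The sketch below mirrors the documented range; treating both 0 and 360 as valid is an assumption, and `validate_angle` is a hypothetical helper.

```python
def validate_angle(angle) -> float:
    """Return the angle as a float, or raise if it is outside 0-360 degrees."""
    value = float(angle)
    # The chained comparison is also False for NaN, so NaN is rejected too.
    if not 0.0 <= value <= 360.0:
        raise ValueError("Angle must be between 0 and 360 degrees")
    return value
```

The validated value can then be sent as the `angle` form field.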

	## File Storage

	All uploaded files and processed data are stored in the `/Model/` directory:

	- Reference Videos: `/Model/my_scan.mp4` (overwritten on each upload)
	- Reference Images: `/Model/ref_{filename}`
	- Embeddings Cache: `/Model/embeddings_cache.pkl`
	- Audio Recordings: `/Model/audio_recordings/audio_{session_id}_{timestamp}.wav`
	- Audio Metadata: `/Model/audio_recordings/audio_{session_id}_{timestamp}_metadata.txt`
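Given the naming patterns above, the metadata file for a recording can be derived from the recording's own path. The following is a sketch; `metadata_path_for` is a hypothetical helper and relies only on the `.wav` to `_metadata.txt` relationship shown in the list.

```python
from pathlib import PurePosixPath


def metadata_path_for(recording_path: str) -> str:
    """Map an audio recording path to its sibling metadata file path."""
    path = PurePosixPath(recording_path)
    if path.suffix != ".wav":
        raise ValueError(f"Not a .wav recording: {recording_path!r}")
    # audio_{session_id}_{timestamp}.wav -> audio_{session_id}_{timestamp}_metadata.txt
    return str(path.with_name(f"{path.stem}_metadata.txt"))
```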

	## Metadata Format

	Audio metadata files contain timestamp and angle data in CSV format:
	```
	timestamp,angle
	0.000,45.50
	0.064,46.20
	0.128,47.00
	```
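A metadata file in this format can be read with the standard `csv` module. The sketch below converts each row to floats so the result has the same shape as the `angles` list returned by the angle metadata endpoint; `parse_angle_metadata` is a hypothetical helper.

```python
import csv
import io


def parse_angle_metadata(text: str) -> list:
    """Parse 'timestamp,angle' CSV text into a list of float-valued dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"timestamp": float(row["timestamp"]), "angle": float(row["angle"])}
        for row in reader
    ]
```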

	## Usage Example (Python)

	```python
	import asyncio
	import base64
	import json

	import requests
	import websockets

	# Placeholders (not part of the API): supply your own JWT and PCM audio.
	token = "your-jwt-token"
	audio_bytes = b"\x00\x00" * 1024  # 1024 samples of 16-bit PCM silence

	# 1. Upload reference video
	with open("my_360_scan.mp4", "rb") as f:
	    response = requests.post(
	        "http://localhost:8000/api/face/upload-video",
	        files={"file": f},
	        headers={"Authorization": f"Bearer {token}"}
	    )
	print(response.json())

	# 2. Start audio stream
	response = requests.post(
	    "http://localhost:8000/api/audio/start-stream",
	    data={"sample_rate": 16000, "channels": 1},
	    headers={"Authorization": f"Bearer {token}"}
	)
	session_id = response.json()["session_id"]

	# 3. Stream audio via WebSocket
	async def stream_audio():
	    uri = f"ws://localhost:8000/ws/audio/{session_id}"
	    async with websockets.connect(uri) as websocket:
	        # Send audio chunk with angle (JSON text message)
	        await websocket.send(json.dumps({
	            "audio_data": base64.b64encode(audio_bytes).decode(),
	            "angle": 45.5
	        }))

	        # Or send raw binary
	        await websocket.send(audio_bytes)

	        # Stop when done
	        await websocket.send(json.dumps({"command": "stop"}))

	asyncio.run(stream_audio())

	# 4. Get angle data for a session
	response = requests.get(
	    f"http://localhost:8000/api/audio/angles/{session_id}",
	    headers={"Authorization": f"Bearer {token}"}
	)
	angles = response.json()["angles"]
	print(f"Recorded {len(angles)} angle measurements")

	# 5. Download recorded audio
	response = requests.get(
	    f"http://localhost:8000/api/audio/download/{session_id}",
	    headers={"Authorization": f"Bearer {token}"}
	)
	with open("downloaded_audio.wav", "wb") as f:
	    f.write(response.content)

	# 6. Send desired angle to backend
	response = requests.post(
	    f"http://localhost:8000/api/audio/set-angle/{session_id}",
	    data={"angle": 90.0},
	    headers={"Authorization": f"Bearer {token}"}
	)
	print(response.json())
	```