# Token Authentication Review - Gradio & HuggingFace
## Summary
This document reviews the implementation of token authentication for Gradio Client API calls and HuggingFace API usage to ensure tokens are always passed correctly.
## ✅ Implementation Status
### 1. Gradio Client Services
#### STT Service (`src/services/stt_gradio.py`)
- ✅ **Token Support**: Service accepts `hf_token` parameter in `__init__` and methods
- ✅ **Client Initialization**: `Client` is created with `hf_token` parameter when token is available
- ✅ **Token Priority**: Method-level token > instance-level token
- ✅ **Token Updates**: Client is recreated if token changes
**Implementation Pattern:**
```python
async def _get_client(self, hf_token: str | None = None) -> Client:
    # Method-level token takes priority over the instance-level token
    token = hf_token or self.hf_token
    if token:
        self.client = Client(self.api_url, hf_token=token)
    else:
        self.client = Client(self.api_url)
    return self.client
```
#### Image OCR Service (`src/services/image_ocr.py`)
- ✅ **Token Support**: Service accepts `hf_token` parameter in `__init__` and methods
- ✅ **Client Initialization**: `Client` is created with `hf_token` parameter when token is available
- ✅ **Token Priority**: Method-level token > instance-level token
- ✅ **Token Updates**: Client is recreated if token changes
**Same pattern as STT Service**
### 2. Service Layer Integration
#### Audio Service (`src/services/audio_processing.py`)
- ✅ **Token Passthrough**: `process_audio_input()` accepts `hf_token` and passes to STT service
- ✅ **Token Flow**: `audio_service.process_audio_input(audio, hf_token=token)`
#### Multimodal Service (`src/services/multimodal_processing.py`)
- ✅ **Token Passthrough**: `process_multimodal_input()` accepts `hf_token` and passes to both audio and OCR services
- ✅ **Token Flow**: `multimodal_service.process_multimodal_input(..., hf_token=token)`
### 3. Application Layer (`src/app.py`)
#### Token Extraction
- ✅ **OAuth Token**: Extracted from `gr.OAuthToken` via `oauth_token.token`
- ✅ **Fallback**: Uses `HF_TOKEN` or `HUGGINGFACE_API_KEY` from environment
- ✅ **Token Priority**: `oauth_token > HF_TOKEN > HUGGINGFACE_API_KEY`
**Implementation:**
```python
token_value: str | None = None
if oauth_token is not None:
    token_value = oauth_token.token if hasattr(oauth_token, "token") else None

# Fallback to env vars
effective_token = token_value or os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY")
```
#### Token Usage in Services
- ✅ **Multimodal Processing**: Token passed to `process_multimodal_input(..., hf_token=token_value)`
- ✅ **Consistent Usage**: Token is extracted once and passed through all service layers
### 4. HuggingFace API Integration
#### LLM Factory (`src/utils/llm_factory.py`)
- ✅ **Token Priority**: `oauth_token > settings.hf_token > settings.huggingface_api_key`
- ✅ **Provider Usage**: `HuggingFaceProvider(api_key=effective_hf_token)`
- ✅ **Model Usage**: `HuggingFaceModel(model_name, provider=provider)`
#### Judge Handler (`src/agent_factory/judges.py`)
- ✅ **Token Priority**: `oauth_token > settings.hf_token > settings.huggingface_api_key`
- ✅ **InferenceClient**: `InferenceClient(api_key=api_key)` when token provided
- ✅ **Fallback**: Uses `HF_TOKEN` from environment if no token provided
**Implementation:**
```python
effective_hf_token = oauth_token or settings.hf_token or settings.huggingface_api_key
hf_provider = HuggingFaceProvider(api_key=effective_hf_token)
```
### 5. MCP Tools (`src/mcp_tools.py`)
#### Image OCR Tool
- ✅ **Token Support**: `extract_text_from_image()` accepts `hf_token` parameter
- ✅ **Token Fallback**: Uses `settings.hf_token` or `settings.huggingface_api_key` if not provided
- ✅ **Service Integration**: Passes token to `ImageOCRService.extract_text()`
#### Audio Transcription Tool
- ✅ **Token Support**: `transcribe_audio_file()` accepts `hf_token` parameter
- ✅ **Token Fallback**: Uses `settings.hf_token` or `settings.huggingface_api_key` if not provided
- ✅ **Service Integration**: Passes token to `STTService.transcribe_file()`
## Token Flow Diagram
```
User Login (OAuth)
        ↓
oauth_token.token
        ↓
app.py: token_value
        ↓
┌──────────────────────────────────────┐
│            Service Layer             │
├──────────────────────────────────────┤
│ MultimodalService                    │
│   ↓ hf_token=token_value             │
│ AudioService                         │
│   ↓ hf_token=token_value             │
│ STTService / ImageOCRService         │
│   ↓ hf_token=token_value             │
│ Gradio Client(hf_token=token)        │
└──────────────────────────────────────┘

Alternative: Environment Variables
        ↓
HF_TOKEN or HUGGINGFACE_API_KEY
        ↓
settings.hf_token or settings.huggingface_api_key
        ↓
Same service flow as above
```
## Verification Checklist
- [x] STT Service accepts and uses `hf_token` parameter
- [x] Image OCR Service accepts and uses `hf_token` parameter
- [x] Audio Service passes token to STT service
- [x] Multimodal Service passes token to both audio and OCR services
- [x] App.py extracts OAuth token correctly
- [x] App.py passes token to multimodal service
- [x] HuggingFace API calls use token via `HuggingFaceProvider`
- [x] HuggingFace API calls use token via `InferenceClient`
- [x] MCP tools accept and use token parameter
- [x] Token priority is consistent: OAuth > Env Vars
- [x] Fallback to environment variables when OAuth not available
## Token Parameter Naming
All services consistently use the `hf_token` parameter name:
- `STTService.transcribe_audio(..., hf_token=...)`
- `STTService.transcribe_file(..., hf_token=...)`
- `ImageOCRService.extract_text(..., hf_token=...)`
- `ImageOCRService.extract_text_from_image(..., hf_token=...)`
- `AudioService.process_audio_input(..., hf_token=...)`
- `MultimodalService.process_multimodal_input(..., hf_token=...)`
- `extract_text_from_image(..., hf_token=...)` (MCP tool)
- `transcribe_audio_file(..., hf_token=...)` (MCP tool)
## Gradio Client API Usage
According to Gradio documentation, the `Client` constructor accepts:
```python
Client(src, hf_token=None, ...)
```
Our implementation correctly uses:
```python
Client(self.api_url, hf_token=token) # When token available
Client(self.api_url) # When no token (public Space)
```
## HuggingFace API Usage
### HuggingFaceProvider
```python
HuggingFaceProvider(api_key=effective_hf_token)
```
✅ Correctly passes token as `api_key` parameter
### InferenceClient
```python
InferenceClient(api_key=api_key) # When token provided
InferenceClient() # Falls back to HF_TOKEN env var
```
✅ Correctly passes token as `api_key` parameter
## Edge Cases Handled
1. **No Token Available**: Services work without token (public Gradio Spaces)
2. **Token Changes**: Client is recreated when token changes
3. **OAuth vs Env**: OAuth token takes priority over environment variables
4. **Multiple Token Sources**: Consistent priority across all services
5. **MCP Tools**: Support both explicit token and fallback to settings
## Recommendations
✅ **All implementations are correct and consistent**
The token authentication is properly implemented throughout:
- Gradio Client services accept and use tokens
- Service layer passes tokens through correctly
- Application layer extracts and passes OAuth tokens
- HuggingFace API calls use tokens via correct parameters
- MCP tools support token authentication
- Token priority is consistent across all layers
No changes needed - implementation follows best practices.