# Token Authentication Review - Gradio & HuggingFace
## Summary
This document reviews the implementation of token authentication for Gradio Client API calls and HuggingFace API usage to ensure tokens are always passed correctly.
## ✅ Implementation Status
### 1. Gradio Client Services
#### STT Service (`src/services/stt_gradio.py`)
- ✅ **Token Support**: Service accepts `hf_token` parameter in `__init__` and methods
- ✅ **Client Initialization**: `Client` is created with `hf_token` parameter when token is available
- ✅ **Token Priority**: Method-level token > instance-level token
- ✅ **Token Updates**: Client is recreated if token changes
**Implementation Pattern:**
```python
async def _get_client(self, hf_token: str | None = None) -> Client:
    # Method-level token takes priority over the instance-level token
    token = hf_token or self.hf_token
    if token:
        self.client = Client(self.api_url, hf_token=token)
    else:
        self.client = Client(self.api_url)
    return self.client
```
#### Image OCR Service (`src/services/image_ocr.py`)
- ✅ **Token Support**: Service accepts `hf_token` parameter in `__init__` and methods
- ✅ **Client Initialization**: `Client` is created with `hf_token` parameter when token is available
- ✅ **Token Priority**: Method-level token > instance-level token
- ✅ **Token Updates**: Client is recreated if token changes
**Same pattern as STT Service**
### 2. Service Layer Integration
#### Audio Service (`src/services/audio_processing.py`)
- ✅ **Token Passthrough**: `process_audio_input()` accepts `hf_token` and passes it to the STT service
- ✅ **Token Flow**: `audio_service.process_audio_input(audio, hf_token=token)`
#### Multimodal Service (`src/services/multimodal_processing.py`)
- ✅ **Token Passthrough**: `process_multimodal_input()` accepts `hf_token` and passes it to both the audio and OCR services
- ✅ **Token Flow**: `multimodal_service.process_multimodal_input(..., hf_token=token)`
### 3. Application Layer (`src/app.py`)
#### Token Extraction
- ✅ **OAuth Token**: Extracted from `gr.OAuthToken` via `oauth_token.token`
- ✅ **Fallback**: Uses `HF_TOKEN` or `HUGGINGFACE_API_KEY` from the environment
- ✅ **Token Priority**: `oauth_token > HF_TOKEN > HUGGINGFACE_API_KEY`
**Implementation:**
```python
token_value: str | None = None
if oauth_token is not None:
    token_value = oauth_token.token if hasattr(oauth_token, "token") else None
# Fallback to env vars
effective_token = token_value or os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY")
```
#### Token Usage in Services
- ✅ **Multimodal Processing**: Token passed to `process_multimodal_input(..., hf_token=token_value)`
- ✅ **Consistent Usage**: Token is extracted once and passed through all service layers
### 4. HuggingFace API Integration
#### LLM Factory (`src/utils/llm_factory.py`)
- ✅ **Token Priority**: `oauth_token > settings.hf_token > settings.huggingface_api_key`
- ✅ **Provider Usage**: `HuggingFaceProvider(api_key=effective_hf_token)`
- ✅ **Model Usage**: `HuggingFaceModel(model_name, provider=provider)`
#### Judge Handler (`src/agent_factory/judges.py`)
- ✅ **Token Priority**: `oauth_token > settings.hf_token > settings.huggingface_api_key`
- ✅ **InferenceClient**: `InferenceClient(api_key=api_key)` when a token is provided
- ✅ **Fallback**: Uses `HF_TOKEN` from the environment if no token is provided
**Implementation:**
```python
effective_hf_token = oauth_token or settings.hf_token or settings.huggingface_api_key
hf_provider = HuggingFaceProvider(api_key=effective_hf_token)
```
### 5. MCP Tools (`src/mcp_tools.py`)
#### Image OCR Tool
- ✅ **Token Support**: `extract_text_from_image()` accepts an `hf_token` parameter
- ✅ **Token Fallback**: Uses `settings.hf_token` or `settings.huggingface_api_key` if not provided
- ✅ **Service Integration**: Passes the token to `ImageOCRService.extract_text()`
#### Audio Transcription Tool
- ✅ **Token Support**: `transcribe_audio_file()` accepts an `hf_token` parameter
- ✅ **Token Fallback**: Uses `settings.hf_token` or `settings.huggingface_api_key` if not provided
- ✅ **Service Integration**: Passes the token to `STTService.transcribe_file()`
## Token Flow Diagram
```
User Login (OAuth)
        ↓
oauth_token.token
        ↓
app.py: token_value
        ↓
┌─────────────────────────────────────┐
│            Service Layer            │
├─────────────────────────────────────┤
│ MultimodalService                   │
│   ↓ hf_token=token_value            │
│ AudioService                        │
│   ↓ hf_token=token_value            │
│ STTService / ImageOCRService        │
│   ↓ hf_token=token_value            │
│ Gradio Client(hf_token=token)       │
└─────────────────────────────────────┘

Alternative: Environment Variables
        ↓
HF_TOKEN or HUGGINGFACE_API_KEY
        ↓
settings.hf_token or settings.huggingface_api_key
        ↓
Same service flow as above
```
## Verification Checklist
- [x] STT Service accepts and uses `hf_token` parameter
- [x] Image OCR Service accepts and uses `hf_token` parameter
- [x] Audio Service passes token to STT service
- [x] Multimodal Service passes token to both audio and OCR services
- [x] App.py extracts OAuth token correctly
- [x] App.py passes token to multimodal service
- [x] HuggingFace API calls use token via `HuggingFaceProvider`
- [x] HuggingFace API calls use token via `InferenceClient`
- [x] MCP tools accept and use token parameter
- [x] Token priority is consistent: OAuth > Env Vars
- [x] Fallback to environment variables when OAuth not available
## Token Parameter Naming
All services consistently use `hf_token` parameter name:
- `STTService.transcribe_audio(..., hf_token=...)`
- `STTService.transcribe_file(..., hf_token=...)`
- `ImageOCRService.extract_text(..., hf_token=...)`
- `ImageOCRService.extract_text_from_image(..., hf_token=...)`
- `AudioService.process_audio_input(..., hf_token=...)`
- `MultimodalService.process_multimodal_input(..., hf_token=...)`
- `extract_text_from_image(..., hf_token=...)` (MCP tool)
- `transcribe_audio_file(..., hf_token=...)` (MCP tool)
## Gradio Client API Usage
According to the Gradio documentation, the `gradio_client.Client` constructor takes a source (a Space name or URL) and an optional `hf_token`:
```python
Client(src, hf_token=None)
```
Our implementation correctly uses:
```python
Client(self.api_url, hf_token=token) # When token available
Client(self.api_url) # When no token (public Space)
```
## HuggingFace API Usage
### HuggingFaceProvider
```python
HuggingFaceProvider(api_key=effective_hf_token)
```
✅ Correctly passes the token as the `api_key` parameter
### InferenceClient
```python
InferenceClient(api_key=api_key) # When token provided
InferenceClient() # Falls back to HF_TOKEN env var
```
✅ Correctly passes the token as the `api_key` parameter
## Edge Cases Handled
1. **No Token Available**: Services work without token (public Gradio Spaces)
2. **Token Changes**: Client is recreated when token changes
3. **OAuth vs Env**: OAuth token takes priority over environment variables
4. **Multiple Token Sources**: Consistent priority across all services
5. **MCP Tools**: Support both explicit token and fallback to settings
## Recommendations
✅ **All implementations are correct and consistent**
The token authentication is properly implemented throughout:
- Gradio Client services accept and use tokens
- Service layer passes tokens through correctly
- Application layer extracts and passes OAuth tokens
- HuggingFace API calls use tokens via correct parameters
- MCP tools support token authentication
- Token priority is consistent across all layers
No changes needed: the implementation follows best practices.