Mirrowel committed
Commit · ae7ffce
1 Parent(s): e2f4e9e
docs(timeout): add comprehensive HTTP timeout configuration documentation

Adds a detailed documentation section explaining the `TimeoutConfig` class and its usage across LLM providers:
- Explains timeout types (connect, read, write, pool) and their purposes
- Documents default values with rationale for streaming vs non-streaming requests
- Provides environment variable override reference
- Details behavioral differences between streaming (3 min read timeout) and non-streaming (10 min read timeout) configurations
- Maps which providers use which timeout configurations
- Includes tuning recommendations for different use cases
- Provides example configurations for complex reasoning tasks and unstable networks

DOCUMENTATION.md CHANGED (+100 -0)
@@ -858,6 +858,106 @@ class AntigravityAuthBase(GoogleOAuthBase):

---

### 2.14. HTTP Timeout Configuration (`timeout_config.py`)

Centralized timeout configuration for all HTTP requests to LLM providers.

#### Purpose

The `TimeoutConfig` class provides fine-grained control over HTTP timeouts for streaming and non-streaming LLM requests. This addresses the common issue of proxy hangs when upstream providers stall during connection establishment or response generation.

#### Timeout Types Explained

| Timeout | Description |
|---------|-------------|
| **connect** | Maximum time to establish a TCP/TLS connection to the upstream server |
| **read** | Maximum time to wait between receiving data chunks (resets on each chunk for streaming) |
| **write** | Maximum time to wait while sending the request body |
| **pool** | Maximum time to wait for a connection from the connection pool |

#### Default Values

| Setting | Streaming | Non-Streaming | Rationale |
|---------|-----------|---------------|-----------|
| **connect** | 30s | 30s | Fast fail if server is unreachable |
| **read** | 180s (3 min) | 600s (10 min) | Streaming expects periodic chunks; non-streaming may wait for full generation |
| **write** | 30s | 30s | Request bodies are typically small |
| **pool** | 60s | 60s | Reasonable wait for connection pool |
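
The defaults in this table can be modeled roughly as follows. This is a minimal sketch, not the contents of `timeout_config.py`: the `TimeoutSketch` name and its dataclass layout are assumptions made for illustration, and only the `streaming()` / `non_streaming()` method names and the numeric values are taken from the documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeoutSketch:
    """Hypothetical stand-in for the values TimeoutConfig yields (seconds)."""

    connect: float
    read: float
    write: float
    pool: float

    @classmethod
    def streaming(cls) -> "TimeoutSketch":
        # The read timeout applies to the gap between chunks, so it can be short.
        return cls(connect=30.0, read=180.0, write=30.0, pool=60.0)

    @classmethod
    def non_streaming(cls) -> "TimeoutSketch":
        # The whole response arrives at once, so the read timeout is longer.
        return cls(connect=30.0, read=600.0, write=30.0, pool=60.0)


print(TimeoutSketch.streaming())
print(TimeoutSketch.non_streaming())
```

The four field names match the keyword arguments accepted by `httpx.Timeout`; whether the providers actually build an `httpx.Timeout` from these values is an assumption, since the source does not name the HTTP client.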

#### Environment Variable Overrides

All timeout values can be customized via environment variables:

```env
# Connection establishment timeout (seconds)
TIMEOUT_CONNECT=30

# Request body send timeout (seconds)
TIMEOUT_WRITE=30

# Connection pool acquisition timeout (seconds)
TIMEOUT_POOL=60

# Read timeout between chunks for streaming requests (seconds)
# If no data arrives for this duration, the connection is considered stalled
TIMEOUT_READ_STREAMING=180

# Read timeout for non-streaming responses (seconds)
# Longer to accommodate models that take time to generate full responses
TIMEOUT_READ_NON_STREAMING=600
```
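
One way such overrides could be resolved is sketched below. The variable names and defaults come from the listing above; the helper name `read_timeout_env` and the choice to fall back to the default on empty, malformed, or non-positive values are assumptions, since the source does not describe how the real module handles invalid input.

```python
import os
from typing import Mapping, Optional

# Defaults taken from the documentation (seconds).
DEFAULTS = {
    "TIMEOUT_CONNECT": 30.0,
    "TIMEOUT_WRITE": 30.0,
    "TIMEOUT_POOL": 60.0,
    "TIMEOUT_READ_STREAMING": 180.0,
    "TIMEOUT_READ_NON_STREAMING": 600.0,
}


def read_timeout_env(name: str, environ: Optional[Mapping[str, str]] = None) -> float:
    """Return the override for `name`, or its documented default."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or not raw.strip():
        return DEFAULTS[name]
    try:
        value = float(raw)
    except ValueError:
        # Assumed behavior: ignore malformed values instead of failing at startup.
        return DEFAULTS[name]
    return value if value > 0 else DEFAULTS[name]


print(read_timeout_env("TIMEOUT_READ_STREAMING", {"TIMEOUT_READ_STREAMING": "300"}))
print(read_timeout_env("TIMEOUT_CONNECT", {}))
```

Passing the environment as an argument keeps the helper easy to test without mutating `os.environ`.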

#### Streaming vs Non-Streaming Behavior

**Streaming Requests** (`TimeoutConfig.streaming()`):
- Uses shorter read timeout (default 3 minutes)
- Timer resets every time a chunk arrives
- If no data for 3 minutes → connection considered dead → failover to next credential
- Appropriate for chat completions where tokens should arrive periodically

**Non-Streaming Requests** (`TimeoutConfig.non_streaming()`):
- Uses longer read timeout (default 10 minutes)
- Server may take significant time to generate the complete response before sending anything
- Complex reasoning tasks or large outputs may legitimately take several minutes
- Only used by Antigravity provider's `_handle_non_streaming()` method
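
The difference between the two modes comes down to what the read timer measures. The sketch below simulates it with plain numbers: each entry in a list is the silence, in seconds, before the next piece of data arrives. The function name `first_stalled_gap` and the sample gaps are illustrative assumptions, not part of the project.

```python
from typing import Optional, Sequence


def first_stalled_gap(gaps: Sequence[float], read_timeout: float) -> Optional[int]:
    """Return the index of the first gap longer than read_timeout, else None.

    The timer restarts after every chunk, so only individual gaps matter,
    never the total duration of the response.
    """
    for index, gap in enumerate(gaps):
        if gap > read_timeout:
            return index
    return None


steady_stream = [100.0, 120.0, 150.0, 90.0]  # 460s in total, every gap under 180s
stalled_stream = [100.0, 200.0, 10.0]        # second gap exceeds 180s
single_wait = [460.0]                        # non-streaming: one long wait

print(first_stalled_gap(steady_stream, 180.0))   # None
print(first_stalled_gap(stalled_stream, 180.0))  # 1
print(first_stalled_gap(single_wait, 180.0))     # 0
print(first_stalled_gap(single_wait, 600.0))     # None
```

A stream lasting 460 seconds in total passes the 180-second read timeout because no single gap exceeds it, while the same 460 seconds spent waiting for a single non-streaming response only fits under the 600-second default.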

#### Provider Usage

The following providers use `TimeoutConfig`:

| Provider | Method | Timeout Type |
|----------|--------|--------------|
| `antigravity_provider.py` | `_handle_non_streaming()` | `non_streaming()` |
| `antigravity_provider.py` | `_handle_streaming()` | `streaming()` |
| `gemini_cli_provider.py` | `acompletion()` | `streaming()` |
| `iflow_provider.py` | `acompletion()` | `streaming()` |
| `qwen_code_provider.py` | `acompletion()` | `streaming()` |

**Note:** iFlow, Qwen Code, and Gemini CLI providers always use streaming internally (even for non-streaming requests), aggregating chunks into a complete response. Only Antigravity has a true non-streaming path.
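
The aggregation described in the note can be pictured with a small sketch. The chunk shape (plain text fragments) and the names `fake_stream` and `aggregate` are hypothetical; the real providers' chunk formats and `acompletion()` internals are not shown in this section.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream() -> AsyncIterator[str]:
    """Hypothetical upstream stream that yields text fragments."""
    for piece in ("Hel", "lo", ", world"):
        await asyncio.sleep(0)  # stand-in for waiting on the network
        yield piece


async def aggregate(stream: AsyncIterator[str]) -> str:
    """Consume a stream to the end and return the joined response."""
    parts: List[str] = []
    async for piece in stream:
        parts.append(piece)
    return "".join(parts)


result = asyncio.run(aggregate(fake_stream()))
print(result)
```

Because these providers are listed with `streaming()` in the table, the streaming read timeout would presumably apply to them even when the client asked for a non-streaming response, so `TIMEOUT_READ_STREAMING` is likely the setting to tune for them.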

#### Tuning Recommendations

| Use Case | Recommendation |
|----------|----------------|
| **Long thinking tasks** | Increase `TIMEOUT_READ_STREAMING` to 300-360s |
| **Unstable network** | Increase `TIMEOUT_CONNECT` to 60s |
| **High concurrency** | Increase `TIMEOUT_POOL` if seeing pool exhaustion |
| **Large context/output** | Increase `TIMEOUT_READ_NON_STREAMING` to 900s+ |

#### Example Configuration

```env
# For environments with complex reasoning tasks
TIMEOUT_READ_STREAMING=300
TIMEOUT_READ_NON_STREAMING=900

# For unstable network conditions
TIMEOUT_CONNECT=60
TIMEOUT_POOL=120
```

---

---