ming committed on
Commit 817d281 · 1 Parent(s): 6e01ea3

docs: Update README.md with streaming endpoint documentation

- Add streaming endpoint documentation with SSE format examples
- Include comprehensive usage examples for Python, cURL, and Android
- Add streaming benefits section explaining real-time feedback advantages
- Update performance section to mention streaming capabilities
- Provide Android Kotlin code example with OkHttp EventSource
- Document both standard and streaming API endpoints

Files changed (1)
  1. README.md +101 -3
README.md CHANGED
@@ -18,11 +18,22 @@ A FastAPI-based text summarization service powered by Ollama and Llama 3.2 1B mo
18
  ## 🚀 Features
19
 
20
  - **Fast text summarization** using local LLM inference
21
  - **RESTful API** with FastAPI
22
  - **Health monitoring** and logging
23
  - **Docker containerized** for easy deployment
24
  - **Free deployment** on Hugging Face Spaces
25
 
26
  ## 📡 API Endpoints
27
 
28
  ### Health Check
@@ -30,7 +41,7 @@ A FastAPI-based text summarization service powered by Ollama and Llama 3.2 1B mo
30
  GET /health
31
  ```
32
 
33
- ### Summarize Text
34
  ```
35
  POST /api/v1/summarize
36
  Content-Type: application/json
@@ -42,6 +53,29 @@ Content-Type: application/json
42
  }
43
  ```
44
 
45
  ### API Documentation
46
  - **Swagger UI**: `/docs`
47
  - **ReDoc**: `/redoc`
@@ -78,6 +112,7 @@ This app is configured for deployment on Hugging Face Spaces using Docker SDK.
78
  - **Startup time**: ~1-2 minutes (includes model download)
79
  - **Inference speed**: ~1-3 seconds per request
80
  - **Memory usage**: ~2GB RAM
81
 
82
  ## 🛠️ Development
83
 
@@ -101,7 +136,7 @@ pytest --cov=app
101
 
102
  ## 📝 Usage Examples
103
 
104
- ### Python
105
  ```python
106
  import requests
107
 
@@ -118,7 +153,30 @@ result = response.json()
118
  print(result["summary"])
119
  ```
120
 
121
- ### cURL
122
  ```bash
123
  curl -X POST "https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summarize" \
124
  -H "Content-Type: application/json" \
@@ -128,6 +186,46 @@ curl -X POST "https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summar
128
  }'
129
  ```
130
 
131
  ## 🔒 Security
132
 
133
  - Non-root user execution
 
18
  ## 🚀 Features
19
 
20
  - **Fast text summarization** using local LLM inference
21
+ - **Real-time streaming** with Server-Sent Events (SSE) for Android compatibility
22
  - **RESTful API** with FastAPI
23
  - **Health monitoring** and logging
24
  - **Docker containerized** for easy deployment
25
  - **Free deployment** on Hugging Face Spaces
26
 
27
+ ## 🌊 Streaming Benefits
28
+
29
+ The streaming endpoint (`/api/v1/summarize/stream`) provides several advantages:
30
+
31
+ - **Real-time feedback**: Users see text being generated as it happens
32
+ - **Better UX**: No waiting for complete response before seeing results
33
+ - **Android-friendly**: Uses Server-Sent Events (SSE) for easy mobile integration
34
+ - **Progressive loading**: Content appears incrementally, improving perceived performance
35
+ - **Error resilience**: Errors are sent as SSE events, so the connection stays open
36
+
37
  ## 📡 API Endpoints
38
 
39
  ### Health Check
 
41
  GET /health
42
  ```
43
 
44
+ ### Summarize Text (Standard)
45
  ```
46
  POST /api/v1/summarize
47
  Content-Type: application/json
 
53
  }
54
  ```
55
 
56
+ ### Summarize Text (Streaming)
57
+ ```
58
+ POST /api/v1/summarize/stream
59
+ Content-Type: application/json
60
+
61
+ {
62
+   "text": "Your long text to summarize here...",
63
+   "max_tokens": 256,
64
+   "prompt": "Summarize the following text concisely:"
65
+ }
66
+ ```
67
+
68
+ **Response Format**: Server-Sent Events (SSE)
69
+ ```
70
+ data: {"content": "This", "done": false, "tokens_used": 1}
71
+
72
+ data: {"content": " is", "done": false, "tokens_used": 2}
73
+
74
+ data: {"content": " a", "done": false, "tokens_used": 3}
75
+
76
+ data: {"content": " summary.", "done": true, "tokens_used": 4}
77
+ ```
78
+
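As a minimal sketch of how a client might consume the response format above, the following Python reads `data:` lines, decodes each JSON payload, and concatenates the `content` fields until a chunk reports `done`. The helper names `iter_sse_chunks` and `collect_summary` are hypothetical and not part of this service; the sample lines are copied from the example response, and the sketch assumes every event is a single `data:` line carrying one JSON object.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_summary(lines: Iterable[str]) -> Tuple[str, int]:
    """Concatenate `content` fields until a chunk sets `done` (hypothetical helper)."""
    parts = []
    tokens_used = 0
    for chunk in iter_sse_chunks(lines):
        parts.append(chunk.get("content", ""))
        tokens_used = chunk.get("tokens_used", tokens_used)
        if chunk.get("done"):
            break
    return "".join(parts), tokens_used


# Sample stream taken from the response format example above.
sample = [
    'data: {"content": "This", "done": false, "tokens_used": 1}',
    "",
    'data: {"content": " is", "done": false, "tokens_used": 2}',
    "",
    'data: {"content": " a", "done": false, "tokens_used": 3}',
    "",
    'data: {"content": " summary.", "done": true, "tokens_used": 4}',
]

summary, tokens = collect_summary(sample)
print(summary)
```

With a live connection, the same functions could be fed decoded lines from an HTTP client instead of the `sample` list.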
79
  ### API Documentation
80
  - **Swagger UI**: `/docs`
81
  - **ReDoc**: `/redoc`
 
112
  - **Startup time**: ~1-2 minutes (includes model download)
113
  - **Inference speed**: ~1-3 seconds per request
114
  - **Memory usage**: ~2GB RAM
115
+ - **Streaming**: Real-time text generation with SSE for responsive user experience
116
 
117
  ## 🛠️ Development
118
 
 
136
 
137
  ## 📝 Usage Examples
138
 
139
+ ### Python (Standard)
140
  ```python
141
  import requests
142
 
 
153
  print(result["summary"])
154
  ```
155
 
156
+ ### Python (Streaming)
157
+ ```python
158
+ import requests
159
+ import json
160
+
161
+ # Stream summarization
162
+ response = requests.post(
163
+     "https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summarize/stream",
164
+     json={
165
+         "text": "Your long article or text here...",
166
+         "max_tokens": 256
167
+     },
168
+     stream=True
169
+ )
170
+
171
+ for line in response.iter_lines():
172
+     if line.startswith(b'data: '):
173
+         data = json.loads(line[6:])  # Remove 'data: ' prefix
174
+         print(data["content"], end='', flush=True)
175
+         if data["done"]:
176
+             break
177
+ ```
178
+
179
+ ### cURL (Standard)
180
  ```bash
181
  curl -X POST "https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summarize" \
182
  -H "Content-Type: application/json" \
 
186
  }'
187
  ```
188
 
189
+ ### cURL (Streaming)
190
+ ```bash
191
+ curl -N -X POST "https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summarize/stream" \
192
+ -H "Content-Type: application/json" \
193
+ -d '{
194
+ "text": "Your text to summarize...",
195
+ "max_tokens": 256
196
+ }'
197
+ ```
198
+
199
+ ### Android (Kotlin)
200
+ ```kotlin
201
+ // Add to build.gradle
202
+ implementation("com.squareup.okhttp3:okhttp-sse:4.12.0")
203
+
204
+ // Usage
205
+ val client = OkHttpClient()
206
+ val request = Request.Builder()
207
+     .url("https://huggingface.co/spaces/colin730/SummarizerApp/api/v1/summarize/stream")
208
+     .post(/* JSON body */)
209
+     .build()
210
+
211
+ val eventSource = EventSources.createFactory(client)
212
+     .newEventSource(request, object : EventSourceListener() {
213
+         override fun onEvent(eventSource: EventSource, id: String?, type: String?, data: String) {
214
+             val chunk = JSONObject(data)
215
+             val content = chunk.getString("content")
216
+             val done = chunk.getBoolean("done")
217
+
218
+             // Update UI with streaming content
219
+             runOnUiThread {
220
+                 textView.append(content)
221
+                 if (done) {
222
+                     // Streaming complete
223
+                 }
224
+             }
225
+         }
226
+     })
227
+ ```
228
+
229
  ## 🔒 Security
230
 
231
  - Non-root user execution