Update app.py
app.py
CHANGED
@@ -82,14 +82,14 @@ with gr.Blocks() as demo:
 # Phi-2 Chatbot Demo
 This chatbot was created using Microsoft's 2.7 billion parameter [phi-2](https://huggingface.co/microsoft/phi-2) Transformer model.
 
-In order to reduce the response time on this hardware, `max_new_tokens` has been set to `
+In order to reduce the response time on this hardware, `max_new_tokens` has been set to `128` in the text generation pipeline. With this default configuration, it takes approximately `60 seconds` for the response to start being generated, and streamed one word at a time. Use the slider below to increase or decrease the length of the generated text.
 """
 )
 
 tokens_slider = gr.Slider(
     8,
     128,
-    value=
+    value=128,
     label="Maximum new tokens",
     info="A larger `max_new_tokens` parameter value gives you longer text responses but at the cost of a slower response time.",
 )
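The slider in this hunk ranges from 8 to 128 and now defaults to 128. As a rough sketch of how such a value might be passed on to text generation, the helper below clamps it to the slider's range and returns it as a `max_new_tokens` keyword argument. The function name `generation_kwargs` and the `SLIDER_*` constants are hypothetical and do not come from `app.py`; only the bounds, the default, and the `max_new_tokens` parameter name are taken from the diff.

```python
# Hypothetical helper, not part of app.py.
# The bounds and default mirror the gr.Slider(8, 128, value=128, ...) call in the diff.
SLIDER_MIN = 8
SLIDER_MAX = 128
SLIDER_DEFAULT = 128


def generation_kwargs(max_new_tokens=SLIDER_DEFAULT):
    """Clamp the requested value to the slider range and wrap it as generation kwargs."""
    clamped = max(SLIDER_MIN, min(SLIDER_MAX, int(max_new_tokens)))
    return {"max_new_tokens": clamped}


if __name__ == "__main__":
    print(generation_kwargs())     # default slider position
    print(generation_kwargs(500))  # out-of-range request is clamped to the maximum
```

Clamping on the server side is a defensive choice: the slider already restricts values in the UI, but a direct API call to the app could supply anything.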