renll committed
Commit a5d1907 · verified · 1 Parent(s): 2840921

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -81,7 +81,7 @@ The two figures below compare the latency and throughput performance of the Phi-
  <img src="lat.png" width="300"/>
  <img src="thr_lat.png" width="298"/>
  </div>
- Figure 1. The first plot shows average inference latency as a function of generation length, while the second plot illustrates how inference latency varies with throughput. Both experiments were conducted using the vLLM inference framework on a single A100-80GB GPU over various number of concurrent requests.
+ Figure 1. The first plot shows average inference latency as a function of generation length, while the second plot illustrates how inference latency varies with throughput. Both experiments were conducted using the vLLM inference framework on a single A100-80GB GPU over varying concurrency levels of user requests.
 
  ## Usage
 