INFO 09-18 17:24:35 [__init__.py:241] Automatically detected platform cuda.
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:24:38 [api_server.py:1805] vLLM API server version 0.10.1.1
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:24:38 [utils.py:326] non-default args: {'model_tag': '/data/wyt/codes/DocDPO/sft/checkpoints_multilang/ted_base_balanced_en_zhdefr_320/dpo/merged/checkpoint-1000', 'host': '0.0.0.0', 'port': 8013, 'model': '/data/wyt/codes/DocDPO/sft/checkpoints_multilang/ted_base_balanced_en_zhdefr_320/dpo/merged/checkpoint-1000', 'served_model_name': ['qwen'], 'enable_prefix_caching': True}
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:24:45 [__init__.py:711] Resolved architecture: Qwen2ForCausalLM
[1;36m(APIServer pid=3603498)[0;0m `torch_dtype` is deprecated! Use `dtype` instead!
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:24:45 [__init__.py:1750] Using max model len 32768
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:24:45 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=8192.
INFO 09-18 17:24:49 [__init__.py:241] Automatically detected platform cuda.
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:51 [core.py:636] Waiting for init message from front-end.
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:51 [core.py:74] Initializing a V1 LLM engine (v0.10.1.1) with config: model='/data/wyt/codes/DocDPO/sft/checkpoints_multilang/ted_base_balanced_en_zhdefr_320/dpo/merged/checkpoint-1000', speculative_config=None, tokenizer='/data/wyt/codes/DocDPO/sft/checkpoints_multilang/ted_base_balanced_en_zhdefr_320/dpo/merged/checkpoint-1000', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config={}, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=qwen, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=True, pooler_config=None, compilation_config={"level":3,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":[],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output","vllm.mamba_mixer2"],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"cudagraph_mode":1,"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[512,504,496,488,480,472,464,456,448,440,432,424,416,408,400,392,384,376,368,360,352,344,336,328,320,312,304,296,288,280,272,264,256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"pass_config":{},"max_capture_size":512,"local_cache_dir":null}
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:53 [parallel_state.py:1134] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
[1;36m(EngineCore_0 pid=3604733)[0;0m WARNING 09-18 17:24:53 [topk_topp_sampler.py:61] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer.
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:53 [gpu_model_runner.py:1953] Starting to load model /data/wyt/codes/DocDPO/sft/checkpoints_multilang/ted_base_balanced_en_zhdefr_320/dpo/merged/checkpoint-1000...
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:53 [gpu_model_runner.py:1985] Loading model from scratch...
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:53 [cuda.py:328] Using Flash Attention backend on V1 engine.
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards:  25% Completed | 1/4 [00:01<00:03,  1.14s/it]
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards:  50% Completed | 2/4 [00:02<00:02,  1.31s/it]
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards:  75% Completed | 3/4 [00:03<00:00,  1.09it/s]
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:04<00:00,  1.07s/it]
[1;36m(EngineCore_0 pid=3604733)[0;0m Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:04<00:00,  1.08s/it]
[1;36m(EngineCore_0 pid=3604733)[0;0m 
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:58 [default_loader.py:262] Loading weights took 4.50 seconds
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:24:58 [gpu_model_runner.py:2007] Model loading took 14.2488 GiB and 4.703133 seconds
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:05 [backends.py:548] Using cache directory: /data/wyt/.cache/vllm/torch_compile_cache/7cc7c33ae3/rank_0_0/backbone for vLLM's torch.compile
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:05 [backends.py:559] Dynamo bytecode transform time: 6.41 s
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:10 [backends.py:161] Directly load the compiled graph(s) for dynamic shape from the cache, took 4.806 s
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:13 [monitor.py:34] torch.compile takes 6.41 s in total
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:14 [gpu_worker.py:276] Available KV cache memory: 51.38 GiB
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:15 [kv_cache_utils.py:849] GPU KV cache size: 962,112 tokens
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:15 [kv_cache_utils.py:853] Maximum concurrency for 32,768 tokens per request: 29.36x
[1;36m(EngineCore_0 pid=3604733)[0;0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   0%|          | 0/67 [00:00<?, ?it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   3%|▎         | 2/67 [00:00<00:03, 18.70it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   6%|▌         | 4/67 [00:00<00:03, 18.97it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   9%|▉         | 6/67 [00:00<00:03, 19.11it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  12%|█▏        | 8/67 [00:00<00:03, 19.16it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  15%|█▍        | 10/67 [00:00<00:02, 19.45it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  18%|█▊        | 12/67 [00:00<00:02, 19.37it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  21%|██        | 14/67 [00:00<00:02, 19.42it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  24%|██▍       | 16/67 [00:00<00:02, 19.50it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  28%|██▊       | 19/67 [00:00<00:02, 20.36it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  33%|███▎      | 22/67 [00:01<00:02, 20.83it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  37%|███▋      | 25/67 [00:01<00:01, 21.17it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  42%|████▏     | 28/67 [00:01<00:01, 21.32it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  46%|████▋     | 31/67 [00:01<00:01, 20.99it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  51%|█████     | 34/67 [00:01<00:01, 21.92it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  55%|█████▌    | 37/67 [00:01<00:01, 22.93it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  60%|█████▉    | 40/67 [00:01<00:01, 23.67it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  64%|██████▍   | 43/67 [00:02<00:01, 23.42it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  69%|██████▊   | 46/67 [00:02<00:00, 23.95it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  73%|███████▎  | 49/67 [00:02<00:00, 24.62it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  78%|███████▊  | 52/67 [00:02<00:00, 24.57it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  82%|████████▏ | 55/67 [00:02<00:00, 25.20it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  87%|████████▋ | 58/67 [00:02<00:00, 24.84it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  91%|█████████ | 61/67 [00:02<00:00, 25.43it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  96%|█████████▌| 64/67 [00:02<00:00, 25.73it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 67/67 [00:02<00:00, 24.91it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 67/67 [00:02<00:00, 22.61it/s]
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:18 [gpu_model_runner.py:2708] Graph capturing finished in 3 secs, took 1.56 GiB
[1;36m(EngineCore_0 pid=3604733)[0;0m INFO 09-18 17:25:18 [core.py:214] init engine (profile, create kv cache, warmup model) took 19.58 seconds
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [loggers.py:142] Engine 000: vllm cache_config_info with initialization after num_gpu_blocks is: 60132
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [api_server.py:1611] Supported_tasks: ['generate']
[1;36m(APIServer pid=3603498)[0;0m WARNING 09-18 17:25:19 [__init__.py:1625] Default sampling parameters have been overridden by the model's Hugging Face generation config recommended from the model creator. If this is not intended, please relaunch vLLM instance with `--generation-config vllm`.
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [serving_responses.py:120] Using default chat sampling params from model: {'repetition_penalty': 1.05, 'temperature': 0.7, 'top_k': 20, 'top_p': 0.8}
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [serving_chat.py:134] Using default chat sampling params from model: {'repetition_penalty': 1.05, 'temperature': 0.7, 'top_k': 20, 'top_p': 0.8}
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [serving_completion.py:77] Using default completion sampling params from model: {'repetition_penalty': 1.05, 'temperature': 0.7, 'top_k': 20, 'top_p': 0.8}
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [api_server.py:1880] Starting vLLM API server 0 on http://0.0.0.0:8013
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:36] Available routes are:
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /openapi.json, Methods: HEAD, GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /docs, Methods: HEAD, GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /docs/oauth2-redirect, Methods: HEAD, GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /redoc, Methods: HEAD, GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /health, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /load, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /ping, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /ping, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /tokenize, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /detokenize, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/models, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /version, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/responses, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/responses/{response_id}, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/responses/{response_id}/cancel, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/chat/completions, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/completions, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/embeddings, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /pooling, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /classify, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /score, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/score, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/audio/transcriptions, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/audio/translations, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /rerank, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v1/rerank, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /v2/rerank, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /scale_elastic_ep, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /is_scaling_elastic_ep, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /invocations, Methods: POST
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:19 [launcher.py:44] Route: /metrics, Methods: GET
[1;36m(APIServer pid=3603498)[0;0m INFO:     Started server process [3603498]
[1;36m(APIServer pid=3603498)[0;0m INFO:     Waiting for application startup.
[1;36m(APIServer pid=3603498)[0;0m INFO:     Application startup complete.
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:21 [chat_utils.py:470] Detected the chat template content format to be 'string'. You can set `--chat-template-content-format` to override this.
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35458 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:29 [loggers.py:123] Engine 000: Avg prompt throughput: 47.1 tokens/s, Avg generation throughput: 22.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 3.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:39 [loggers.py:123] Engine 000: Avg prompt throughput: 611.5 tokens/s, Avg generation throughput: 75.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 53.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44758 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:49 [loggers.py:123] Engine 000: Avg prompt throughput: 327.1 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 50.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44758 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:25:59 [loggers.py:123] Engine 000: Avg prompt throughput: 220.7 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 49.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44758 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:09 [loggers.py:123] Engine 000: Avg prompt throughput: 845.3 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 54.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:19 [loggers.py:123] Engine 000: Avg prompt throughput: 399.0 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 53.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:29 [loggers.py:123] Engine 000: Avg prompt throughput: 355.2 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:39 [loggers.py:123] Engine 000: Avg prompt throughput: 657.3 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:49 [loggers.py:123] Engine 000: Avg prompt throughput: 328.4 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57264 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:26:59 [loggers.py:123] Engine 000: Avg prompt throughput: 481.0 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:09 [loggers.py:123] Engine 000: Avg prompt throughput: 605.0 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37016 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:19 [loggers.py:123] Engine 000: Avg prompt throughput: 150.0 tokens/s, Avg generation throughput: 84.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37016 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:29 [loggers.py:123] Engine 000: Avg prompt throughput: 703.0 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53338 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:39 [loggers.py:123] Engine 000: Avg prompt throughput: 557.9 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34724 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:49 [loggers.py:123] Engine 000: Avg prompt throughput: 165.9 tokens/s, Avg generation throughput: 84.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37138 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:27:59 [loggers.py:123] Engine 000: Avg prompt throughput: 266.4 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40532 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:09 [loggers.py:123] Engine 000: Avg prompt throughput: 584.2 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:19 [loggers.py:123] Engine 000: Avg prompt throughput: 982.4 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:29 [loggers.py:123] Engine 000: Avg prompt throughput: 177.7 tokens/s, Avg generation throughput: 84.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:39 [loggers.py:123] Engine 000: Avg prompt throughput: 658.8 tokens/s, Avg generation throughput: 80.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:49 [loggers.py:123] Engine 000: Avg prompt throughput: 781.8 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60724 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:28:59 [loggers.py:123] Engine 000: Avg prompt throughput: 250.8 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60724 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:09 [loggers.py:123] Engine 000: Avg prompt throughput: 377.4 tokens/s, Avg generation throughput: 80.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:19 [loggers.py:123] Engine 000: Avg prompt throughput: 806.8 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:29 [loggers.py:123] Engine 000: Avg prompt throughput: 291.6 tokens/s, Avg generation throughput: 84.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:39 [loggers.py:123] Engine 000: Avg prompt throughput: 162.8 tokens/s, Avg generation throughput: 84.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47658 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:49 [loggers.py:123] Engine 000: Avg prompt throughput: 359.6 tokens/s, Avg generation throughput: 80.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:29:59 [loggers.py:123] Engine 000: Avg prompt throughput: 507.3 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:09 [loggers.py:123] Engine 000: Avg prompt throughput: 503.6 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:19 [loggers.py:123] Engine 000: Avg prompt throughput: 299.2 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33594 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:29 [loggers.py:123] Engine 000: Avg prompt throughput: 316.1 tokens/s, Avg generation throughput: 79.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 57.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33594 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33594 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:39 [loggers.py:123] Engine 000: Avg prompt throughput: 478.7 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:49 [loggers.py:123] Engine 000: Avg prompt throughput: 425.8 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:30:59 [loggers.py:123] Engine 000: Avg prompt throughput: 412.2 tokens/s, Avg generation throughput: 77.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54586 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:09 [loggers.py:123] Engine 000: Avg prompt throughput: 471.3 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54626 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:19 [loggers.py:123] Engine 000: Avg prompt throughput: 438.6 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54626 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54626 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:29 [loggers.py:123] Engine 000: Avg prompt throughput: 569.2 tokens/s, Avg generation throughput: 76.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:39 [loggers.py:123] Engine 000: Avg prompt throughput: 45.8 tokens/s, Avg generation throughput: 38.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:49 [loggers.py:123] Engine 000: Avg prompt throughput: 799.6 tokens/s, Avg generation throughput: 75.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:31:59 [loggers.py:123] Engine 000: Avg prompt throughput: 325.8 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:09 [loggers.py:123] Engine 000: Avg prompt throughput: 500.6 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1055.2 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:29 [loggers.py:123] Engine 000: Avg prompt throughput: 562.5 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:39 [loggers.py:123] Engine 000: Avg prompt throughput: 291.1 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:49 [loggers.py:123] Engine 000: Avg prompt throughput: 419.2 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:32:59 [loggers.py:123] Engine 000: Avg prompt throughput: 469.8 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58568 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:09 [loggers.py:123] Engine 000: Avg prompt throughput: 269.9 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54964 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:19 [loggers.py:123] Engine 000: Avg prompt throughput: 507.9 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:29 [loggers.py:123] Engine 000: Avg prompt throughput: 963.2 tokens/s, Avg generation throughput: 81.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:39 [loggers.py:123] Engine 000: Avg prompt throughput: 380.4 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56552 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:49 [loggers.py:123] Engine 000: Avg prompt throughput: 431.6 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47990 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:33:59 [loggers.py:123] Engine 000: Avg prompt throughput: 560.2 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:09 [loggers.py:123] Engine 000: Avg prompt throughput: 845.2 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:19 [loggers.py:123] Engine 000: Avg prompt throughput: 527.4 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49686 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:29 [loggers.py:123] Engine 000: Avg prompt throughput: 368.3 tokens/s, Avg generation throughput: 80.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49686 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49686 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:39 [loggers.py:123] Engine 000: Avg prompt throughput: 481.8 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:49 [loggers.py:123] Engine 000: Avg prompt throughput: 414.7 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:32848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:34:59 [loggers.py:123] Engine 000: Avg prompt throughput: 94.6 tokens/s, Avg generation throughput: 84.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54720 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:09 [loggers.py:123] Engine 000: Avg prompt throughput: 241.6 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37392 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:19 [loggers.py:123] Engine 000: Avg prompt throughput: 390.4 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:29 [loggers.py:123] Engine 000: Avg prompt throughput: 601.1 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:39 [loggers.py:123] Engine 000: Avg prompt throughput: 260.1 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56612 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:49 [loggers.py:123] Engine 000: Avg prompt throughput: 382.0 tokens/s, Avg generation throughput: 78.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56612 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56612 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:35:59 [loggers.py:123] Engine 000: Avg prompt throughput: 451.2 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:09 [loggers.py:123] Engine 000: Avg prompt throughput: 634.1 tokens/s, Avg generation throughput: 80.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:19 [loggers.py:123] Engine 000: Avg prompt throughput: 512.9 tokens/s, Avg generation throughput: 79.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:29 [loggers.py:123] Engine 000: Avg prompt throughput: 337.2 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50224 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:39 [loggers.py:123] Engine 000: Avg prompt throughput: 391.8 tokens/s, Avg generation throughput: 77.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49896 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57688 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:49 [loggers.py:123] Engine 000: Avg prompt throughput: 618.0 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:36:59 [loggers.py:123] Engine 000: Avg prompt throughput: 580.1 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:09 [loggers.py:123] Engine 000: Avg prompt throughput: 375.2 tokens/s, Avg generation throughput: 77.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:19 [loggers.py:123] Engine 000: Avg prompt throughput: 577.9 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:29 [loggers.py:123] Engine 000: Avg prompt throughput: 143.9 tokens/s, Avg generation throughput: 35.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:39 [loggers.py:123] Engine 000: Avg prompt throughput: 524.4 tokens/s, Avg generation throughput: 76.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:49 [loggers.py:123] Engine 000: Avg prompt throughput: 230.9 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:37:59 [loggers.py:123] Engine 000: Avg prompt throughput: 448.5 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:09 [loggers.py:123] Engine 000: Avg prompt throughput: 399.1 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:19 [loggers.py:123] Engine 000: Avg prompt throughput: 274.7 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:29 [loggers.py:123] Engine 000: Avg prompt throughput: 602.2 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:39 [loggers.py:123] Engine 000: Avg prompt throughput: 207.1 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:49 [loggers.py:123] Engine 000: Avg prompt throughput: 285.3 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:38:59 [loggers.py:123] Engine 000: Avg prompt throughput: 617.5 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:09 [loggers.py:123] Engine 000: Avg prompt throughput: 199.2 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:19 [loggers.py:123] Engine 000: Avg prompt throughput: 591.9 tokens/s, Avg generation throughput: 79.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:29 [loggers.py:123] Engine 000: Avg prompt throughput: 883.2 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:39 [loggers.py:123] Engine 000: Avg prompt throughput: 413.8 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:49 [loggers.py:123] Engine 000: Avg prompt throughput: 284.6 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:39:59 [loggers.py:123] Engine 000: Avg prompt throughput: 703.4 tokens/s, Avg generation throughput: 80.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:09 [loggers.py:123] Engine 000: Avg prompt throughput: 289.0 tokens/s, Avg generation throughput: 78.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:19 [loggers.py:123] Engine 000: Avg prompt throughput: 412.6 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:29 [loggers.py:123] Engine 000: Avg prompt throughput: 663.8 tokens/s, Avg generation throughput: 80.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:39 [loggers.py:123] Engine 000: Avg prompt throughput: 240.7 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:49 [loggers.py:123] Engine 000: Avg prompt throughput: 316.3 tokens/s, Avg generation throughput: 79.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:40:59 [loggers.py:123] Engine 000: Avg prompt throughput: 625.9 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:09 [loggers.py:123] Engine 000: Avg prompt throughput: 296.1 tokens/s, Avg generation throughput: 78.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:19 [loggers.py:123] Engine 000: Avg prompt throughput: 484.3 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58568 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:29 [loggers.py:123] Engine 000: Avg prompt throughput: 329.2 tokens/s, Avg generation throughput: 78.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58568 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58568 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:39 [loggers.py:123] Engine 000: Avg prompt throughput: 396.4 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:49 [loggers.py:123] Engine 000: Avg prompt throughput: 745.1 tokens/s, Avg generation throughput: 81.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:41:59 [loggers.py:123] Engine 000: Avg prompt throughput: 432.7 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40642 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40642 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:09 [loggers.py:123] Engine 000: Avg prompt throughput: 345.5 tokens/s, Avg generation throughput: 76.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40642 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:19 [loggers.py:123] Engine 000: Avg prompt throughput: 460.5 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:29 [loggers.py:123] Engine 000: Avg prompt throughput: 313.8 tokens/s, Avg generation throughput: 77.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39528 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:39 [loggers.py:123] Engine 000: Avg prompt throughput: 477.8 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:49 [loggers.py:123] Engine 000: Avg prompt throughput: 376.4 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:42:59 [loggers.py:123] Engine 000: Avg prompt throughput: 273.0 tokens/s, Avg generation throughput: 77.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:09 [loggers.py:123] Engine 000: Avg prompt throughput: 613.5 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:19 [loggers.py:123] Engine 000: Avg prompt throughput: 327.2 tokens/s, Avg generation throughput: 77.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:29 [loggers.py:123] Engine 000: Avg prompt throughput: 407.7 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:39 [loggers.py:123] Engine 000: Avg prompt throughput: 317.1 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:49 [loggers.py:123] Engine 000: Avg prompt throughput: 341.2 tokens/s, Avg generation throughput: 75.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57578 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:43:59 [loggers.py:123] Engine 000: Avg prompt throughput: 525.5 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:09 [loggers.py:123] Engine 000: Avg prompt throughput: 527.8 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50962 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:19 [loggers.py:123] Engine 000: Avg prompt throughput: 275.0 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50962 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:29 [loggers.py:123] Engine 000: Avg prompt throughput: 260.6 tokens/s, Avg generation throughput: 77.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59964 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:39 [loggers.py:123] Engine 000: Avg prompt throughput: 503.5 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:49 [loggers.py:123] Engine 000: Avg prompt throughput: 826.1 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:44:59 [loggers.py:123] Engine 000: Avg prompt throughput: 584.6 tokens/s, Avg generation throughput: 74.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:09 [loggers.py:123] Engine 000: Avg prompt throughput: 175.2 tokens/s, Avg generation throughput: 35.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:19 [loggers.py:123] Engine 000: Avg prompt throughput: 315.3 tokens/s, Avg generation throughput: 77.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35166 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35166 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:29 [loggers.py:123] Engine 000: Avg prompt throughput: 284.1 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35166 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:39 [loggers.py:123] Engine 000: Avg prompt throughput: 663.2 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:49 [loggers.py:123] Engine 000: Avg prompt throughput: 292.4 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:45:59 [loggers.py:123] Engine 000: Avg prompt throughput: 580.5 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:09 [loggers.py:123] Engine 000: Avg prompt throughput: 328.3 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:19 [loggers.py:123] Engine 000: Avg prompt throughput: 401.2 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:29 [loggers.py:123] Engine 000: Avg prompt throughput: 495.0 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:39 [loggers.py:123] Engine 000: Avg prompt throughput: 309.6 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:49 [loggers.py:123] Engine 000: Avg prompt throughput: 779.5 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:46:59 [loggers.py:123] Engine 000: Avg prompt throughput: 330.6 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:09 [loggers.py:123] Engine 000: Avg prompt throughput: 332.3 tokens/s, Avg generation throughput: 80.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:19 [loggers.py:123] Engine 000: Avg prompt throughput: 627.7 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:29 [loggers.py:123] Engine 000: Avg prompt throughput: 289.7 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:39 [loggers.py:123] Engine 000: Avg prompt throughput: 506.7 tokens/s, Avg generation throughput: 80.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:49 [loggers.py:123] Engine 000: Avg prompt throughput: 381.8 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37798 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:47:59 [loggers.py:123] Engine 000: Avg prompt throughput: 172.3 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:09 [loggers.py:123] Engine 000: Avg prompt throughput: 331.7 tokens/s, Avg generation throughput: 79.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37918 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:19 [loggers.py:123] Engine 000: Avg prompt throughput: 817.7 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:29 [loggers.py:123] Engine 000: Avg prompt throughput: 340.5 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:39 [loggers.py:123] Engine 000: Avg prompt throughput: 296.7 tokens/s, Avg generation throughput: 79.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37418 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:49 [loggers.py:123] Engine 000: Avg prompt throughput: 430.8 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:48:59 [loggers.py:123] Engine 000: Avg prompt throughput: 821.2 tokens/s, Avg generation throughput: 80.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:09 [loggers.py:123] Engine 000: Avg prompt throughput: 148.0 tokens/s, Avg generation throughput: 84.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:19 [loggers.py:123] Engine 000: Avg prompt throughput: 688.8 tokens/s, Avg generation throughput: 77.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55900 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:29 [loggers.py:123] Engine 000: Avg prompt throughput: 487.4 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:39 [loggers.py:123] Engine 000: Avg prompt throughput: 235.9 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:49 [loggers.py:123] Engine 000: Avg prompt throughput: 323.8 tokens/s, Avg generation throughput: 78.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58688 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:49:59 [loggers.py:123] Engine 000: Avg prompt throughput: 516.4 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1080.1 tokens/s, Avg generation throughput: 80.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:19 [loggers.py:123] Engine 000: Avg prompt throughput: 194.7 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50712 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:29 [loggers.py:123] Engine 000: Avg prompt throughput: 328.8 tokens/s, Avg generation throughput: 77.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58100 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:39 [loggers.py:123] Engine 000: Avg prompt throughput: 516.0 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1026.5 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55316 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:50:59 [loggers.py:123] Engine 000: Avg prompt throughput: 156.2 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55316 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55316 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:09 [loggers.py:123] Engine 000: Avg prompt throughput: 648.0 tokens/s, Avg generation throughput: 76.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:19 [loggers.py:123] Engine 000: Avg prompt throughput: 668.6 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:29 [loggers.py:123] Engine 000: Avg prompt throughput: 381.8 tokens/s, Avg generation throughput: 75.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:39 [loggers.py:123] Engine 000: Avg prompt throughput: 440.3 tokens/s, Avg generation throughput: 76.4 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:49 [loggers.py:123] Engine 000: Avg prompt throughput: 51.1 tokens/s, Avg generation throughput: 44.2 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:51:59 [loggers.py:123] Engine 000: Avg prompt throughput: 1035.6 tokens/s, Avg generation throughput: 75.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56780 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:09 [loggers.py:123] Engine 000: Avg prompt throughput: 284.1 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:19 [loggers.py:123] Engine 000: Avg prompt throughput: 324.7 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60130 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1092.2 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:39 [loggers.py:123] Engine 000: Avg prompt throughput: 558.6 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:49 [loggers.py:123] Engine 000: Avg prompt throughput: 253.0 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:52:59 [loggers.py:123] Engine 000: Avg prompt throughput: 534.9 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:09 [loggers.py:123] Engine 000: Avg prompt throughput: 593.2 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53426 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:19 [loggers.py:123] Engine 000: Avg prompt throughput: 318.6 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53426 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53426 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:29 [loggers.py:123] Engine 000: Avg prompt throughput: 336.7 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:39 [loggers.py:123] Engine 000: Avg prompt throughput: 538.4 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:49 [loggers.py:123] Engine 000: Avg prompt throughput: 424.1 tokens/s, Avg generation throughput: 80.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:53:59 [loggers.py:123] Engine 000: Avg prompt throughput: 617.8 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:09 [loggers.py:123] Engine 000: Avg prompt throughput: 215.1 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:19 [loggers.py:123] Engine 000: Avg prompt throughput: 368.0 tokens/s, Avg generation throughput: 80.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56194 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:29 [loggers.py:123] Engine 000: Avg prompt throughput: 228.2 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39978 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:39 [loggers.py:123] Engine 000: Avg prompt throughput: 605.0 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1527.7 tokens/s, Avg generation throughput: 80.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:54:59 [loggers.py:123] Engine 000: Avg prompt throughput: 498.4 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41904 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:09 [loggers.py:123] Engine 000: Avg prompt throughput: 205.0 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:19 [loggers.py:123] Engine 000: Avg prompt throughput: 331.5 tokens/s, Avg generation throughput: 79.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48370 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:29 [loggers.py:123] Engine 000: Avg prompt throughput: 490.2 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1273.5 tokens/s, Avg generation throughput: 80.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39576 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:49 [loggers.py:123] Engine 000: Avg prompt throughput: 226.0 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49782 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:55:59 [loggers.py:123] Engine 000: Avg prompt throughput: 290.7 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37608 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:09 [loggers.py:123] Engine 000: Avg prompt throughput: 789.5 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:19 [loggers.py:123] Engine 000: Avg prompt throughput: 489.3 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33854 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33854 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:29 [loggers.py:123] Engine 000: Avg prompt throughput: 320.2 tokens/s, Avg generation throughput: 79.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33854 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:39 [loggers.py:123] Engine 000: Avg prompt throughput: 583.9 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:49 [loggers.py:123] Engine 000: Avg prompt throughput: 286.0 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:56:59 [loggers.py:123] Engine 000: Avg prompt throughput: 218.1 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:09 [loggers.py:123] Engine 000: Avg prompt throughput: 355.8 tokens/s, Avg generation throughput: 79.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53366 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:19 [loggers.py:123] Engine 000: Avg prompt throughput: 269.1 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40886 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:29 [loggers.py:123] Engine 000: Avg prompt throughput: 685.4 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:39 [loggers.py:123] Engine 000: Avg prompt throughput: 785.5 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59750 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:49 [loggers.py:123] Engine 000: Avg prompt throughput: 370.2 tokens/s, Avg generation throughput: 77.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:57:59 [loggers.py:123] Engine 000: Avg prompt throughput: 451.7 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 58.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54118 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1024.9 tokens/s, Avg generation throughput: 80.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:19 [loggers.py:123] Engine 000: Avg prompt throughput: 343.0 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35510 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:29 [loggers.py:123] Engine 000: Avg prompt throughput: 410.9 tokens/s, Avg generation throughput: 76.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35524 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:39 [loggers.py:123] Engine 000: Avg prompt throughput: 62.8 tokens/s, Avg generation throughput: 85.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:49 [loggers.py:123] Engine 000: Avg prompt throughput: 605.0 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.4%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39112 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:58:59 [loggers.py:123] Engine 000: Avg prompt throughput: 158.0 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:09 [loggers.py:123] Engine 000: Avg prompt throughput: 346.4 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:19 [loggers.py:123] Engine 000: Avg prompt throughput: 408.3 tokens/s, Avg generation throughput: 47.2 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:29 [loggers.py:123] Engine 000: Avg prompt throughput: 390.2 tokens/s, Avg generation throughput: 59.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:39 [loggers.py:123] Engine 000: Avg prompt throughput: 375.1 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:49 [loggers.py:123] Engine 000: Avg prompt throughput: 297.1 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47552 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 17:59:59 [loggers.py:123] Engine 000: Avg prompt throughput: 501.7 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47552 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47552 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:09 [loggers.py:123] Engine 000: Avg prompt throughput: 489.7 tokens/s, Avg generation throughput: 81.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:19 [loggers.py:123] Engine 000: Avg prompt throughput: 695.4 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:29 [loggers.py:123] Engine 000: Avg prompt throughput: 277.4 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:39 [loggers.py:123] Engine 000: Avg prompt throughput: 236.0 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:49 [loggers.py:123] Engine 000: Avg prompt throughput: 676.6 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:00:59 [loggers.py:123] Engine 000: Avg prompt throughput: 235.9 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:32856 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:09 [loggers.py:123] Engine 000: Avg prompt throughput: 333.7 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:32856 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:32856 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:19 [loggers.py:123] Engine 000: Avg prompt throughput: 439.8 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1401.5 tokens/s, Avg generation throughput: 80.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:39 [loggers.py:123] Engine 000: Avg prompt throughput: 337.0 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39872 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:49 [loggers.py:123] Engine 000: Avg prompt throughput: 285.9 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39872 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39872 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:01:59 [loggers.py:123] Engine 000: Avg prompt throughput: 466.2 tokens/s, Avg generation throughput: 80.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:09 [loggers.py:123] Engine 000: Avg prompt throughput: 510.6 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58746 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58746 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58746 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:19 [loggers.py:123] Engine 000: Avg prompt throughput: 511.6 tokens/s, Avg generation throughput: 80.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:29 [loggers.py:123] Engine 000: Avg prompt throughput: 446.0 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:39 [loggers.py:123] Engine 000: Avg prompt throughput: 236.4 tokens/s, Avg generation throughput: 80.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:49 [loggers.py:123] Engine 000: Avg prompt throughput: 663.1 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:02:59 [loggers.py:123] Engine 000: Avg prompt throughput: 149.5 tokens/s, Avg generation throughput: 84.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53864 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:09 [loggers.py:123] Engine 000: Avg prompt throughput: 324.5 tokens/s, Avg generation throughput: 79.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 59.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1118.4 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:29 [loggers.py:123] Engine 000: Avg prompt throughput: 347.6 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:39 [loggers.py:123] Engine 000: Avg prompt throughput: 307.9 tokens/s, Avg generation throughput: 78.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 59.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:49 [loggers.py:123] Engine 000: Avg prompt throughput: 458.4 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:03:59 [loggers.py:123] Engine 000: Avg prompt throughput: 855.7 tokens/s, Avg generation throughput: 81.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:09 [loggers.py:123] Engine 000: Avg prompt throughput: 501.5 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52498 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52498 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:19 [loggers.py:123] Engine 000: Avg prompt throughput: 284.7 tokens/s, Avg generation throughput: 77.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:29 [loggers.py:123] Engine 000: Avg prompt throughput: 552.4 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1317.8 tokens/s, Avg generation throughput: 80.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:49 [loggers.py:123] Engine 000: Avg prompt throughput: 783.8 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:04:59 [loggers.py:123] Engine 000: Avg prompt throughput: 483.3 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:09 [loggers.py:123] Engine 000: Avg prompt throughput: 233.0 tokens/s, Avg generation throughput: 77.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:19 [loggers.py:123] Engine 000: Avg prompt throughput: 578.9 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 60.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:29 [loggers.py:123] Engine 000: Avg prompt throughput: 255.5 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55472 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55472 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:39 [loggers.py:123] Engine 000: Avg prompt throughput: 313.5 tokens/s, Avg generation throughput: 78.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56536 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:49 [loggers.py:123] Engine 000: Avg prompt throughput: 455.4 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:05:59 [loggers.py:123] Engine 000: Avg prompt throughput: 556.1 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 60.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:09 [loggers.py:123] Engine 000: Avg prompt throughput: 357.6 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36774 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36774 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:19 [loggers.py:123] Engine 000: Avg prompt throughput: 242.8 tokens/s, Avg generation throughput: 77.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36774 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:29 [loggers.py:123] Engine 000: Avg prompt throughput: 765.9 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 60.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:39 [loggers.py:123] Engine 000: Avg prompt throughput: 144.9 tokens/s, Avg generation throughput: 84.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35574 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:49 [loggers.py:123] Engine 000: Avg prompt throughput: 259.6 tokens/s, Avg generation throughput: 76.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 60.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:06:59 [loggers.py:123] Engine 000: Avg prompt throughput: 390.7 tokens/s, Avg generation throughput: 84.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:09 [loggers.py:123] Engine 000: Avg prompt throughput: 457.2 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:19 [loggers.py:123] Engine 000: Avg prompt throughput: 248.0 tokens/s, Avg generation throughput: 75.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:29 [loggers.py:123] Engine 000: Avg prompt throughput: 791.7 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60782 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:39 [loggers.py:123] Engine 000: Avg prompt throughput: 213.8 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59056 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:49 [loggers.py:123] Engine 000: Avg prompt throughput: 270.5 tokens/s, Avg generation throughput: 75.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 60.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43410 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:07:59 [loggers.py:123] Engine 000: Avg prompt throughput: 972.0 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:09 [loggers.py:123] Engine 000: Avg prompt throughput: 574.4 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:19 [loggers.py:123] Engine 000: Avg prompt throughput: 177.1 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:29 [loggers.py:123] Engine 000: Avg prompt throughput: 488.6 tokens/s, Avg generation throughput: 75.6 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49770 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:39 [loggers.py:123] Engine 000: Avg prompt throughput: 652.5 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37448 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:49 [loggers.py:123] Engine 000: Avg prompt throughput: 220.2 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37448 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:08:59 [loggers.py:123] Engine 000: Avg prompt throughput: 648.7 tokens/s, Avg generation throughput: 72.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38328 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:09 [loggers.py:123] Engine 000: Avg prompt throughput: 789.6 tokens/s, Avg generation throughput: 80.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:19 [loggers.py:123] Engine 000: Avg prompt throughput: 444.1 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:29 [loggers.py:123] Engine 000: Avg prompt throughput: 333.3 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55728 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55728 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:39 [loggers.py:123] Engine 000: Avg prompt throughput: 235.5 tokens/s, Avg generation throughput: 74.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55728 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:49 [loggers.py:123] Engine 000: Avg prompt throughput: 547.7 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:09:59 [loggers.py:123] Engine 000: Avg prompt throughput: 252.9 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36558 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36558 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:09 [loggers.py:123] Engine 000: Avg prompt throughput: 303.6 tokens/s, Avg generation throughput: 74.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50478 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:19 [loggers.py:123] Engine 000: Avg prompt throughput: 493.1 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:29 [loggers.py:123] Engine 000: Avg prompt throughput: 430.6 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:39 [loggers.py:123] Engine 000: Avg prompt throughput: 301.8 tokens/s, Avg generation throughput: 74.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60582 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:49 [loggers.py:123] Engine 000: Avg prompt throughput: 672.5 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:10:59 [loggers.py:123] Engine 000: Avg prompt throughput: 604.8 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:09 [loggers.py:123] Engine 000: Avg prompt throughput: 420.2 tokens/s, Avg generation throughput: 59.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:19 [loggers.py:123] Engine 000: Avg prompt throughput: 77.4 tokens/s, Avg generation throughput: 48.4 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:29 [loggers.py:123] Engine 000: Avg prompt throughput: 771.7 tokens/s, Avg generation throughput: 76.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:39 [loggers.py:123] Engine 000: Avg prompt throughput: 203.9 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:49 [loggers.py:123] Engine 000: Avg prompt throughput: 253.2 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44128 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:11:59 [loggers.py:123] Engine 000: Avg prompt throughput: 445.2 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:09 [loggers.py:123] Engine 000: Avg prompt throughput: 704.4 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:19 [loggers.py:123] Engine 000: Avg prompt throughput: 289.7 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:29 [loggers.py:123] Engine 000: Avg prompt throughput: 941.3 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:39 [loggers.py:123] Engine 000: Avg prompt throughput: 539.0 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:49 [loggers.py:123] Engine 000: Avg prompt throughput: 235.8 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:12:59 [loggers.py:123] Engine 000: Avg prompt throughput: 556.2 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:09 [loggers.py:123] Engine 000: Avg prompt throughput: 514.8 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:19 [loggers.py:123] Engine 000: Avg prompt throughput: 148.3 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:29 [loggers.py:123] Engine 000: Avg prompt throughput: 729.9 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46044 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:39 [loggers.py:123] Engine 000: Avg prompt throughput: 514.2 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:49 [loggers.py:123] Engine 000: Avg prompt throughput: 350.1 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54698 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:13:59 [loggers.py:123] Engine 000: Avg prompt throughput: 188.0 tokens/s, Avg generation throughput: 84.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54698 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:09 [loggers.py:123] Engine 000: Avg prompt throughput: 414.3 tokens/s, Avg generation throughput: 79.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44440 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:19 [loggers.py:123] Engine 000: Avg prompt throughput: 795.3 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:29 [loggers.py:123] Engine 000: Avg prompt throughput: 197.5 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36972 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:39 [loggers.py:123] Engine 000: Avg prompt throughput: 369.9 tokens/s, Avg generation throughput: 80.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36980 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40338 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:49 [loggers.py:123] Engine 000: Avg prompt throughput: 434.9 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:14:59 [loggers.py:123] Engine 000: Avg prompt throughput: 520.8 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39026 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:09 [loggers.py:123] Engine 000: Avg prompt throughput: 206.0 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35474 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:19 [loggers.py:123] Engine 000: Avg prompt throughput: 264.2 tokens/s, Avg generation throughput: 80.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48238 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:29 [loggers.py:123] Engine 000: Avg prompt throughput: 582.4 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:39 [loggers.py:123] Engine 000: Avg prompt throughput: 865.2 tokens/s, Avg generation throughput: 81.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34044 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:49 [loggers.py:123] Engine 000: Avg prompt throughput: 158.5 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37270 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:15:59 [loggers.py:123] Engine 000: Avg prompt throughput: 299.8 tokens/s, Avg generation throughput: 79.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:09 [loggers.py:123] Engine 000: Avg prompt throughput: 790.0 tokens/s, Avg generation throughput: 81.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:19 [loggers.py:123] Engine 000: Avg prompt throughput: 313.8 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:29 [loggers.py:123] Engine 000: Avg prompt throughput: 312.5 tokens/s, Avg generation throughput: 78.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:39 [loggers.py:123] Engine 000: Avg prompt throughput: 722.6 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.4%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49072 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:49 [loggers.py:123] Engine 000: Avg prompt throughput: 768.5 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:16:59 [loggers.py:123] Engine 000: Avg prompt throughput: 551.1 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:09 [loggers.py:123] Engine 000: Avg prompt throughput: 251.9 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:19 [loggers.py:123] Engine 000: Avg prompt throughput: 257.9 tokens/s, Avg generation throughput: 78.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:29 [loggers.py:123] Engine 000: Avg prompt throughput: 666.4 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.4%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:39 [loggers.py:123] Engine 000: Avg prompt throughput: 146.4 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1151.6 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51782 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:17:59 [loggers.py:123] Engine 000: Avg prompt throughput: 231.2 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55228 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:09 [loggers.py:123] Engine 000: Avg prompt throughput: 323.6 tokens/s, Avg generation throughput: 78.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:19 [loggers.py:123] Engine 000: Avg prompt throughput: 514.7 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:29 [loggers.py:123] Engine 000: Avg prompt throughput: 752.0 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:39 [loggers.py:123] Engine 000: Avg prompt throughput: 273.6 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:49 [loggers.py:123] Engine 000: Avg prompt throughput: 99.0 tokens/s, Avg generation throughput: 85.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:18:59 [loggers.py:123] Engine 000: Avg prompt throughput: 49.5 tokens/s, Avg generation throughput: 85.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:09 [loggers.py:123] Engine 000: Avg prompt throughput: 99.0 tokens/s, Avg generation throughput: 85.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38350 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:19 [loggers.py:123] Engine 000: Avg prompt throughput: 227.4 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:29 [loggers.py:123] Engine 000: Avg prompt throughput: 124.8 tokens/s, Avg generation throughput: 85.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:39 [loggers.py:123] Engine 000: Avg prompt throughput: 62.4 tokens/s, Avg generation throughput: 84.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59436 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:49 [loggers.py:123] Engine 000: Avg prompt throughput: 348.7 tokens/s, Avg generation throughput: 77.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:19:59 [loggers.py:123] Engine 000: Avg prompt throughput: 126.5 tokens/s, Avg generation throughput: 84.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53450 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:09 [loggers.py:123] Engine 000: Avg prompt throughput: 784.4 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:19 [loggers.py:123] Engine 000: Avg prompt throughput: 175.6 tokens/s, Avg generation throughput: 84.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:29 [loggers.py:123] Engine 000: Avg prompt throughput: 175.6 tokens/s, Avg generation throughput: 84.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:39 [loggers.py:123] Engine 000: Avg prompt throughput: 712.6 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:49 [loggers.py:123] Engine 000: Avg prompt throughput: 468.6 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:20:59 [loggers.py:123] Engine 000: Avg prompt throughput: 252.1 tokens/s, Avg generation throughput: 77.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:09 [loggers.py:123] Engine 000: Avg prompt throughput: 484.0 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:19 [loggers.py:123] Engine 000: Avg prompt throughput: 205.0 tokens/s, Avg generation throughput: 40.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:29 [loggers.py:123] Engine 000: Avg prompt throughput: 747.5 tokens/s, Avg generation throughput: 69.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:39 [loggers.py:123] Engine 000: Avg prompt throughput: 559.2 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:49 [loggers.py:123] Engine 000: Avg prompt throughput: 308.7 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:21:59 [loggers.py:123] Engine 000: Avg prompt throughput: 395.3 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:09 [loggers.py:123] Engine 000: Avg prompt throughput: 900.7 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:19 [loggers.py:123] Engine 000: Avg prompt throughput: 266.6 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41970 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:29 [loggers.py:123] Engine 000: Avg prompt throughput: 339.5 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:39 [loggers.py:123] Engine 000: Avg prompt throughput: 474.6 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46426 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:49 [loggers.py:123] Engine 000: Avg prompt throughput: 678.1 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:22:59 [loggers.py:123] Engine 000: Avg prompt throughput: 522.9 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34176 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:09 [loggers.py:123] Engine 000: Avg prompt throughput: 171.6 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34192 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60330 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:19 [loggers.py:123] Engine 000: Avg prompt throughput: 645.5 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1142.8 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:39 [loggers.py:123] Engine 000: Avg prompt throughput: 501.3 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:49 [loggers.py:123] Engine 000: Avg prompt throughput: 386.1 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53340 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:23:59 [loggers.py:123] Engine 000: Avg prompt throughput: 190.6 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33430 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:09 [loggers.py:123] Engine 000: Avg prompt throughput: 313.4 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:19 [loggers.py:123] Engine 000: Avg prompt throughput: 567.5 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:29 [loggers.py:123] Engine 000: Avg prompt throughput: 654.8 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:39 [loggers.py:123] Engine 000: Avg prompt throughput: 361.0 tokens/s, Avg generation throughput: 80.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:49 [loggers.py:123] Engine 000: Avg prompt throughput: 527.7 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45418 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:24:59 [loggers.py:123] Engine 000: Avg prompt throughput: 817.8 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:09 [loggers.py:123] Engine 000: Avg prompt throughput: 350.7 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:19 [loggers.py:123] Engine 000: Avg prompt throughput: 389.9 tokens/s, Avg generation throughput: 80.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:29 [loggers.py:123] Engine 000: Avg prompt throughput: 744.1 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:39 [loggers.py:123] Engine 000: Avg prompt throughput: 361.3 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:49 [loggers.py:123] Engine 000: Avg prompt throughput: 333.5 tokens/s, Avg generation throughput: 79.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:25:59 [loggers.py:123] Engine 000: Avg prompt throughput: 532.7 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:09 [loggers.py:123] Engine 000: Avg prompt throughput: 738.3 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:19 [loggers.py:123] Engine 000: Avg prompt throughput: 44.7 tokens/s, Avg generation throughput: 34.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1328.7 tokens/s, Avg generation throughput: 73.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:39 [loggers.py:123] Engine 000: Avg prompt throughput: 498.5 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:49 [loggers.py:123] Engine 000: Avg prompt throughput: 606.1 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57600 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:26:59 [loggers.py:123] Engine 000: Avg prompt throughput: 221.5 tokens/s, Avg generation throughput: 84.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57600 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:09 [loggers.py:123] Engine 000: Avg prompt throughput: 349.3 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37236 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:19 [loggers.py:123] Engine 000: Avg prompt throughput: 679.8 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:29 [loggers.py:123] Engine 000: Avg prompt throughput: 452.1 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52376 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:39 [loggers.py:123] Engine 000: Avg prompt throughput: 388.1 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:49 [loggers.py:123] Engine 000: Avg prompt throughput: 482.8 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:27:59 [loggers.py:123] Engine 000: Avg prompt throughput: 830.4 tokens/s, Avg generation throughput: 81.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:09 [loggers.py:123] Engine 000: Avg prompt throughput: 470.9 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45744 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:19 [loggers.py:123] Engine 000: Avg prompt throughput: 233.1 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:32950 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:29 [loggers.py:123] Engine 000: Avg prompt throughput: 295.8 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51840 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:39 [loggers.py:123] Engine 000: Avg prompt throughput: 494.7 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1423.3 tokens/s, Avg generation throughput: 80.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:28:59 [loggers.py:123] Engine 000: Avg prompt throughput: 596.8 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:09 [loggers.py:123] Engine 000: Avg prompt throughput: 419.7 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49978 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:19 [loggers.py:123] Engine 000: Avg prompt throughput: 452.6 tokens/s, Avg generation throughput: 79.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49980 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:29 [loggers.py:123] Engine 000: Avg prompt throughput: 601.9 tokens/s, Avg generation throughput: 82.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:39 [loggers.py:123] Engine 000: Avg prompt throughput: 631.4 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:49 [loggers.py:123] Engine 000: Avg prompt throughput: 274.0 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45732 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:29:59 [loggers.py:123] Engine 000: Avg prompt throughput: 318.5 tokens/s, Avg generation throughput: 80.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 61.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58276 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42308 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:09 [loggers.py:123] Engine 000: Avg prompt throughput: 467.4 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1655.0 tokens/s, Avg generation throughput: 80.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:29 [loggers.py:123] Engine 000: Avg prompt throughput: 489.6 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:39 [loggers.py:123] Engine 000: Avg prompt throughput: 749.5 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:49 [loggers.py:123] Engine 000: Avg prompt throughput: 161.4 tokens/s, Avg generation throughput: 83.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35394 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:30:59 [loggers.py:123] Engine 000: Avg prompt throughput: 65.0 tokens/s, Avg generation throughput: 84.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 62.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:09 [loggers.py:123] Engine 000: Avg prompt throughput: 436.0 tokens/s, Avg generation throughput: 79.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 62.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:19 [loggers.py:123] Engine 000: Avg prompt throughput: 856.8 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.5%, Prefix cache hit rate: 62.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51360 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:29 [loggers.py:123] Engine 000: Avg prompt throughput: 586.3 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 62.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:39 [loggers.py:123] Engine 000: Avg prompt throughput: 2594.9 tokens/s, Avg generation throughput: 78.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:49 [loggers.py:123] Engine 000: Avg prompt throughput: 833.7 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:31:59 [loggers.py:123] Engine 000: Avg prompt throughput: 732.7 tokens/s, Avg generation throughput: 81.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:09 [loggers.py:123] Engine 000: Avg prompt throughput: 385.6 tokens/s, Avg generation throughput: 78.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:40492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:19 [loggers.py:123] Engine 000: Avg prompt throughput: 736.1 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51368 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:29 [loggers.py:123] Engine 000: Avg prompt throughput: 54.0 tokens/s, Avg generation throughput: 33.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:39 [loggers.py:123] Engine 000: Avg prompt throughput: 955.5 tokens/s, Avg generation throughput: 75.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:49 [loggers.py:123] Engine 000: Avg prompt throughput: 331.6 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:37108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:32:59 [loggers.py:123] Engine 000: Avg prompt throughput: 449.5 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:09 [loggers.py:123] Engine 000: Avg prompt throughput: 499.8 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50942 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50958 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:19 [loggers.py:123] Engine 000: Avg prompt throughput: 392.4 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45088 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:29 [loggers.py:123] Engine 000: Avg prompt throughput: 315.3 tokens/s, Avg generation throughput: 84.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:39 [loggers.py:123] Engine 000: Avg prompt throughput: 984.2 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52536 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:49 [loggers.py:123] Engine 000: Avg prompt throughput: 150.1 tokens/s, Avg generation throughput: 84.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52536 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52536 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:33:59 [loggers.py:123] Engine 000: Avg prompt throughput: 561.3 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:09 [loggers.py:123] Engine 000: Avg prompt throughput: 943.7 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34012 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:19 [loggers.py:123] Engine 000: Avg prompt throughput: 244.7 tokens/s, Avg generation throughput: 83.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34012 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:29 [loggers.py:123] Engine 000: Avg prompt throughput: 479.4 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:34012 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:39 [loggers.py:123] Engine 000: Avg prompt throughput: 852.1 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:49 [loggers.py:123] Engine 000: Avg prompt throughput: 414.8 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:39652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:34:59 [loggers.py:123] Engine 000: Avg prompt throughput: 475.0 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:09 [loggers.py:123] Engine 000: Avg prompt throughput: 408.0 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35038 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35038 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:19 [loggers.py:123] Engine 000: Avg prompt throughput: 344.9 tokens/s, Avg generation throughput: 79.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35038 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:29 [loggers.py:123] Engine 000: Avg prompt throughput: 461.5 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:39 [loggers.py:123] Engine 000: Avg prompt throughput: 312.1 tokens/s, Avg generation throughput: 79.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:51906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:49 [loggers.py:123] Engine 000: Avg prompt throughput: 534.9 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:35:59 [loggers.py:123] Engine 000: Avg prompt throughput: 174.6 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:09 [loggers.py:123] Engine 000: Avg prompt throughput: 503.0 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:19 [loggers.py:123] Engine 000: Avg prompt throughput: 330.8 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:29 [loggers.py:123] Engine 000: Avg prompt throughput: 477.0 tokens/s, Avg generation throughput: 78.2 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:39 [loggers.py:123] Engine 000: Avg prompt throughput: 336.7 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:49 [loggers.py:123] Engine 000: Avg prompt throughput: 268.9 tokens/s, Avg generation throughput: 77.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:49938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:36:59 [loggers.py:123] Engine 000: Avg prompt throughput: 722.7 tokens/s, Avg generation throughput: 81.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:09 [loggers.py:123] Engine 000: Avg prompt throughput: 282.0 tokens/s, Avg generation throughput: 77.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:19 [loggers.py:123] Engine 000: Avg prompt throughput: 529.8 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56954 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:29 [loggers.py:123] Engine 000: Avg prompt throughput: 131.1 tokens/s, Avg generation throughput: 80.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56954 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:39 [loggers.py:123] Engine 000: Avg prompt throughput: 280.8 tokens/s, Avg generation throughput: 80.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33846 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:49 [loggers.py:123] Engine 000: Avg prompt throughput: 992.7 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:37:59 [loggers.py:123] Engine 000: Avg prompt throughput: 374.3 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:09 [loggers.py:123] Engine 000: Avg prompt throughput: 471.5 tokens/s, Avg generation throughput: 75.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:19 [loggers.py:123] Engine 000: Avg prompt throughput: 604.2 tokens/s, Avg generation throughput: 82.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:29 [loggers.py:123] Engine 000: Avg prompt throughput: 227.3 tokens/s, Avg generation throughput: 78.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:39 [loggers.py:123] Engine 000: Avg prompt throughput: 95.0 tokens/s, Avg generation throughput: 37.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:49 [loggers.py:123] Engine 000: Avg prompt throughput: 731.3 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44832 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:38:59 [loggers.py:123] Engine 000: Avg prompt throughput: 307.1 tokens/s, Avg generation throughput: 82.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44832 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44832 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:09 [loggers.py:123] Engine 000: Avg prompt throughput: 351.0 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:19 [loggers.py:123] Engine 000: Avg prompt throughput: 552.5 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55232 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55232 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:29 [loggers.py:123] Engine 000: Avg prompt throughput: 357.9 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55232 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:39 [loggers.py:123] Engine 000: Avg prompt throughput: 644.6 tokens/s, Avg generation throughput: 82.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:49 [loggers.py:123] Engine 000: Avg prompt throughput: 185.7 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52962 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:39:59 [loggers.py:123] Engine 000: Avg prompt throughput: 412.2 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:47006 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:09 [loggers.py:123] Engine 000: Avg prompt throughput: 275.0 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42298 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:19 [loggers.py:123] Engine 000: Avg prompt throughput: 474.7 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1070.9 tokens/s, Avg generation throughput: 80.6 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:39 [loggers.py:123] Engine 000: Avg prompt throughput: 335.0 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:52744 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:49 [loggers.py:123] Engine 000: Avg prompt throughput: 370.0 tokens/s, Avg generation throughput: 80.7 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:40:59 [loggers.py:123] Engine 000: Avg prompt throughput: 621.7 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:59716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:09 [loggers.py:123] Engine 000: Avg prompt throughput: 605.0 tokens/s, Avg generation throughput: 82.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:19 [loggers.py:123] Engine 000: Avg prompt throughput: 261.1 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.7%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43054 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:29 [loggers.py:123] Engine 000: Avg prompt throughput: 160.7 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:43060 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:39 [loggers.py:123] Engine 000: Avg prompt throughput: 728.5 tokens/s, Avg generation throughput: 81.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54160 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:49 [loggers.py:123] Engine 000: Avg prompt throughput: 863.3 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.6%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:41:59 [loggers.py:123] Engine 000: Avg prompt throughput: 591.7 tokens/s, Avg generation throughput: 82.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.5%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:35298 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:09 [loggers.py:123] Engine 000: Avg prompt throughput: 234.1 tokens/s, Avg generation throughput: 83.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46366 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:19 [loggers.py:123] Engine 000: Avg prompt throughput: 288.2 tokens/s, Avg generation throughput: 80.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:29 [loggers.py:123] Engine 000: Avg prompt throughput: 690.5 tokens/s, Avg generation throughput: 81.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.4%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54382 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:39 [loggers.py:123] Engine 000: Avg prompt throughput: 724.4 tokens/s, Avg generation throughput: 77.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.4%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:49 [loggers.py:123] Engine 000: Avg prompt throughput: 471.2 tokens/s, Avg generation throughput: 81.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38888 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:42:59 [loggers.py:123] Engine 000: Avg prompt throughput: 316.3 tokens/s, Avg generation throughput: 79.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:38888 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:09 [loggers.py:123] Engine 000: Avg prompt throughput: 466.8 tokens/s, Avg generation throughput: 83.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:19 [loggers.py:123] Engine 000: Avg prompt throughput: 565.8 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:29 [loggers.py:123] Engine 000: Avg prompt throughput: 48.2 tokens/s, Avg generation throughput: 39.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:39 [loggers.py:123] Engine 000: Avg prompt throughput: 902.1 tokens/s, Avg generation throughput: 75.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36736 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:49 [loggers.py:123] Engine 000: Avg prompt throughput: 265.2 tokens/s, Avg generation throughput: 84.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36736 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:43:59 [loggers.py:123] Engine 000: Avg prompt throughput: 311.8 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:50734 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:09 [loggers.py:123] Engine 000: Avg prompt throughput: 477.7 tokens/s, Avg generation throughput: 83.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1086.9 tokens/s, Avg generation throughput: 81.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:29 [loggers.py:123] Engine 000: Avg prompt throughput: 569.0 tokens/s, Avg generation throughput: 82.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:45136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:39 [loggers.py:123] Engine 000: Avg prompt throughput: 224.3 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:33104 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:49 [loggers.py:123] Engine 000: Avg prompt throughput: 285.4 tokens/s, Avg generation throughput: 82.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:46994 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:44:59 [loggers.py:123] Engine 000: Avg prompt throughput: 563.1 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:09 [loggers.py:123] Engine 000: Avg prompt throughput: 697.0 tokens/s, Avg generation throughput: 81.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:19 [loggers.py:123] Engine 000: Avg prompt throughput: 262.3 tokens/s, Avg generation throughput: 83.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:60510 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:29 [loggers.py:123] Engine 000: Avg prompt throughput: 309.9 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:42310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:39 [loggers.py:123] Engine 000: Avg prompt throughput: 529.2 tokens/s, Avg generation throughput: 83.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:49 [loggers.py:123] Engine 000: Avg prompt throughput: 918.9 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:45:59 [loggers.py:123] Engine 000: Avg prompt throughput: 507.5 tokens/s, Avg generation throughput: 82.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:41002 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:09 [loggers.py:123] Engine 000: Avg prompt throughput: 182.1 tokens/s, Avg generation throughput: 84.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:19 [loggers.py:123] Engine 000: Avg prompt throughput: 121.4 tokens/s, Avg generation throughput: 85.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:29 [loggers.py:123] Engine 000: Avg prompt throughput: 182.1 tokens/s, Avg generation throughput: 85.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54750 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:39 [loggers.py:123] Engine 000: Avg prompt throughput: 352.6 tokens/s, Avg generation throughput: 81.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.8%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:54750 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:49 [loggers.py:123] Engine 000: Avg prompt throughput: 472.9 tokens/s, Avg generation throughput: 84.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:46:59 [loggers.py:123] Engine 000: Avg prompt throughput: 1012.0 tokens/s, Avg generation throughput: 80.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36940 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:09 [loggers.py:123] Engine 000: Avg prompt throughput: 202.9 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36940 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:19 [loggers.py:123] Engine 000: Avg prompt throughput: 287.5 tokens/s, Avg generation throughput: 80.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:36280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:29 [loggers.py:123] Engine 000: Avg prompt throughput: 596.6 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:39 [loggers.py:123] Engine 000: Avg prompt throughput: 879.9 tokens/s, Avg generation throughput: 81.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48102 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:49 [loggers.py:123] Engine 000: Avg prompt throughput: 196.2 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:48102 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:47:59 [loggers.py:123] Engine 000: Avg prompt throughput: 375.9 tokens/s, Avg generation throughput: 79.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:57860 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1127.4 tokens/s, Avg generation throughput: 81.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:19 [loggers.py:123] Engine 000: Avg prompt throughput: 450.6 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:29 [loggers.py:123] Engine 000: Avg prompt throughput: 416.9 tokens/s, Avg generation throughput: 83.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.3%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:55514 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:39 [loggers.py:123] Engine 000: Avg prompt throughput: 184.1 tokens/s, Avg generation throughput: 84.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:44336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:49 [loggers.py:123] Engine 000: Avg prompt throughput: 429.8 tokens/s, Avg generation throughput: 78.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:53136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:48:59 [loggers.py:123] Engine 000: Avg prompt throughput: 572.3 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.2%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:49:09 [loggers.py:123] Engine 000: Avg prompt throughput: 619.1 tokens/s, Avg generation throughput: 81.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.1%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58348 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:49:19 [loggers.py:123] Engine 000: Avg prompt throughput: 197.9 tokens/s, Avg generation throughput: 83.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 63.9%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58348 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:58348 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:49:29 [loggers.py:123] Engine 000: Avg prompt throughput: 645.9 tokens/s, Avg generation throughput: 79.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 64.0%
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO:     127.0.0.1:56352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:49:39 [loggers.py:123] Engine 000: Avg prompt throughput: 169.3 tokens/s, Avg generation throughput: 41.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 63.5%
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 18:49:49 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 63.5%
[1;36m(APIServer pid=3603498)[0;0m INFO 09-18 19:53:18 [launcher.py:101] Shutting down FastAPI HTTP server.
[rank0]:[W918 19:53:19.196719493 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
[1;36m(APIServer pid=3603498)[0;0m INFO:     Shutting down
[1;36m(APIServer pid=3603498)[0;0m INFO:     Waiting for application shutdown.
[1;36m(APIServer pid=3603498)[0;0m INFO:     Application shutdown complete.