Nikhil K. PRO

hexgridcloud

https://hexgrid.cloud

AI & ML interests

One-click deployment of Open-source LLMs, on managed and dedicated GPUs.

Recent Activity

new activity 4 days ago

google/gemma-4-31B-it: Benchmarked on HexGrid Cloud : Gemma-4 31B + vLLM + RTX 6000 PRO : 1168 tokens/sec and still asking for more...

upvoted an article 5 days ago

Gemma-4 31B + vLLM on RTX 6000 PRO : A Real-Load Benchmark

published an article 5 days ago

Gemma-4 31B + vLLM on RTX 6000 PRO : A Real-Load Benchmark

New activity in google/gemma-4-31B-it 4 days ago

Benchmarked on HexGrid Cloud : Gemma-4 31B + vLLM + RTX 6000 PRO : 1168 tokens/sec and still asking for more...

#123 opened 4 days ago by

New activity in Qwen/Qwen3.5-9B 10 days ago

Deployed on HexGrid Cloud: 1x RTX 5090 + Qwen3.5 9B BF16 — 1280 tok/s peak, then TTFT goes from 0.7s to 18s, ShareGPT, concurrency 16–128

#58 opened 19 days ago by