Nikhil K. PRO

hexgridcloud
2 1
·
https://hexgrid.cloud
  • hexgrid_cloud
  • hexgrid-cloud

AI & ML interests

One-click deployment of Open-source LLMs, on managed and dedicated GPUs.

Recent Activity

new activity 4 days ago
google/gemma-4-31B-it: Benchmarked on HexGrid Cloud : Gemma-4 31B + vLLM + RTX 6000 PRO : 1168 tokens/sec and still asking for more...
upvoted an article 5 days ago
Gemma-4 31B + vLLM on RTX 6000 PRO : A Real-Load Benchmark
published an article 5 days ago
Gemma-4 31B + vLLM on RTX 6000 PRO : A Real-Load Benchmark

Organizations

Hexgrid Cloud

New activity in google/gemma-4-31B-it 4 days ago

Benchmarked on HexGrid Cloud : Gemma-4 31B + vLLM + RTX 6000 PRO : 1168 tokens/sec and still asking for more...

#123 opened 4 days ago by
hexgridcloud
New activity in Qwen/Qwen3.5-9B 10 days ago

Deployed on HexGrid Cloud: 1x RTX 5090 + Qwen3.5 9B BF16 - 1280 tok/s peak, then TTFT goes from 0.7s to 18s, ShareGPT, concurrency 16–128

👍 1
#58 opened 19 days ago by
hexgridcloud