AI & ML interests

Deploy open-source LLMs like Llama, Qwen, Gemma, Mistral, and DeepSeek as production-ready OpenAI-compatible APIs.

Recent Activity

hexgrid-cloud 's collections 4

Production-Ready Quantized Chat LLMs — 4-bit & 8-bit
FP8, AWQ-4Bit and W8A8 quantized versions of popular models. Lower VRAM, same production quality. Deploy at hexgrid.cloud in one click.
Open Source RAG Stack — Embed + Rerank + Generate
The complete open-source RAG pipeline. Best of the embedding models, one reranker, one chat model. All deployable on dedicated GPUs at hexgrid.cloud
One-click LLM deployments on Private GPU
Every model deployable on HexGrid Cloud with one click. Dedicated GPU, private API endpoint, OpenAI-compatible. Visit https://hexgrid.cloud