ActiveEnergy

islameissa

AI & ML interests

None yet

Recent Activity

reacted to constannnt's post with 🔥 about 3 hours ago

We are excited to announce Sipp.sh: a high-performance library for running AI inference locally and in the cloud through a unified API.

We began to realize that an LLM isn't just a chat interface for information retrieval. It can be integrated directly into web, game, or productivity apps to handle continuous monitoring and decision-making. It can act as a sort of "second brain," the silent hand that guides and helps a user without them even realizing it. We see this as the next frontier of UX design, but it is only possible if developers have access to low-cost, zero-latency compute and absolute data privacy.

That's why we created Sipp. It's an opinionated library that lets developers integrate local AI into any application, giving them the superpowers to completely rethink user experiences across the web, games, and desktop.

To achieve this, we built an entirely new stack in Rust and C++, working alongside the llama.cpp project. Through our work, we were able to contribute back to that community to help upgrade the GGML WebGPU backend. This deep optimization is what enables our fast, responsive decode speeds directly in the browser. Sipp ships as a zero-dependency library for desktop and web, achieving a 3x to 5x speedup in token decode compared to popular alternatives.

We are already seeing some incredible use cases emerge, from continuous monitoring using local vision to the dynamic generation of game elements in a real-time wizard vs. wizard game.

The best part? It's fully open-source! We see this as the start of a dialogue about what the future of user interaction is going to look like, and we built Sipp to lay the foundation for that exciting future.

Check out the live demos on our site, run your own benchmarks, or come hang out with us in our Discord.

Website: https://www.sipp.sh/
GitHub: https://github.com/noumena-labs/Sipp

reacted to constannnt's post with 🚀 about 3 hours ago

reacted to constannnt's post with ❤️ about 3 hours ago

Organizations

None yet

New activity in unsloth/Qwen3.6-35B-A3B-GGUF 2 months ago

Qwen3.6 GGUF Benchmarks

#10 opened 2 months ago by

2-bit Qwen3.6 Tool-calling is amazing!!

#2 opened 2 months ago by

New activity in byteshape/Qwen3.5-35B-A3B-GGUF 3 months ago

Nice Quants

#1 opened 3 months ago by

New activity in unsloth/Qwen3.5-35B-A3B-GGUF 4 months ago

Verbose looping

#28 opened 4 months ago by

New activity in akashmjn/tinydiarize-whisper.cpp 4 months ago

Large or medium versions

#1 opened 4 months ago by

New activity in byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF 5 months ago

Speed on RTX 5090

#7 opened 5 months ago by

New activity in nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 6 months ago

Unexpected... "Performance"?

#15 opened 6 months ago by

New activity in shanjiaz/gpt-oss-120b-nvfp4-modelopt 6 months ago

Quantization command

#1 opened 9 months ago by

New activity in ZJU-AI4H/Hulu-Med-14B 8 months ago

GGUF

#2 opened 8 months ago by
