@alfredo-ottomate I seriously thought I was missing a huge breakthrough when reading that lol. I mean, even the 4 active expert layers won't fit in the claimed 1.5GB of RAM, and even if we go further and assume there is disk offloading and the Pi has a high-end Gen3 NVMe SSD, I would expect under 1 tok/s.
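To show the kind of back-of-envelope math I mean, here is a rough sketch. Every number in it is a placeholder I made up for illustration (the active weight size and the disk read speed are not from the article), and the model is deliberately crude: it assumes whatever part of the active weights doesn't fit in RAM has to be re-read from disk for every token, and it ignores compute time, caching, and the KV cache.

```python
def fits_in_ram(active_weight_gb: float, ram_gb: float) -> bool:
    """True if the weights touched per token fit entirely in RAM."""
    return active_weight_gb <= ram_gb


def offload_tokens_per_sec(active_weight_gb: float,
                           ram_gb: float,
                           disk_read_gb_per_s: float) -> float:
    """Crude upper bound on tok/s when the overflow is streamed from disk.

    Assumes the part of the active weights that does not fit in RAM
    is re-read from disk once per generated token.
    """
    streamed_gb = max(0.0, active_weight_gb - ram_gb)
    if streamed_gb == 0.0:
        return float("inf")  # not disk-bound under this model
    return disk_read_gb_per_s / streamed_gb


# Hypothetical inputs, NOT measured values:
ACTIVE_WEIGHT_GB = 3.0    # assumed size of weights used per token
RAM_GB = 1.5              # the RAM figure claimed in the article
DISK_READ_GB_PER_S = 0.9  # assumed sustained read speed of the SSD

if __name__ == "__main__":
    print("fits in RAM:", fits_in_ram(ACTIVE_WEIGHT_GB, RAM_GB))
    rate = offload_tokens_per_sec(ACTIVE_WEIGHT_GB, RAM_GB, DISK_READ_GB_PER_S)
    print(f"disk-bound upper limit: {rate:.2f} tok/s")
```

With these placeholder inputs the weights don't fit and the disk-bound limit lands below 1 tok/s. Plug in the real active parameter size and quantization to get an actual estimate.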
@SeaWolf-AI you have a nicely structured approach to benchmarking that includes different kinds of variables and metrics, but honestly a lot of the info in this is flawed. Also, Qwen3.5 models underperforming the Qwen3 ones is quite unexpected. Are you sure you used the recommended generation parameters for each model? Slight variations can lead to totally different outputs, especially on the metrics you are looking at.
yehya PRO
ykarout
commented on an article about 13 hours ago
🏟️ Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do