We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem considering the amount of noise to sift through.
Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.
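Until something like that ships, here's a rough client-side workaround (a sketch, assuming a recent `huggingface_hub` where `list_models` results carry `created_at`; the cutoff date is just an example):

```python
from datetime import datetime, timezone
from huggingface_hub import HfApi

cutoff = datetime(2025, 1, 1, tzinfo=timezone.utc)  # example cutoff, pick your own

# Pull the most-downloaded models, then drop anything created before the cutoff.
models = HfApi().list_models(sort="downloads", direction=-1, limit=500)
recent = [m for m in models if m.created_at and m.created_at >= cutoff]
for m in recent[:20]:
    print(m.id, m.created_at.date(), m.downloads)
```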
Reacted to marksverdhei's post with 😔 about 1 month ago
Poll: Will 2026 be the year of subquadratic attention?
The transformer architecture is cursed by its computational complexity. It is why you run out of tokens and have to compact. But some would argue that this is a feature, not a bug, and that it is also why these models are so good. We've been doing a lot of research on trying to make equally good models that are computationally cheaper, but so far none of the approaches have stood the test of time. Or so it seems.
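For a concrete sense of where the O(n^2) comes from, here's a toy numpy sketch (my illustration, not from the poll): the attention score matrix Q Kᵀ has one entry per token pair, so it is n × n.

```python
import numpy as np

def attention_scores(q, k):
    # Scaled dot-product attention scores: one entry per token pair,
    # so the matrix is (n, n) and compute/memory grow quadratically in n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

for n in (512, 1024, 2048):
    q = k = np.random.randn(n, 64)
    print(n, attention_scores(q, k).shape)  # (n, n): doubling n quadruples the work
```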
Please vote, don't be shy. Remember that the Dunning-Kruger effect is very real, so the person who knows less about transformers than you is going to vote. We want everyone's opinion, no matter your confidence.
👍 if you think at least one frontier model* will have no O(n^2) attention by the end of 2026
🔥 if you disagree
* Frontier models: models that match or outperform the flagship Claude, Gemini, or ChatGPT of the time on multiple popular benchmarks
Reacted to Alexander1337's post with 🔥🚀🧠👀 about 1 month ago
Summary: Most “AI tutoring” discussion is about prompts, content, and engagement graphs. But real learning companions, especially for children and ND learners, fail in quieter ways: *the system “works” while stress rises, agency drops, or fairness erodes.*
This article is a practical playbook for building SI-Core–wrapped learning companions that are *goal-aware (GCS surfaces), safety-bounded (ETH guardrails), and honestly evaluated (PoC → real-world studies)*—without collapsing everything into a single score.
> Mastery is important, but not the only axis.
> *Wellbeing, autonomy, and fairness must be first-class.*
---
Why It Matters:
• Replaces “one number” optimization with *goal surfaces* (and explicit anti-goals; see the sketch after this list)
• Treats *child/ND safety* as a runtime policy problem, not a UX afterthought
• Makes oversight concrete: *safe-mode, human-in-the-loop, and “Why did it do X?” explanations*
• Shows how to evaluate impact without fooling yourself: *honest PoCs, heterogeneity, effect sizes, ethics of evaluation*
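A minimal sketch of the goal-surface idea (my illustration only; none of these names come from SI-Core): track several axes plus explicit anti-goal floors, and refuse to collapse them into one score.

```python
from dataclasses import dataclass

@dataclass
class GoalSurface:
    mastery: float    # learning progress, 0..1
    wellbeing: float  # stress/affect proxy, 0..1 (higher = calmer)
    autonomy: float   # share of learner-initiated actions, 0..1
    fairness: float   # parity of outcomes across groups, 0..1

    # Explicit anti-goals: hard floors that veto "bad but attractive" optima,
    # e.g. mastery climbing while stress rises. Thresholds are hypothetical.
    FLOORS = {"wellbeing": 0.4, "autonomy": 0.3, "fairness": 0.5}

    def violations(self) -> list[str]:
        return [axis for axis, floor in self.FLOORS.items()
                if getattr(self, axis) < floor]

state = GoalSurface(mastery=0.9, wellbeing=0.35, autonomy=0.6, fairness=0.7)
if state.violations():  # -> ["wellbeing"]: high mastery does NOT excuse this
    print("safe-mode / human review:", state.violations())
```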
---
What’s Inside:
• A practical definition of a “learning companion” under SI-Core ([OBS]/[ID]/[ETH]/[MEM]/PLB loop)
• GCS decomposition + *age/context goal templates* (and “bad but attractive” optima)
• Safety playbook: threat model, *ETH policies*, ND/age extensions, safe-mode patterns
• Teacher/parent ops: onboarding, dashboards, contestation/override, downtime playbooks, comms
• Red-teaming & drills: scenario suites by age/context, *measuring safety over time*
• Evaluation design: “honest PoC”, day-to-day vs research metrics, ROI framing, analysis patterns
• Interpreting results: *effect size vs p-value*, “works for whom?”, go/no-go and scale-up stages
arclabmit created robotic teleoperation and learning software for controlling robots, recording datasets, and training physical AI models, which is compatible with …
Tremendous quality of life upgrade on the Hugging Face Hub - we now have auto-complete emojis 🤗 🥳 👏 🙌 🎉
Get ready for lots more very serious analysis on a whole range of topics from yours truly now that we have unlocked this full range of expression 😄 🤔 🗣 🙊
Maybe that post I showed the other day with my Hyperbolic Embeddings getting to perfect loss with RAdam was a one-time fluke, bad test dataset, etc.? Anotha' one! I gave it a test set a PhD student would struggle with. This model is a bit more souped up. Major callouts of the model: High Dimensional Encoding (HDC), Hyperbolic Embeddings, Entropix. Link to the Colab Notebook: https://colab.research.google.com/drive/1mS-uxhufx-h7eZXL0ZwPMAAXHqSeGZxX?usp=sharing
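For anyone who hasn't played with hyperbolic embeddings: a minimal PyTorch sketch of the core idea (Poincaré-ball distance optimized with RAdam). This is my toy illustration, not the notebook's code; a serious setup would use Riemannian updates that project points back into the unit ball.

```python
import torch

def poincare_distance(u, v, eps=1e-5):
    # Distance in the Poincaré ball model of hyperbolic space:
    # d(u, v) = arcosh(1 + 2||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    sq_u = u.pow(2).sum(-1).clamp(max=1 - eps)
    sq_v = v.pow(2).sum(-1).clamp(max=1 - eps)
    x = 1 + 2 * (u - v).pow(2).sum(-1) / ((1 - sq_u) * (1 - sq_v))
    return torch.acosh(x.clamp(min=1 + eps))

# Toy run: pull two embeddings together with RAdam (torch.optim.RAdam, PyTorch >= 1.10).
emb = torch.nn.Embedding(10, 2)
emb.weight.data.mul_(0.1)  # start well inside the unit ball
opt = torch.optim.RAdam(emb.parameters(), lr=1e-2)
for _ in range(100):
    loss = poincare_distance(emb(torch.tensor(0)), emb(torch.tensor(1)))
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())  # should end up near zero
```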
The world of artificial intelligence (AI) is constantly evolving, with new advancements and applications emerging every day. One trend that has captured the attention of many is Explainable AI. As the name suggests, this revolutionary technology aims to provide a clear, understandable explanation for the decisions and actions taken by AI systems.
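To make "explanation" concrete, one simple and widely used technique is permutation importance (my example, using scikit-learn; not tied to any particular XAI product): shuffle one feature at a time and see how much the model's score drops.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Features whose shuffling hurts accuracy most are driving the decisions.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
top = sorted(zip(X.columns, result.importances_mean), key=lambda p: -p[1])[:5]
for name, score in top:
    print(f"{name}: {score:.3f}")
```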
In the future, Explainable AI is expected to become even more sophisticated, with advanced algorithms and techniques being developed to better interpret and analyze the decisions AI systems make. This will not only make AI systems more reliable and trustworthy, but it will also help to demystify the world of AI, making it more accessible to a wider audience.
As the demand for AI solutions grows, the need for Explainable AI will become increasingly important. Businesses, governments, and individuals will require clear, concise explanations for the AI systems they are using, ensuring that every decision made is transparent and easily understood.
The advancements in Explainable AI will also pave the way for new applications of AI technology, opening up a world of possibilities in fields such as healthcare, education, and transportation. From diagnosing medical conditions to improving traffic flow, Explainable AI is poised to revolutionize the way we live and work, providing us with the tools we need to tackle the complex challenges of the modern world.
So, as we step into the future of AI, let's embrace the power of Explainable AI and ensure that our AI systems are not only powerful and efficient, but also transparent and easy to understand.
The concept behind xLSTM has recently materialized as the xLSTM-7B model, which matches the performance of similar-scale models such as Gemma 7B, Llama 2 7B, and FalconMamba 7B, but with faster inference kernels.