Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

Shrijanagain 
posted an update 2 days ago
view post
Post
5132
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: Shrijan Kumar Tiwari
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion

Wanna collaborate us Friends let's Start Journey we have Collected 146 trillon tokens and done pre training but we need to made more powerfull

Whitepaper - https://github.com/SHRIJANAGAIN/PROFF
  • 48 replies
·
DedeProGames 
posted an update 3 days ago
view post
Post
3694
Can small models program?

Although even if they are reasoning AIs, small AIs cannot create extensive and high-quality code, at least that's what is commonly thought.

We present OrionLLM/NanoCoder-0.6b, an AI with just 600 million parameters based on qwen3-0.6b and trained with the dataset nvidia/OpenCodeReasoning.

While not good at complex code, we observed a significant improvement in code generation (especially in Python code), demonstrating that, when trained correctly, small AIs can, in fact, program.
  • 2 replies
·
cahlen 
posted an update about 20 hours ago
view post
Post
407
It’s wild to me how you can just make shit now.

You can take a weekend with a raspberry pi 5, a pi camera, a 3d printer, and a smidgen of custom fine tuning (wakeword, whisper, tinybert, and pipertts) and you have physical device as a talking personal assistant.

What a time to be alive.

Edge ai, physical ai, ai augmented animatronics… tiny models. Tiny agents.

Going to be a wild year.
fffiloni 
posted an update 3 days ago
view post
Post
3645
I brought DALL·E mini back to life 🤖🎨

You can try it here:
fffiloni/dalle-mini-reboot

And I also built a batch version using Hugging Face Jobs (up to 50 images per prompt):
fffiloni/dalle-mini-via-jobs

The goal was to stay close to the original JAX/Flax pipeline, while integrating it with modern tooling (Gradio + Jobs).

It ended up being a fun way to revisit this model — still weird, still fun 😄
  • 2 replies
·
Keeby-smilyai 
posted an update 4 days ago
view post
Post
3078
Hello everyone!
  • 1 reply
·
AINovice2005 
posted an update 1 day ago
view post
Post
1992
In celebration of the new storage graph feature on the Hub, here's mine 😊 :


Post inspired by @ZennyKenny
salma-remyx 
posted an update about 11 hours ago
view post
Post
80
Looking to execute on your next great idea? 💡

Search for relevant papers and find pre-built Docker images to interactively explore the code with Remyx!

Check out the new space 🔍
remyxai/remyx-explorer
kanaria007 
posted an update 1 day ago
view post
Post
91
✅ Article highlight: *Long-Horizon Planning under SI-Core* (art-60-046, v0.1)

TL;DR:
Most discussions stop at the next Jump, the next rollout wave, or the next experiment. This article asks a harder question: how do you bind *30-second decisions* and *30-year plans* into the same structural story?

The answer here is *Plan Jumps*: long-horizon artifacts for infrastructure programs, policy trajectories, and institutional reforms, evaluated over scenario bundles, monitored with explicit replan triggers, and kept auditable through the same SIR / EVAL / SCover / SCI / CAS logic used at shorter horizons.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• turns plans themselves into first-class, traceable objects instead of PDF promises
• connects operational Jumps, tactical adjustments, and decade-scale plans in one runtime story
• treats uncertainty, scenario comparison, and replanning as built-in structure, not afterthoughts
• keeps politics and governance explicit instead of pretending models should “choose the future”

What’s inside:
• *Plan Jumps* for 5–30 year horizons
• *scenario bundles* and long-horizon world models
• *Plan-GCS*, SCover / SCI / CAS over decades
• *policy-level Genius Replay* for reusable historical plan structure
• *PoLB + EVAL* for shadow / pilot / staged rollout of sub-policies
• *policy-to-goal contracts*, budget envelopes, and governance review cycles
• *uncertainty propagation*, confidence bands, and robust plan selection
• *replan triggers* for scheduled, threshold, event-driven, and learning-based revision
• *intergenerational equity* and future citizens as explicit principals

Key idea:
SI-Core should not only explain what happened this minute. It should also help humans steer what happens over the next 10–30 years — with plans that are structured, replayable, revisable, and politically inspectable.
ZennyKenny 
posted an update 3 days ago
view post
Post
3074
🤔 So we're supposed to post our repo storage graphs now right?
prithivMLmods 
posted an update about 8 hours ago
view post
Post
98
Map-Anything v1 (Universal Feed-Forward Metric 3D Reconstruction) demo is now available on Hugging Face Spaces. Built with Gradio and integrated with Rerun, it performs multi-image and video-based 3D reconstruction, depth, normal map, and interactive measurements.

🤗 Demo: prithivMLmods/Map-Anything-v1
🤗 Model: facebook/map-anything-v1
🤗 Hf-Papers: MapAnything: Universal Feed-Forward Metric 3D Reconstruction (2509.13414)