Activity Feed

Recent Activity

raincandy-u 
posted an update 1 day ago
🤗 Just released Rain-100M, an experimental ~97M-parameter Qwen3-style language model trained from random initialization.

Repo: raincandy-u/Rain-100M

Data: HuggingFaceFW/fineweb-edu, ~3B tokens, English only

Tokenizer: custom 16k BPE, context length 4096

Architecture: 12 Transformer layers, hidden size 768, 12 heads, MLP 2048, SiLU, bf16


Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
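As a rough sanity check on the "~97M" figure, the architecture numbers listed above can be plugged into a back-of-the-envelope parameter count. This is only a sketch: the exact vocabulary size (16,384), tied input/output embeddings, full multi-head attention with head dimension 64, a gated SiLU MLP, per-head q/k RMSNorm, and the absence of bias terms are all assumptions that the post does not state.

```python
def count_params(vocab=16_384, hidden=768, layers=12, heads=12,
                 kv_heads=12, head_dim=64, mlp=2048, tied=True):
    """Approximate parameter count for a Qwen3-style decoder without biases.

    Assumed, not taken from the post: vocab=16_384, kv_heads=12,
    head_dim=64, tied embeddings, gated MLP, q/k RMSNorm.
    """
    q_proj = hidden * heads * head_dim
    kv_proj = 2 * hidden * kv_heads * head_dim
    o_proj = heads * head_dim * hidden
    qk_norm = 2 * head_dim              # per-head RMSNorm weights on q and k
    attn = q_proj + kv_proj + o_proj + qk_norm

    ffn = 3 * hidden * mlp              # gate, up, and down projections
    layer_norms = 2 * hidden            # two RMSNorms per layer

    per_layer = attn + ffn + layer_norms
    embed = vocab * hidden
    lm_head = 0 if tied else vocab * hidden
    final_norm = hidden
    return layers * per_layer + embed + lm_head + final_norm


total = count_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the count comes to 97,538,304, which is consistent with the stated ~97M; untying the embeddings or changing the vocabulary size would shift the total.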
lorraine2 
posted an update 5 days ago
📽️ New NVIDIA paper: Motion Attribution for Video Generation 📽️

We propose MOTIVE, a method for taking query video clips and identifying which training data will improve or degrade performance after finetuning, enabling sophisticated data curation and beyond!

🔎 Project Page: https://research.nvidia.com/labs/sil/projects/MOTIVE/
📖 Full Paper: https://arxiv.org/abs/2601.08828

Check out more work from the NVIDIA Spatial Intelligence Lab here: https://research.nvidia.com/labs/sil/

This project was led by the great work of Xindi (Cindy) Wu, along with Despoina Paschalidou, Jun Gao, Antonio Torralba, Laura Leal-Taixé, Olga Russakovsky, and Sanja Fidler.
takarajordan 
posted an update 9 days ago
takarajordan 
posted an update about 2 months ago
takarajordan 
posted an update 2 months ago
Two weeks ago I had an engaging discussion with locals in Cockermouth about AI and the broader industry, a reminder that hearing candid perspectives beyond our professional circles is invaluable and something anyone working full-time in this field should make time for.

Thank you!
takarajordan 
posted an update 3 months ago
🌞 LOVABLE IS CRACKED

Built a golden hour tracker in under 15 minutes with Lovable: uses your phone’s Geolocation API, the SunCalc library, and runs fully client-side with no servers. https://goldenhour.404missing.link
sourceoftruthdata 
posted an update 3 months ago
What a fantastic community!
gokaygokay 
posted an update 3 months ago
FlashPack: Lightning-Fast Model Loading for PyTorch

https://github.com/fal-ai/flashpack

FlashPack is a new, high-throughput file format and loading mechanism for PyTorch that makes model checkpoint I/O blazingly fast, even on systems without access to GPU Direct Storage (GDS).

With FlashPack, loading any model can be 3–6× faster than with current state-of-the-art methods such as accelerate or the standard load_state_dict() and to() flow, all wrapped in a lightweight, pure-Python package that works anywhere.
s3nh 
posted an update 3 months ago
EduHelp with more empathy, based on a model finetuned on psychotherapeutic preferences, has just landed.

Beck-8B as the base model, 13,000 steps on an educational dataset.
Time to go further and build more 🥰
s3nh/EduHelp_Beck_8B
Thanks to @basilic_ai for computations <3
s3nh 
posted an update 3 months ago
Just tried to create an educational assistant for younger people who can struggle with visualisation of 'what is this sorcery all about'.
It's the first step of my spare-time projects: SFT on Qwen3-8B.

EduHelper is a child-friendly tutoring assistant fine-tuned from the Qwen3-8B base model using parameter-efficient fine-tuning (PEFT) with LoRA on the ajibawa-2023/Education-Young-Children dataset.
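For readers unfamiliar with LoRA, the idea behind the parameter-efficient fine-tuning mentioned above can be sketched in a few lines of plain Python: the pretrained weight W stays frozen, and only two small matrices A and B are trained, so the layer computes W x + (alpha / r) · B (A x). This is an illustration of the general technique, not the training code used for EduHelper; the helper names, the 4096×4096 layer shape, and the rank of 16 are hypothetical.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """y = W x + (alpha / r) * B (A x); W is frozen, only A and B are trained."""
    r = len(A)                            # LoRA rank = number of rows of A
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


def lora_param_counts(d_in, d_out, r):
    """Return (frozen weights in W, trainable weights in A and B)."""
    return d_in * d_out, r * (d_in + d_out)


# Toy 2x2 layer with a rank-1 adapter. B starts at zero, the usual LoRA
# initialisation, so the adapted layer initially matches the frozen one.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]
B = [[0.0], [0.0]]
x = [2.0, 3.0]
y0 = lora_forward(x, W, A, B, alpha=1.0)

# Hypothetical layer shape and rank, chosen only to show the scale of savings.
full, adapter = lora_param_counts(4096, 4096, 16)
print(y0, f"adapter is {100 * adapter / full:.2f}% of the frozen matrix")
```

With these illustrative numbers the adapter holds 131,072 trainable weights against 16,777,216 frozen ones, under 1% of the matrix, which is why LoRA makes fine-tuning an 8B-parameter model feasible on modest hardware.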

s3nh/EduHelp-8B

Glad to share my work, have a wonderful day!