AI & ML interests

None defined yet.

Recent Activity

danielhanchen 
posted an update 3 days ago
You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, improving search, clustering, and recommendations on your data.

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!
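The post describes aligning embeddings to a domain-specific notion of similarity. The standard objective behind this kind of fine-tuning is an in-batch-negatives contrastive loss (InfoNCE), where each query's paired document is the positive and every other document in the batch is a negative. Below is a minimal numpy sketch of that objective — an illustration of the idea, not Unsloth's actual training code or API:

```python
import numpy as np

def log_softmax(x):
    # Numerically stable row-wise log-softmax
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def in_batch_contrastive_loss(query_emb, doc_emb, temperature=0.05):
    """InfoNCE with in-batch negatives: query i's positive is doc i;
    all other docs in the batch act as negatives."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    logits = (q @ d.T) / temperature        # (batch, batch) cosine sims
    log_p = log_softmax(logits)
    idx = np.arange(len(q))
    return -log_p[idx, idx].mean()          # NLL of the diagonal positives
```

Minimizing this pulls each query toward its paired document and pushes it away from the rest of the batch, which is what improves retrieval, clustering, and recommendations on in-domain data.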
danielhanchen 
posted an update 6 days ago
danielhanchen 
posted an update 10 days ago
You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms.

Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.

Blog: https://unsloth.ai/docs/new/grpo-long-context
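For context on what GRPO computes per batch: instead of training a value network, it samples a group of completions per prompt and normalizes each completion's reward against the group's mean and standard deviation. A small numpy sketch of that group-relative advantage (illustrative only — not Unsloth's batching algorithm):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in GRPO.

    rewards: (num_prompts, group_size) — one reward per sampled
    completion, grouped by the prompt it was sampled for.
    """
    rewards = np.asarray(rewards, dtype=float)
    mean = rewards.mean(axis=1, keepdims=True)   # per-group baseline
    std = rewards.std(axis=1, keepdims=True)     # per-group scale
    return (rewards - mean) / (std + eps)
```

Because the baseline comes from the group itself, advantages within each group sum to roughly zero and no learned critic is needed — which is part of why long-context RL becomes a memory problem about completions rather than extra models.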
danielhanchen 
posted an update 26 days ago
csabakecskemeti 
posted an update about 1 month ago
Just sharing a result of a homelab infrastructure experiment:

I've managed to set up distributed inference infrastructure at home using a DGX Spark (128GB unified GDDR6) and a Linux workstation with an RTX 6000 Pro (96GB GDDR7), connected via 100Gbps RoCEv2. The model I used (https://lnkd.in/gx6J7YuB) is about 140GB, so it could not fit on either GPU alone. Full setup and tutorial coming soon on devquasar.com.
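The core placement problem here — a ~140GB model that fits on neither a 128GB nor a 96GB device alone — can be sketched as greedy pipeline-style layer assignment under per-device memory budgets. This is a toy illustration of the idea, not the actual software stack used in the experiment:

```python
def partition_layers(layer_sizes_gb, device_budgets_gb):
    """Greedily assign consecutive layers to devices, spilling to the
    next device once the current one's memory budget is exhausted."""
    placement, dev, used = [], 0, 0.0
    for size in layer_sizes_gb:
        if used + size > device_budgets_gb[dev]:
            dev += 1                      # spill to the next device
            if dev >= len(device_budgets_gb):
                raise MemoryError("model does not fit on available devices")
            used = 0.0
        placement.append(dev)
        used += size
    return placement
```

With ten 14GB layers and budgets of [128, 96], the first nine layers land on the first device and the tenth spills to the second — the inter-device link (here, 100Gbps RoCEv2) then carries activations across the cut.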



Screen recording:
https://lnkd.in/gKM9H5GJ
danielhanchen 
posted an update about 1 month ago
danielhanchen 
posted an update about 1 month ago
danielhanchen 
posted an update about 1 month ago
danielhanchen 
posted an update about 2 months ago
csabakecskemeti 
posted an update about 2 months ago
danielhanchen 
posted an update about 2 months ago
Mistral's new Ministral 3 models can now be run & fine-tuned locally (16GB RAM)!
Ministral 3 models have vision support and best-in-class performance for their size.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF

🐱 Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUFs, BnB, FP8 etc. variants uploads: https://huggingface.co/collections/unsloth/ministral-3
csabakecskemeti 
posted an update about 2 months ago
Looking for some help testing an INT8 DeepSeek 3.2:
SGLang supports channel-wise INT8 quants on CPUs with AMX instructions (Xeon 5 and above, AFAIK):
https://lmsys.org/blog/2025-07-14-intel-xeon-optimization/

Currently uploading an INT8 version of DeepSeek 3.2 Speciale:
DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8

I cannot test this myself since I'm on AMD:
"AssertionError: W8A8Int8LinearMethod on CPU requires that CPU has AMX support"
(I assumed it could fall back to some non-optimized kernel, but apparently not.)

If anyone with the required resources (Intel Xeon 5/6 + ~768GB-1TB RAM) can help test this, that would be awesome.

If you have hints on how to make this work on an AMD Threadripper 7000 Pro series, please guide me.

Thanks all!
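For readers unfamiliar with the quantization scheme named above: channel-wise INT8 keeps one scale per output channel of a weight matrix instead of a single scale for the whole tensor, which preserves accuracy for channels with very different magnitudes. A minimal numpy sketch of symmetric channel-wise INT8 (illustrative — not SGLang's AMX kernel):

```python
import numpy as np

def quantize_int8_per_channel(w):
    """Symmetric channel-wise INT8: one scale per output channel (row),
    chosen so the largest value in each row maps to +/-127."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)   # guard all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original weights
    return q.astype(np.float32) * scale
```

The round-trip error per element is bounded by half a quantization step of its own row's scale, which is why channel-wise scales beat a single tensor-wide scale on matrices with uneven row magnitudes.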
danielhanchen 
posted an update about 2 months ago
csabakecskemeti 
posted an update 3 months ago
Recently there has been a lot of activity around token-efficient formats, so I've also built a package (inspired by TOON).

Deep-TOON

My goal was to handle JSON structures with complex embeddings in a token-efficient way.

This is what I built over the weekend. Feel free to try it:
https://pypi.org/project/deep-toon/0.1.0/
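The core trick behind TOON-style formats is easy to demonstrate: for a list of homogeneous JSON objects, emit the keys once as a header and then one row of values per object, instead of repeating every key in every object. A toy sketch of that flattening (illustrative only — not the actual deep-toon format or API):

```python
import json

def toonify(records):
    """Flatten a list of dicts sharing the same keys into a header line
    plus one comma-separated value row per record."""
    keys = list(records[0])
    header = ",".join(keys)
    rows = [",".join(str(r[k]) for k in keys) for r in records]
    return "\n".join([header] + rows)
```

Even on tiny inputs the flattened form is shorter than `json.dumps`, and the savings grow with the number of records, since key names are paid for only once.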

danielhanchen 
posted an update 3 months ago