Introducing Open Deep-Research by Hugging Face!
OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.
So, with a team of cracked colleagues, we set ourselves a 24-hour deadline to replicate and open-source Deep Research!
We built open-Deep-Research, an entirely open agent that can navigate the web autonomously, scroll and search through pages, download and manipulate files, and run calculations on data...
We aimed for the best performance: are the agent's answers really rigorous?
On the GAIA benchmark, Deep Research scored 67% accuracy on the validation set. open Deep Research reaches 55% (powered by o1), which makes it:
- the best pass@1 solution submitted
- the best open solution
And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!
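For anyone who wants to poke at the agent side, here is a minimal sketch of a web-searching agent built with the smolagents library that open-Deep-Research is based on; the model backend, tool choice, and question are illustrative assumptions, not the exact open-Deep-Research configuration.

```python
# Minimal agent sketch with smolagents (pip install smolagents).
# The model backend, tool, and question below are illustrative assumptions,
# not the exact open-Deep-Research setup.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel()  # defaults to a hosted open model on the HF Inference API

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # gives the agent web-search capability
    model=model,
)

# The agent plans, searches the web, and writes/runs Python code to answer.
answer = agent.run(
    "How many seconds would it take a leopard at full speed to cross Pont des Arts?"
)
print(answer)
```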
All You Need To Know About Phi-3 (Technical Report Walkthrough)
Summary of Summaries:
Phi-3-mini
- Architecture specs: decoder-only transformer, 3.8 billion parameters, LongRope variant with 128K context length, vocab size 32064, trained on 3.3 trillion tokens in bfloat16.
- Rivals the performance of larger models like Mixtral 8x7B and GPT-3.5, and is capable of running locally on a smartphone.
- Utilizes a high-quality training dataset: heavily filtered web data plus LLM-generated synthetic data.
- Can be quantized to 4 bits, occupying ≈ 1.8 GB of memory.
- Ran natively on an iPhone 14 with the A16 Bionic chip at inference speeds of up to 12 tokens per second.
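As a sanity check on that memory figure: 3.8 billion parameters at 4 bits per weight is roughly 3.8e9 × 0.5 bytes ≈ 1.9 GB, consistent with the ≈ 1.8 GB quoted above. Below is a hedged sketch of loading the mini model in 4-bit with transformers and bitsandbytes; the model id and generation settings are assumptions for illustration, not the report's setup.

```python
# Sketch: load Phi-3-mini in 4-bit on a single consumer GPU.
# Requires transformers, accelerate, and bitsandbytes; the model id and
# settings below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"  # 4K-context variant; a 128K variant also exists

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bfloat16 training dtype
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain in one sentence why small language models matter."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```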
Phi-3-small
- Architecture specs: also decoder-only, 7B parameters, vocab size 100352, default context length 8K, hidden dimension 4096, number of heads and layers following the 7B model class.
- Uses the tiktoken tokenizer for enhanced multilingual tokenization.
Phi-3-medium
- Architecture specs: also decoder-only, hidden dimension 5120, 40 heads, 40 layers, tokenizer consistent with the other models, trained on 4.8 trillion tokens.
Training Methodology:
- Focuses on high-quality training data, deviating from standard scaling laws.
- The models undergo two-phase pre-training on a mix of web sources and synthetic data to build general knowledge and logical reasoning skills.
Performance:
- Phi-3-mini achieves competitive scores on standard benchmarks like MMLU and MT-Bench, indicating strong reasoning capabilities.
- The larger variants show even better performance, suggesting effective scaling with increased model size.
Limitations:
- phi-3-mini: limited by its smaller size on tasks requiring extensive factual knowledge; primarily supports English.
- phi-3-small: limited multilingual support.
Hosting LLMs locally is a big win for OSS - private, secure inference on the go.
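For readers who want to try exactly that, here is a minimal sketch of fully offline inference with llama-cpp-python; the GGUF file name is a placeholder assumption, and any 4-bit Phi-3-mini export should work.

```python
# Sketch: fully local, offline inference with llama-cpp-python
# (pip install llama-cpp-python). The GGUF path is a placeholder; point it
# at any 4-bit Phi-3-mini export on disk.
from llama_cpp import Llama

llm = Llama(
    model_path="./phi-3-mini-4k-instruct-q4.gguf",  # placeholder path
    n_ctx=4096,   # matches the mini model's default context window
    n_threads=8,  # CPU threads; tune for your machine
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why do on-device LLMs help privacy?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```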
reacted to HaotongQin's post over 1 year ago:
We release an empirical study to showcase "How Good Are Low-bit Quantized #LLaMA3 Models" with existing LLM quantization techniques!
In this study, the performance of the low-bit LLaMA3 models (especially LLaMA3-70B) is impressively strong. However, the results also expose significant performance degradation in existing quantization techniques when dealing with LLaMA3, especially at ultra-low bit-widths.
We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...
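For anyone who wants to reproduce a small slice of such a study, here is a hedged sketch of measuring perplexity for a 4-bit quantized model with transformers and bitsandbytes; the model id, sample text, and evaluation setup are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch: perplexity of a 4-bit quantized causal LM on a short text sample.
# The model id, sample text, and setup are illustrative assumptions, not the
# study's exact evaluation protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed; gated, requires access approval
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
model.eval()

text = "Low-bit quantization trades a little accuracy for a lot of memory. " * 50
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

with torch.no_grad():
    # Passing labels=input_ids yields the average causal-LM cross-entropy loss.
    loss = model(input_ids, labels=input_ids).loss

print(f"Perplexity on the sample: {torch.exp(loss).item():.2f}")
```

Comparing this number against the same measurement for the full-precision model is a simple way to see the kind of degradation the post describes: the larger the gap, the more the quantization method is hurting.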