Introducing Open Deep-Research by Hugging Face!
OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.
So, with a team of cracked colleagues, we set ourselves a 24-hour deadline to replicate and open-source Deep Research!
We built open-Deep-Research, an entirely open agent that can navigate the web autonomously, scroll and search through pages, download and manipulate files, and run calculations on data...
We aimed for the best performance: are the agent's answers really rigorous?
On the GAIA benchmark, Deep Research scored 67% accuracy on the validation set. open Deep Research reaches 55% (powered by o1), which makes it:
- the best pass@1 solution submitted
- the best open solution
And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!
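For anyone who wants to poke at the agent side, here is a minimal sketch of a web-searching agent built with the smolagents library that open-Deep-Research is based on; the model backend, tool choice, and question are illustrative assumptions, not the exact open-Deep-Research configuration.

```python
# Minimal agent sketch with smolagents (pip install smolagents).
# The model backend, tool, and question below are illustrative assumptions,
# not the exact open-Deep-Research setup.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel()  # defaults to a hosted open model on the HF Inference API

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # gives the agent web-search capability
    model=model,
)

# The agent plans, searches the web, and writes/runs Python code to answer.
answer = agent.run(
    "How many seconds would it take a leopard at full speed to cross Pont des Arts?"
)
print(answer)
```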
All You Need To Know About Phi-3 (Technical Report Walkthrough)
Summary of Summaries:
Phi-3-mini
- Architecture specs: decoder-only transformer, 3.8 billion parameters, LongRope variant with 128K context length, vocab size 32064, trained on 3.3 trillion tokens in bfloat16.
- Rivals the performance of larger models like Mixtral 8x7B and GPT-3.5, and is capable of running locally on a smartphone.
- Utilizes a high-quality training dataset: heavily filtered web data plus LLM-generated synthetic data.
- Can be quantized to 4 bits, occupying ≈ 1.8 GB of memory.
- Ran natively on an iPhone 14 with the A16 Bionic chip at inference speeds of up to 12 tokens per second.
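As a sanity check on that memory figure: 3.8 billion parameters at 4 bits per weight is roughly 3.8e9 × 0.5 bytes ≈ 1.9 GB, consistent with the ≈ 1.8 GB quoted above. Below is a hedged sketch of loading the mini model in 4-bit with transformers and bitsandbytes; the model id and generation settings are assumptions for illustration, not the report's setup.

```python
# Sketch: load Phi-3-mini in 4-bit on a single consumer GPU.
# Requires transformers, accelerate, and bitsandbytes; the model id and
# settings below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"  # 4K-context variant; a 128K variant also exists

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bfloat16 training dtype
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain in one sentence why small language models matter."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```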
Phi-3-small
- Architecture specs: also decoder-only, 7B parameters, vocab size 100352, default context length 8K, hidden dimension 4096, number of heads and layers following the 7B model class.
- Uses the tiktoken tokenizer for enhanced multilingual tokenization.
Phi-3-medium
- Architecture specs: also decoder-only, hidden dimension 5120, 40 heads, 40 layers, tokenizer consistent with the other models, trained on 4.8 trillion tokens.
Training Methodology:
- Focuses on high-quality training data, deviating from standard scaling laws.
- The models undergo two-phase pre-training on a mix of web sources and synthetic data to build general knowledge and logical reasoning skills.
Performance:
- Phi-3-mini achieves competitive scores on standard benchmarks like MMLU and MT-Bench, indicating strong reasoning capabilities.
- The larger variants show even better performance, suggesting effective scaling with increased model size.
Limitations:
- phi-3-mini: limited by its smaller size on tasks requiring extensive factual knowledge; primarily supports English.
- phi-3-small: limited multilingual support.
Hosting LLMs locally is a big win for OSS - private, secure inference on the go.
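For readers who want to try exactly that, here is a minimal sketch of fully offline inference with llama-cpp-python; the GGUF file name is a placeholder assumption, and any 4-bit Phi-3-mini export should work.

```python
# Sketch: fully local, offline inference with llama-cpp-python
# (pip install llama-cpp-python). The GGUF path is a placeholder; point it
# at any 4-bit Phi-3-mini export on disk.
from llama_cpp import Llama

llm = Llama(
    model_path="./phi-3-mini-4k-instruct-q4.gguf",  # placeholder path
    n_ctx=4096,   # matches the mini model's default context window
    n_threads=8,  # CPU threads; tune for your machine
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why do on-device LLMs help privacy?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```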
reacted to HaotongQin's post over 1 year ago:
We release an empirical study to showcase "How Good Are Low-bit Quantized #LLaMA3 Models" with existing LLM quantization techniques!
In this study, the performance of the low-bit LLaMA3 models (especially LLaMA3-70B) is impressively strong. However, the results also expose significant performance degradation in existing quantization techniques when dealing with LLaMA3, especially at ultra-low bit-widths.
We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...
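For anyone who wants to reproduce a small slice of such a study, here is a hedged sketch of measuring perplexity for a 4-bit quantized model with transformers and bitsandbytes; the model id, sample text, and evaluation setup are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch: perplexity of a 4-bit quantized causal LM on a short text sample.
# The model id, sample text, and setup are illustrative assumptions, not the
# study's exact evaluation protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed; gated, requires access approval
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
model.eval()

text = "Low-bit quantization trades a little accuracy for a lot of memory. " * 50
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

with torch.no_grad():
    # Passing labels=input_ids yields the average causal-LM cross-entropy loss.
    loss = model(input_ids, labels=input_ids).loss

print(f"Perplexity on the sample: {torch.exp(loss).item():.2f}")
```

Comparing this number against the same measurement for the full-precision model is a simple way to see the kind of degradation the post describes: the larger the gap, the more the quantization method is hurting.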