Releasing a new series of 8 zeroshot classifiers: better performance, fully commercially usable thanks to synthetic data, up to 8192 tokens, runs on any hardware.
Summary:
- The zeroshot-v2.0-c series replaces commercially restrictive training data with synthetic data generated with mistralai/Mixtral-8x7B-Instruct-v0.1 (Apache 2.0). All models are released under the MIT license.
- The best model performs 17 percentage points better across 28 tasks vs. facebook/bart-large-mnli (the most downloaded commercially-friendly baseline).
- The series includes a multilingual variant fine-tuned from BAAI/bge-m3 for zeroshot classification in 100+ languages, with a context window of 8192 tokens.
- The models are small (0.2-0.6B parameters), so they run on any hardware. The base-size models are more than 2x faster than bart-large-mnli while performing significantly better.
- The models are not generative LLMs; they are efficient encoder-only models specialized in zeroshot classification through the universal NLI task.
- For users where commercially restrictive training data is not an issue, I've also trained variants with even more human data for improved performance.
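Because these are encoder-only NLI models, they drop into the standard Hugging Face zero-shot-classification pipeline. A minimal sketch, using the facebook/bart-large-mnli baseline mentioned above as the model id (an assumption for illustration; substitute the checkpoint name of any zeroshot-v2.0 model from the Hub):

```python
from transformers import pipeline

# Zero-shot classification via NLI: the pipeline pairs the input text with
# each candidate label as a hypothesis ("This example is about {label}.")
# and ranks labels by the model's entailment score.
# bart-large-mnli is used here as a stand-in; swap in a zeroshot-v2.0
# checkpoint id once you've picked one from the Hub.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

text = "The new graphics card delivers excellent performance for gaming."
labels = ["technology", "politics", "sports"]

result = classifier(text, candidate_labels=labels)
# result["labels"] is sorted by score, highest first
print(result["labels"][0], result["scores"][0])
```

With the default single-label setting, the scores are softmax-normalized across the candidate labels; pass `multi_label=True` to score each label independently instead.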
Next steps:
- I'll publish a blog post with more details soon.
- There are several improvements I'm planning for v2.1; the multilingual model in particular has room for improvement.