Goldener feature: Semantics-aware sampling for better models
Goldener provides smart data sampling out of the box by combining two GoldDoers (classes that orchestrate data actions):
1️⃣ GoldDescriptor: Exposes data semantics through embeddings computed by foundation models.
2️⃣ GoldSelector: Selects samples automatically by digging into those semantics with coreset algorithms.
Both the foundation model and the coreset algorithm are fully customizable, so you can reach your selection goals in a few lines of Python code.
The result? Goldener can replace the usual random selection and help release better models, faster!
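To make the idea concrete, here is a minimal sketch of one classic coreset strategy, greedy k-center (farthest-point) selection, run on precomputed embeddings. It uses only NumPy and is not Goldener's API: the function name `k_center_greedy` and the random stand-in embeddings are assumptions for illustration, and Goldener's own selectors may work differently.

```python
import numpy as np


def k_center_greedy(embeddings: np.ndarray, k: int, seed: int = 0) -> list[int]:
    """Pick k row indices that spread out over the embedding space.

    Hypothetical helper for illustration only (not part of Goldener).
    Each step selects the point farthest from everything selected so far.
    """
    n = embeddings.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of samples")

    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))  # arbitrary starting point
    selected = [first]

    # Distance from every point to its nearest selected point.
    min_dist = np.linalg.norm(embeddings - embeddings[first], axis=1)
    min_dist[first] = -np.inf  # never pick the same index twice

    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))  # farthest point from the current set
        selected.append(nxt)
        dist_to_new = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
        min_dist[nxt] = -np.inf

    return selected


# Stand-in for embeddings from a foundation model (assumption: 1000 x 64 floats).
embeddings = np.random.default_rng(42).standard_normal((1000, 64))
subset = k_center_greedy(embeddings, k=50)
```

Unlike random selection, which tends to oversample dense regions, this kind of selection favors samples that are far apart in embedding space, so rare cases are more likely to make it into the subset.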
More details: https://huggingface.co/blog/Yann-CV/goldener-smart-sampling
Give it a try: pip install goldener