| # Strict Dataclasses | |
| The `huggingface_hub` package provides a utility to create **strict dataclasses**. These are enhanced versions of Python's standard `dataclass` with additional validation features. Strict dataclasses ensure that fields are validated both during initialization and assignment, making them ideal for scenarios where data integrity is critical. | |
| ## Overview | |
| Strict dataclasses are created using the `@strict` decorator. They extend the functionality of regular dataclasses by: | |
| - Validating field types based on type hints | |
| - Supporting custom validators for additional checks | |
| - Optionally allowing arbitrary keyword arguments in the constructor | |
| - Validating fields both at initialization and during assignment | |
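To make "validated both at initialization and during assignment" concrete, here is a minimal standard-library sketch. It is not how `@strict` is implemented; `CheckedConfig` and `check_type` are hypothetical names used only for illustration, and the sketch only handles plain classes such as `str` or `int` as type hints.

```python
from dataclasses import dataclass, fields


def check_type(name: str, value: object, expected: type) -> None:
    """Raise TypeError if `value` is not an instance of `expected`."""
    if not isinstance(value, expected):
        raise TypeError(
            f"Field '{name}' expected {expected.__name__}, got {type(value).__name__}"
        )


@dataclass
class CheckedConfig:
    model_type: str
    hidden_size: int = 16

    def __setattr__(self, name: str, value: object) -> None:
        # The __init__ generated for a non-frozen dataclass assigns each field
        # with `self.<name> = value`, so this single hook runs at construction
        # time and on every later assignment.
        expected = {f.name: f.type for f in fields(self)}.get(name)
        if isinstance(expected, type):
            check_type(name, value, expected)
        super().__setattr__(name, value)


cfg = CheckedConfig(model_type="bert")
cfg.hidden_size = 32  # accepted: value is an int

try:
    cfg.hidden_size = "64"  # rejected: value is a str
except TypeError as err:
    print(err)
```

Because the check runs before the attribute is stored, a rejected assignment leaves the previous value in place.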
| ## Benefits | |
| - **Data Integrity**: Ensures fields always contain valid data | |
| - **Ease of Use**: Integrates seamlessly with Python's `dataclass` module | |
| - **Flexibility**: Supports custom validators for complex validation logic | |
| - **Lightweight**: Requires no additional dependencies such as Pydantic, attrs, or similar libraries | |
| ## Usage | |
| ### Basic Example | |
| ```python | |
| from dataclasses import dataclass | |
| from huggingface_hub.dataclasses import strict, as_validated_field | |
| # Custom validator to ensure a value is positive | |
| @as_validated_field | |
| def positive_int(value: int): | |
|     if not value > 0: | |
|         raise ValueError(f"Value must be positive, got {value}") | |
| @strict | |
| @dataclass | |
| class Config: | |
|     model_type: str | |
|     hidden_size: int = positive_int(default=16) | |
|     vocab_size: int = 32  # Default value | |
|     # Methods named `validate_xxx` are treated as class-wise validators | |
|     def validate_big_enough_vocab(self): | |
|         # Assumed check: the vocabulary must be at least as large as the hidden size | |
|         if self.vocab_size < self.hidden_size: | |
|             raise ValueError(f"vocab_size ({self.vocab_size}) must be at least hidden_size ({self.hidden_size})") | |
| ``` | |
| > [!WARNING] | |
| > Method `.validate()` is a reserved name on strict dataclasses. | |
| > To prevent unexpected behavior, a `StrictDataclassDefinitionError` will be raised if your class already defines one. | |
| ## API Reference | |
| ### `@strict`[[huggingface_hub.dataclasses.strict]] | |
| The `@strict` decorator enhances a dataclass with strict validation. | |
| #### huggingface_hub.dataclasses.strict[[huggingface_hub.dataclasses.strict]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/dataclasses.py#L56) | |
| Decorator to add strict validation to a dataclass. | |
| This decorator must be used on top of `@dataclass` to ensure IDEs and static typing tools | |
| recognize the class as a dataclass. | |
| Can be used with or without arguments: | |
| - `@strict` | |
| - `@strict(accept_kwargs=True)` | |
| Example: | |
| ```py | |
| >>> from dataclasses import dataclass | |
| >>> from huggingface_hub.dataclasses import as_validated_field, strict, validated_field | |
| >>> @as_validated_field | |
| ... def positive_int(value: int): | |
| ...     if not value >= 0: | |
| ...         raise ValueError(f"Value must be positive, got {value}") | |
| >>> @strict(accept_kwargs=True) | |
| ... @dataclass | |
| ... class User: | |
| ...     name: str | |
| ...     age: int = positive_int(default=10) | |
| # Initialize | |
| >>> User(name="John") | |
| User(name='John', age=10) | |
| # Extra kwargs are accepted | |
| >>> User(name="John", age=30, lastname="Doe") | |
| User(name='John', age=30, *lastname='Doe') | |
| # Invalid type => raises | |
| >>> User(name="John", age="30") | |
| huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age': | |
| TypeError: Field 'age' expected int, got str (value: '30') | |
| # Invalid value => raises | |
| >>> User(name="John", age=-1) | |
| huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age': | |
| ValueError: Value must be positive, got -1 | |
| ``` | |
| **Parameters:** | |
| cls : The class to convert to a strict dataclass. | |
| accept_kwargs (`bool`, *optional*) : If True, allows arbitrary keyword arguments in `__init__`. Defaults to False. | |
| **Returns:** | |
| The enhanced dataclass with strict validation on field assignment. | |
| ### `validate_typed_dict`[[huggingface_hub.dataclasses.validate_typed_dict]] | |
| Function to validate that a dictionary conforms to the types defined in a `TypedDict` class. | |
| This is the equivalent of dataclass validation for `TypedDict`s. Since typed dicts are never instantiated (they are only used by static type checkers), the validation step must be called manually. | |
| #### huggingface_hub.dataclasses.validate_typed_dict[[huggingface_hub.dataclasses.validate_typed_dict]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/dataclasses.py#L286) | |
| Validate that a dictionary conforms to the types defined in a TypedDict class. | |
| Under the hood, the typed dict is converted to a strict dataclass and validated using the `@strict` decorator. | |
| Example: | |
| ```py | |
| >>> from typing import Annotated, TypedDict | |
| >>> from huggingface_hub.dataclasses import validate_typed_dict | |
| >>> def positive_int(value: int): | |
| ...     if not value >= 0: | |
| ...         raise ValueError(f"Value must be positive, got {value}") | |
| >>> class User(TypedDict): | |
| ...     name: str | |
| ...     age: Annotated[int, positive_int] | |
| >>> # Valid data | |
| >>> validate_typed_dict(User, {"name": "John", "age": 30}) | |
| >>> # Invalid type for age | |
| >>> validate_typed_dict(User, {"name": "John", "age": "30"}) | |
| huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age': | |
| TypeError: Field 'age' expected int, got str (value: '30') | |
| >>> # Invalid value for age | |
| >>> validate_typed_dict(User, {"name": "John", "age": -1}) | |
| huggingface_hub.errors.StrictDataclassFieldValidationError: Validation error for field 'age': | |
| ValueError: Value must be positive, got -1 | |
| ``` | |
| **Parameters:** | |
| schema (`type[TypedDictType]`) : The TypedDict class defining the expected structure and types. | |
| data (`dict`) : The dictionary to validate. | |
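The idea of validating a plain dictionary against a `TypedDict` schema can be sketched with the standard library alone. The helper below, `check_typed_dict`, is a hypothetical stand-in and not the library function: it checks each key directly instead of converting the schema to a strict dataclass, treats every key as required, and only handles plain classes (optionally wrapped in `Annotated`) as type hints.

```python
from typing import Annotated, Any, TypedDict, get_args, get_origin, get_type_hints


def positive_int(value: int) -> None:
    if not value > 0:
        raise ValueError(f"Value must be positive, got {value}")


class User(TypedDict):
    name: str
    age: Annotated[int, positive_int]


def check_typed_dict(schema: type, data: dict[str, Any]) -> None:
    """Hypothetical helper: check `data` against the hints declared on `schema`."""
    # include_extras=True keeps the Annotated[...] metadata (the validators).
    hints = get_type_hints(schema, include_extras=True)
    for key, hint in hints.items():
        if key not in data:
            raise TypeError(f"Missing key '{key}'")
        validators: list[Any] = []
        if get_origin(hint) is Annotated:
            # Annotated[int, positive_int] -> (int, positive_int)
            hint, *validators = get_args(hint)
        value = data[key]
        if isinstance(hint, type) and not isinstance(value, hint):
            raise TypeError(
                f"Field '{key}' expected {hint.__name__}, got {type(value).__name__}"
            )
        for validator in validators:
            if callable(validator):
                validator(value)


check_typed_dict(User, {"name": "John", "age": 30})  # passes silently
```

A wrong type raises `TypeError` before any custom validator runs, and a value that fails `positive_int` raises `ValueError`.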
| ### `as_validated_field`[[huggingface_hub.dataclasses.as_validated_field]] | |
| Decorator to create a `validated_field`. Recommended for fields with a single validator to avoid boilerplate code. | |
| #### huggingface_hub.dataclasses.as_validated_field[[huggingface_hub.dataclasses.as_validated_field]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/dataclasses.py#L426) | |
| Decorates a validator function as a `validated_field` (i.e. a dataclass field with a custom validator). | |
| **Parameters:** | |
| validator (`Callable`) : A method that takes a value as input and raises ValueError/TypeError if the value is invalid. | |
| ### `validated_field`[[huggingface_hub.dataclasses.validated_field]] | |
| Creates a dataclass field with custom validation. | |
| #### huggingface_hub.dataclasses.validated_field[[huggingface_hub.dataclasses.validated_field]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/dataclasses.py#L383) | |
| Create a dataclass field with a custom validator. | |
| Useful to apply several checks to a field. If only applying one rule, check out the `as_validated_field` decorator. | |
| **Parameters:** | |
| validator (`Callable` or `list[Callable]`) : A method that takes a value as input and raises ValueError/TypeError if the value is invalid. Can be a list of validators to apply multiple checks. | |
| `**kwargs` : Additional arguments to pass to `dataclasses.field()`. | |
| **Returns:** | |
| A dataclass field with the validator attached in its metadata. | |
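Attaching a validator to a field through its metadata can be illustrated with `dataclasses.field` alone. This is only a sketch: `field_with_validator`, `run_validators`, and the `"validator"` metadata key are assumptions made for the example and may not match what the library uses internally.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable


def field_with_validator(validator: Callable[[Any], None], **kwargs: Any) -> Any:
    """Hypothetical stand-in: store `validator` in the field's metadata."""
    metadata = dict(kwargs.pop("metadata", None) or {})
    metadata["validator"] = validator  # key name is an assumption
    # Remaining kwargs (default, default_factory, repr, ...) go to dataclasses.field().
    return field(metadata=metadata, **kwargs)


def positive_int(value: int) -> None:
    if not value > 0:
        raise ValueError(f"Value must be positive, got {value}")


@dataclass
class Config:
    hidden_size: int = field_with_validator(positive_int, default=16)


def run_validators(instance: Any) -> None:
    """Look up each field's validator in its metadata and apply it."""
    for f in fields(instance):
        validator = f.metadata.get("validator")
        if validator is not None:
            validator(getattr(instance, f.name))


run_validators(Config())  # passes: the default value 16 is positive
```

Keeping the validator in metadata means the class stays a regular dataclass, and any tool that inspects `dataclasses.fields()` can find the check.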
| ### Errors[[huggingface_hub.errors.StrictDataclassError]] | |
| #### huggingface_hub.errors.StrictDataclassError[[huggingface_hub.errors.StrictDataclassError]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/errors.py#L420) | |
| Base exception for strict dataclasses. | |
| #### huggingface_hub.errors.StrictDataclassDefinitionError[[huggingface_hub.errors.StrictDataclassDefinitionError]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/errors.py#L424) | |
| Exception thrown when a strict dataclass is defined incorrectly. | |
| #### huggingface_hub.errors.StrictDataclassFieldValidationError[[huggingface_hub.errors.StrictDataclassFieldValidationError]] | |
| [Source](https://github.com/huggingface/huggingface_hub/blob/vr_4113/src/huggingface_hub/errors.py#L428) | |
| Exception thrown when a strict dataclass fails validation for a given field. | |
| ## Why Not Use `pydantic`? (or `attrs`? or `marshmallow_dataclass`?) | |
| - See discussion in https://github.com/huggingface/transformers/issues/36329 regarding adding Pydantic as a dependency. It would be a heavy addition and require careful logic to support both v1 and v2. | |
| - We don't need most of Pydantic's features, especially those related to automatic casting, jsonschema, serialization, aliases, etc. | |
| - We don't need the ability to instantiate a class from a dictionary. | |
| - We don't want to mutate data. In `@strict`, "validation" means "checking if a value is valid." In Pydantic, "validation" means "casting a value, possibly mutating it, and then checking if it's valid." | |
| - We don't need blazing-fast validation. `@strict` isn't designed for heavy loads where performance is critical. Common use cases involve validating a model configuration (performed once and negligible compared to running a model). This allows us to keep the code minimal. | |
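The difference between check-only validation and cast-then-check validation, mentioned in the list above, can be shown with two small functions. Both are hypothetical illustrations: `check_int` mirrors the "checking if a value is valid" meaning, while `cast_int` is a loose stand-in for casting-style validation and does not reproduce Pydantic's actual behavior.

```python
from typing import Any


def check_int(value: Any) -> int:
    """Check-only: reject anything that is not already an int, never modify it."""
    if not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


def cast_int(value: Any) -> int:
    """Cast-then-check: convert the input first, so the stored value may differ."""
    return int(value)


print(cast_int("30"))  # the string is silently converted to the int 30

try:
    check_int("30")  # the string is rejected as-is
except TypeError as err:
    print(err)
```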