---
license: mit
tags:
- text-normalization
- vietnamese
- lexical-normalization
- visonorm
- visobert
pipeline_tag: fill-mask
---

# hadung1802/visobert-normalizer-mix100

This is a Vietnamese text normalization model trained with the ViSoNorm framework on the ViSoBERT architecture.

## Model Description

This model performs lexical normalization for Vietnamese text, converting informal text to standard Vietnamese. It was trained using the ViSoNorm (Self-training with Weak Supervision) framework.
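Conceptually, lexical normalization maps each informal token to its standard written form. The toy rule-based sketch below illustrates the task with a fixed lookup table; the pairs are taken from this card's own examples, whereas the model itself *learns* these mappings (including diacritic restoration such as `chua` → `chưa`) rather than consulting a table:

```python
# Toy illustration of lexical normalization via a fixed lookup table.
# The real model predicts standard forms with a masked language model;
# this table only covers the abbreviations shown in this card's examples.
LEXICON = {
    "sv": "sinh viên",      # "student"
    "dh": "đại học",        # "university"
    "ctrai": "con trai",    # "boy"
}

def normalize(text: str) -> str:
    # Replace each whitespace-separated token found in the lexicon;
    # leave everything else (including emoticons like ":))") untouched.
    return " ".join(LEXICON.get(tok, tok) for tok in text.split())

print(normalize("sv dh :))"))  # → sinh viên đại học :))
```

A table like this cannot handle tone-mark restoration or context-dependent expansions, which is precisely why a learned model is used instead.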

## Training Configuration

- **Base Model**: ViSoBERT
- **Training Mode**: weakly_supervised
- **Learning Rate**: 0.001
- **Epochs**: 10
- **Batch Size**: 16
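For scripting training runs, the hyperparameters above could be collected into a single config object. This is only a sketch; the key names are assumptions, not the ViSoNorm framework's actual configuration schema:

```python
# Hyperparameters from this card gathered into one place.
# NOTE: these key names are illustrative, not ViSoNorm's real config keys.
TRAIN_CONFIG = {
    "base_model": "visobert",
    "training_mode": "weakly_supervised",
    "learning_rate": 1e-3,
    "epochs": 10,
    "batch_size": 16,
}

print(TRAIN_CONFIG["training_mode"])  # → weakly_supervised
```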

## Usage

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the model and tokenizer. trust_remote_code=True is required
# because normalize_text is a custom method shipped with this repo.
model_repo = "hadung1802/visobert-normalizer-mix100"
tokenizer = AutoTokenizer.from_pretrained(model_repo)
model = AutoModelForMaskedLM.from_pretrained(model_repo, trust_remote_code=True)

# Normalize text using the built-in method
text = "sv dh gia dinh chua cho di lam :))"
normalized_text, source_tokens, predicted_tokens = model.normalize_text(
    tokenizer, text, device="cpu"
)

# Output: sinh viên đại học gia đình chưa cho đi làm :))
```

## Example Outputs

| Input | Output |
|-------|--------|
| `sv dh gia dinh chua cho di lam :))` | `sinh viên đại học gia đình chưa cho đi làm :))` |
| `chúng nó bảo em là ctrai` | `chúng nó bảo em là con trai` |
| `anh ơi em muốn đi chơi` | `anh ơi em muốn đi chơi` |

## Citation

If you use this model, please cite the ViSoNorm paper:

```bibtex
@article{visonorm2024,
  title={ViSoNorm: Self-training with Weak Supervision for Vietnamese Text Normalization},
  author={Your Name},
  journal={arXiv preprint},
  year={2024}
}
```
    