MT5-es-to-quz

This model was trained on synthetic data over several iterations of state-of-the-art Spanish-to-Quechua (Z) datasets.

Model Details

Model Description

  • Developed by:
    • Julio Santisteban Pablo
    • Ricardo Lazo Vasquez
  • Shared by: Julio Santisteban Pablo
  • Model type: Translation
  • Language(s) (NLP): Spanish to Quechua (Z)
  • License: Apache 2.0
  • Fine-tuned from model: MT5

Model Sources

  • Repository: Julio/mt5-es-to-quy

    Uses

    The overall goal is to improve translation from Spanish to Quechua (Z) and its related applications.

    Direct Use

    Improve translations from Spanish to Quechua (Z).

    Downstream Use

    Build high-impact platforms that help Quechua speakers.

    Out-of-Scope Use

    • Generating malicious content.
    • Serving as a 100% trustworthy platform.

    Bias, Risks, and Limitations

    The main limitation of this model is that it is a work in progress, so it is not recommended for critical scenarios.

    Recommendations

    It is important to evaluate the model on your own data before relying on it. If the end user considers additional training necessary, it must be carried out under the same conditions on the expected production data.

    How to Get Started with the Model

    You can load the model in the usual way, using MT5 as the base architecture.
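
    A minimal loading sketch with the Hugging Face `transformers` library follows. It is an illustration under assumptions, not the authors' documented procedure: the model id is copied from the Repository field (the card title uses "quz", so verify the id before use), the helper names `load_model` and `translate` are hypothetical, and the absence of a task prefix, beam search with 4 beams, and the 128-token limit are guesses rather than settings confirmed by this card.

```python
from typing import List

# Assumption: id copied from the "Repository" field of this card.
# The card title uses "quz" instead of "quy"; check which id actually exists.
MODEL_ID = "Julio/mt5-es-to-quy"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the MT5 model (needs transformers, sentencepiece, torch)."""
    # Imported lazily so that this module can be imported without transformers installed.
    from transformers import AutoTokenizer, MT5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = MT5ForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(sentences: List[str], tokenizer, model, max_new_tokens: int = 128) -> List[str]:
    """Translate a batch of Spanish sentences; generation settings are assumptions."""
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

    Typical usage would be `tokenizer, model = load_model()` followed by `translate(["Buenos días."], tokenizer, model)`. If the model was trained with a task prefix, it would need to be prepended to each input sentence.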

    Training Details

    Training Data

    The model was trained using synthetic data from different sources. The overall dataset will be published in the future and these notes will be updated accordingly.

    Training Procedure

    The model was trained on the synthetic dataset and tested on a competition dataset. (In the future, we will give more details).

    Preprocessing

    The data was preprocessed several times, once for each attempt at delivering a final model. (In the future, we will give more details).

    Training Hyperparameters

    (In the future, we will give more details).

    Speeds, Sizes, Times

    (In the future, we will give more details).

    Evaluation

    Testing Data, Factors & Metrics

    Testing Data

    (In the future, we will give more details).

    Factors

    The main end users are speakers of the Quechua (Z) dialect.

    Metrics

    We evaluate our model using the BLEU and chrF metrics.
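
    To make the chrF metric concrete, here is a simplified, sentence-level character n-gram F-score in pure Python. It is a sketch for intuition only, not the evaluation script used for this model: the function names are hypothetical, the defaults (n-grams up to length 6, beta = 2, whitespace ignored) follow common chrF settings, and the scores may differ from those of a maintained implementation such as sacreBLEU, which is the usual choice for reporting BLEU and chrF.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n after removing all whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF on a 0-100 scale: F-beta over averaged n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        hyp_total, ref_total = sum(hyp.values()), sum(ref.values())
        if hyp_total == 0 or ref_total == 0:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / hyp_total)
        recalls.append(overlap / ref_total)
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100.0 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

    With beta = 2, recall is weighted more heavily than precision, so a hypothesis that omits part of the reference is penalized more than one that adds extra characters.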

    Results

    Summary

    This work was carried out by several researchers from universities in Peru.

    (In the future, we will give more details).

    Model Examination

    (In the future, we will give more details).

    Environmental Impact

    Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
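
    Calculators of this kind generally multiply the energy consumed during training by the carbon intensity of the electricity grid in use. The sketch below shows that arithmetic. The function name is hypothetical and every input value in the example call is a placeholder, because this card does not report the hardware, training time, or region.

```python
def estimate_emissions_kg(power_watts: float, hours: float,
                          carbon_intensity_kg_per_kwh: float) -> float:
    """Estimate emissions in kg CO2eq as energy (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * carbon_intensity_kg_per_kwh


# Placeholder inputs, NOT values from this model's training run:
# a 250 W accelerator running for 10 hours on a 0.4 kg CO2eq/kWh grid.
example_kg = estimate_emissions_kg(250.0, 10.0, 0.4)
```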

    Technical Specifications

    Model Architecture and Objective

    (In the future, we will give more details).

    Compute Infrastructure

    (In the future, we will give more details).

    Hardware

    (In the future, we will give more details).

    Software

    The model was trained using PyTorch.

    Citation

    (In the future, we will give more details).

    More Information

    For more information, feel free to email the main researcher, Julio Santisteban, at jsantisteban@ucsp.edu.pe.

    Model Card Authors

    Ricardo Lazo Vasquez

    Model Card Contact

    Please email us at:

    • jsantisteban@ucsp.edu.pe
    • CC: ricardo.lazo@ucsp.edu.pe
