---
license: apache-2.0
datasets:
- fistro/gromenauer
language:
- es
pipeline_tag: text-generation
base_model:
- mistralai/Mistral-7B-v0.1
---
# Gromenauer-7B

<div align=center>
<img alt="gromenauer-7B logo" src="https://huggingface.co/bertin-project/Gromenauer-7B/resolve/main/images/gromenauer.png" width="200px">
</div>

## Overview
Gromenauer-7B is a Spanish language model for understanding and generating Spanish text. It is built on the Mistral architecture and was trained on an extensive literary corpus, so that it captures a wide range of the linguistic nuances, styles, and contexts found in Spanish literature.

## Model Details

- **Model Type**: Mistral
- **Sequence Length**: 8192
- **Hidden Dimension**: 4096
- **Intermediate Dimension**: 14336
- **Number of Layers**: 32
- **Number of Attention Heads**: 32
- **Number of Key-Value Heads**: 8
- **Activation Function**: SiLU
- **Initializer Range**: 0.02
- **Layer Norm Epsilon**: 1.0e-05
- **Use Flash Attention**: Yes
- **Gradient Checkpointing**: Enabled (Block Size: 5)
- **Sliding Window Attention**: 4096
- **Use Bias**: No

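As a rough sanity check on the values listed above, the following sketch derives the per-head dimension, the number of query heads that share each key-value head, and an approximate parameter count. The class name `GromenauerShape` is hypothetical. The vocabulary size of 32000 (that of the Mistral-7B-v0.1 tokenizer), the three-matrix gated MLP, and the untied input/output embeddings are assumptions based on the standard Mistral layout and are not stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GromenauerShape:
    """Architecture values copied from the Model Details list (hypothetical helper)."""

    hidden_dim: int = 4096
    intermediate_dim: int = 14336
    num_layers: int = 32
    num_heads: int = 32
    num_kv_heads: int = 8
    # Assumption: vocabulary size of the Mistral-7B-v0.1 tokenizer (not listed in the card).
    vocab_size: int = 32000

    @property
    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the hidden dimension.
        return self.hidden_dim // self.num_heads

    @property
    def queries_per_kv_head(self) -> int:
        # Grouped-query attention: several query heads share one key-value head.
        return self.num_heads // self.num_kv_heads

    def parameter_count(self) -> int:
        kv_dim = self.num_kv_heads * self.head_dim
        # Query and output projections are hidden x hidden; key and value are hidden x kv_dim.
        attention = 2 * self.hidden_dim * self.hidden_dim + 2 * self.hidden_dim * kv_dim
        # Assumption: gated MLP with gate, up, and down projections, no bias terms.
        mlp = 3 * self.hidden_dim * self.intermediate_dim
        # Two normalization weight vectors per layer.
        norms = 2 * self.hidden_dim
        per_layer = attention + mlp + norms
        # Assumption: input embeddings and output head are separate (untied) matrices.
        embeddings = 2 * self.vocab_size * self.hidden_dim
        final_norm = self.hidden_dim
        return self.num_layers * per_layer + embeddings + final_norm


shape = GromenauerShape()
print(shape.head_dim)                  # 128
print(shape.queries_per_kv_head)       # 4
print(f"{shape.parameter_count():,}")  # 7,241,732,096
```

Under these assumptions the total comes to roughly 7.24 billion parameters, which is consistent with the "7B" in the model name.
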
## Training Details

- **Tokenizer**: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Batch Size**: 512
- **Learning Rate**: 1e-5
- **Optimizer**: Adam with beta1=0.9, beta2=0.95, epsilon=1e-8
- **Weight Decay**: 0.1
- **Warmup Steps**: 200
- **Learning Rate Schedule**: Cosine
- **Number of Training Steps**: 7000

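The schedule implied by these settings can be sketched as follows. This is only an illustration: the card does not say how the warmup ramps or what the final learning rate is, so the linear warmup and the decay to zero are assumptions, and `lr_at` is a hypothetical helper rather than the training code. The token estimate additionally assumes that the batch size counts sequences and that every sequence is filled to the full 8192 tokens.

```python
import math

# Values taken from the Training Details and Model Details lists.
PEAK_LR = 1e-5
WARMUP_STEPS = 200
TOTAL_STEPS = 7000
BATCH_SIZE = 512
SEQUENCE_LENGTH = 8192


def lr_at(step: int, min_lr: float = 0.0) -> float:
    """Learning rate at a given step (assumed: linear warmup, cosine decay to min_lr)."""
    if step < WARMUP_STEPS:
        # Assumption: the rate ramps linearly from 0 to the peak during warmup.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to 1.0 after the last step.
    progress = min(1.0, (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS))
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


# Upper bound on tokens seen, assuming fully packed sequences.
tokens_per_step = BATCH_SIZE * SEQUENCE_LENGTH
total_tokens = tokens_per_step * TOTAL_STEPS

for step in (0, 100, 200, 3600, 7000):
    print(step, lr_at(step))
print(f"{tokens_per_step:,} tokens per step, {total_tokens:,} tokens in total")
```

With fully packed sequences, each step would cover 4,194,304 tokens, or about 29.4 billion tokens over the 7000 steps. The real figure is lower if sequences are padded.
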
## Usage

To load the model and tokenizer in your project, use the following code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("bertin-project/Gromenauer-7B")

# Load the model with its language-modeling head (needed for text generation)
model = AutoModelForCausalLM.from_pretrained("bertin-project/Gromenauer-7B")

# Example usage: the prompt means "Enter your Spanish text here."
text = "Introduce aquí tu texto en español."
inputs = tokenizer(text, return_tensors="pt")
# Forward pass; outputs.logits holds next-token scores for each input position
outputs = model(**inputs)