How to use from SGLang

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "readerbench/RoGPT2-base" \
    --host 0.0.0.0 \
    --port 30000
```

Call the server using curl (OpenAI-compatible API):

```shell
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "readerbench/RoGPT2-base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
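The same OpenAI-compatible endpoint can also be called from Python. A minimal sketch using only the standard library (it assumes the SGLang server above is running on localhost:30000; the `build_payload` helper is ours, not part of SGLang):

```python
import json
import urllib.request


def build_payload(prompt, max_tokens=512, temperature=0.5):
    """Build the JSON body for the /v1/completions endpoint."""
    return {
        "model": "readerbench/RoGPT2-base",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, url="http://localhost:30000/v1/completions"):
    """POST a completion request and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` returns the continuation produced by the server.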
Model card for RoGPT2-base
Language: Romanian (ro)
RoGPT2: Romanian GPT2 for text generation
All three model sizes (base, medium, large) are available on the HuggingFace Hub.
For code and evaluation, check out the GitHub repository.
How to use
```python
# TensorFlow
from transformers import AutoTokenizer, TFAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('readerbench/RoGPT2-base')
model = TFAutoModelForCausalLM.from_pretrained('readerbench/RoGPT2-base')

inputs = tokenizer.encode("Este o zi de vara", return_tensors='tf')
text = model.generate(inputs, max_length=1024, no_repeat_ngram_size=2)
print(tokenizer.decode(text[0]))
```
```python
# PyTorch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('readerbench/RoGPT2-base')
model = AutoModelForCausalLM.from_pretrained('readerbench/RoGPT2-base')

inputs = tokenizer.encode("Este o zi de vara", return_tensors='pt')
text = model.generate(inputs, max_length=1024, no_repeat_ngram_size=2)
print(tokenizer.decode(text[0]))
```
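The `no_repeat_ngram_size=2` argument bans any token that would recreate a bigram already present in the generated text. A minimal sketch of that filter in plain Python (illustrative only; this is not transformers' actual implementation):

```python
def banned_tokens(generated, n=2):
    """Tokens that, if emitted next, would repeat an n-gram
    already present in `generated`."""
    if len(generated) < n - 1:
        return set()
    # The last n-1 tokens form the prefix of the candidate n-gram.
    prefix = tuple(generated[-(n - 1):])
    banned = set()
    # Scan every existing n-gram; if its first n-1 tokens match the
    # current prefix, its final token must not be emitted again.
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned
```

For example, after `["o", "zi", "de", "vara", "o", "zi"]` the token `"de"` is banned, because emitting it would repeat the bigram `("zi", "de")`.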
Training
Corpus Statistics
| Corpus | Total size | Number of words | Number of sentences |
|---|---|---|---|
| OSCAR | 11.54 GB | 1745M | 48.46M |
| Wiki-Ro | 0.46 GB | 68M | 1.79M |
| Debates | 0.5 GB | 73M | 3.61M |
| Books | 4.37 GB | 667M | 37.39M |
| News | 0.15 GB | 23M | 0.77M |
Training Statistics
| Version | Number of parameters | Number of epochs | Duration of one epoch | Context size | Batch size | PPL |
|---|---|---|---|---|---|---|
| Base | 124M | 15 | 7h | 1024 | 72 | 22.96 |
| Medium | 354M | 10 | 22h | 1024 | 24 | 17.64 |
| Large | 774M | 5 | 45h | 512 | 16 | 16.77 |
Evaluation
1. MOROCO
| Model | Dialect | Md to Ro | Ro to Md |
|---|---|---|---|
| KRR + SK | 94.06 | 67.59 | 75.47 |
| BERT-base-ro | 95.98 | 69.90 | 78.08 |
| RoBERT-small | 95.76 | 69.05 | 80.15 |
| RoBERT-base | 97.24 | 68.80 | 82.37 |
| RoBERT-large | 97.21 | 69.50 | 83.26 |
| RoGPT2-base | 96.69 | 69.82 | 77.55 |
| RoGPT2-medium | 96.42 | 69.77 | 80.51 |
| RoGPT2-large | 96.93 | 71.07 | 82.56 |
2. LaRoSeDa
| Model | Binary: Accuracy | Binary: F1-Score | Multi-Class: Accuracy | Multi-Class: F1-Score |
|---|---|---|---|---|
| BERT-base-ro | 98.07 | 97.94 | - | 79.61 |
| RoDiBERT | 98.40 | 98.31 | - | 83.01 |
| RoBERT-small | 97.44 | 97.43 | 89.30 | 84.23 |
| RoBERT-base | 98.27 | 98.26 | 90.59 | 86.27 |
| RoBERT-large | 98.20 | 98.19 | 90.93 | 86.63 |
| RoGPT2-base | 97.89 | 97.88 | 89.65 | 84.68 |
| RoGPT2-medium | 98.03 | 98.04 | 90.29 | 85.37 |
| RoGPT2-large | 98.06 | 98.07 | 90.26 | 84.89 |
3. RoSTS
| Model | Spearman dev-set | Spearman test-set | Pearson dev-set | Pearson test-set |
|---|---|---|---|---|
| BERT-base-ro | 84.26 | 80.86 | 84.59 | 81.59 |
| RoDiBERT | 77.07 | 71.47 | 77.13 | 72.25 |
| RoBERT-small | 82.06 | 78.06 | 81.66 | 78.49 |
| RoBERT-base | 84.93 | 80.39 | 85.03 | 80.39 |
| RoBERT-large | 86.25 | 83.15 | 86.58 | 83.76 |
| RoGPT2-base | 83.51 | 79.77 | 83.74 | 80.56 |
| RoGPT2-medium | 85.75 | 82.25 | 86.04 | 83.16 |
| RoGPT2-large | 85.70 | 82.64 | 86.14 | 83.46 |
4. WMT16
| Model | Decoder method | Ro-En | En-Ro |
|---|---|---|---|
| mBART | - | 38.5 | 38.5 |
| OpenNMT | - | - | 24.7 |
| RoGPT2-base | Greedy | 30.37 | 20.27 |
| RoGPT2-base | Beam-search-4 | 31.26 | 22.31 |
| RoGPT2-base | Beam-search-8 | 31.39 | 22.95 |
| RoGPT2-medium | Greedy | 32.48 | 22.18 |
| RoGPT2-medium | Beam-search-4 | 34.08 | 24.03 |
| RoGPT2-medium | Beam-search-8 | 34.16 | 24.13 |
| RoGPT2-large | Greedy | 33.69 | 23.31 |
| RoGPT2-large | Beam-search-4 | 34.40 | 24.23 |
| RoGPT2-large | Beam-search-8 | 34.51 | 24.32 |
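The greedy vs. beam-search decoding compared in these tables can be illustrated with a toy example. The probabilities below are made up purely to show the search strategies; they have nothing to do with RoGPT2's real distributions:

```python
import math

# Hypothetical conditional next-token distributions, keyed by prefix.
NEXT = {
    (): {"a": 0.5, "b": 0.4},
    ("a",): {"c": 0.4, "d": 0.3},
    ("b",): {"c": 0.9, "d": 0.1},
}


def greedy(steps=2):
    """Pick the single most likely token at each step."""
    seq, logp = (), 0.0
    for _ in range(steps):
        tok, p = max(NEXT[seq].items(), key=lambda kv: kv[1])
        seq, logp = seq + (tok,), logp + math.log(p)
    return seq, logp


def beam_search(k=2, steps=2):
    """Keep the k highest-scoring prefixes alive at each step."""
    beams = [((), 0.0)]
    for _ in range(steps):
        cands = [(seq + (t,), lp + math.log(p))
                 for seq, lp in beams for t, p in NEXT[seq].items()]
        beams = sorted(cands, key=lambda c: c[1], reverse=True)[:k]
    return beams[0]
```

Here greedy commits to `"a"` (p=0.5) and ends with sequence probability 0.5 × 0.4 = 0.2, while a width-2 beam keeps `"b"` alive and finds `("b", "c")` with probability 0.4 × 0.9 = 0.36 — which is why the beam-search rows above tend to score higher than greedy.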
5. XQuAD
| Model | Decoder method | EM | F1-Score |
|---|---|---|---|
| BERT-base-ro | - | 47.89 | 63.74 |
| RoDiBERT | - | 21.76 | 34.57 |
| RoBERT-small | - | 30.84 | 45.17 |
| RoBERT-base | - | 53.52 | 70.04 |
| RoBERT-large | - | 55.46 | 69.64 |
| mBERT | - | 59.9 | 72.7 |
| XLM-R Large | - | 69.7 | 83.6 |
| RoGPT2-base | Greedy | 23.69 | 35.97 |
| RoGPT2-base | Beam-search-4 | 24.11 | 35.27 |
| RoGPT2-medium | Greedy | 29.66 | 44.74 |
| RoGPT2-medium | Beam-search-4 | 31.59 | 45.32 |
| RoGPT2-large | Greedy | 29.74 | 42.98 |
| RoGPT2-large | Beam-search-4 | 29.66 | 43.05 |
| RoGPT2-base-en-ro | Greedy | 23.86 | 34.27 |
| RoGPT2-base-en-ro | Beam-search-4 | 25.04 | 34.51 |
| RoGPT2-medium-en-ro | Greedy | 27.05 | 39.75 |
| RoGPT2-medium-en-ro | Beam-search-4 | 27.64 | 39.11 |
| RoGPT2-large-en-ro | Greedy | 28.40 | 39.79 |
| RoGPT2-large-en-ro | Beam-search-4 | 28.73 | 39.71 |
| RoGPT2-large-en-ro-mask | Greedy | 31.34 | 44.71 |
| RoGPT2-large-en-ro-mask | Beam-search-4 | 31.59 | 43.53 |
6. Wiki-Ro: LM
| Model | PPL dev | PPL test |
|---|---|---|
| BERT-base-ro | 29.0897 | 28.0043 |
| RoGPT2-base | 34.3795 | 33.7460 |
| RoGPT2-medium | 23.7879 | 23.4581 |
| RoGPT2-large | 21.7491 | 21.5200 |
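Perplexity is the exponential of the mean negative log-likelihood per token, so RoGPT2-large's test PPL of 21.52 corresponds to a mean cross-entropy of ln(21.52) ≈ 3.07 nats per token. A minimal sketch of the computation (the token log-probabilities here are placeholders, not real model outputs):

```python
import math


def perplexity(token_logprobs):
    """PPL = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)


# A model assigning every token log-probability -3.07 has PPL exp(3.07) ≈ 21.5.
ppl = perplexity([-3.07] * 100)
```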
7. RoGEC
| Model | Decoder method | P | R | F0.5 |
|---|---|---|---|---|
| Transformer-tiny | Beam-search | 53.53 | 26.36 | 44.38 |
| Transformer-base Finetuning | Beam-search | 56.05 | 46.19 | 53.76 |
| Transformer-base Finetuning | Beam-search-LM | 50.68 | 45.39 | 49.52 |
| Transformer-base Finetuning | Beam-search-norm-LM | 51.06 | 45.43 | 49.83 |
| RoGPT2-base | Greedy | 59.02 | 49.35 | 56.80 |
| RoGPT2-base | Beam-search-4 | 65.23 | 49.26 | 61.26 |
| RoGPT2-base | Beam-search-8 | 65.88 | 49.64 | 61.84 |
| RoGPT2-medium | Greedy | 69.97 | 57.94 | 67.18 |
| RoGPT2-medium | Beam-search-4 | 72.46 | 57.99 | 69.01 |
| RoGPT2-medium | Beam-search-8 | 72.24 | 57.69 | 68.77 |
| RoGPT2-large | Greedy | 61.90 | 49.09 | 58.83 |
| RoGPT2-large | Beam-search-4 | 65.24 | 49.43 | 61.32 |
| RoGPT2-large | Beam-search-8 | 64.96 | 49.22 | 61.06 |
| RoGPT2-base* | Greedy | 68.67 | 49.60 | 63.77 |
| RoGPT2-base* | Beam-search-4 | 71.16 | 50.53 | 65.79 |
| RoGPT2-base* | Beam-search-8 | 71.68 | 50.65 | 66.18 |
| RoGPT2-medium* | Greedy | 58.21 | 43.32 | 54.47 |
| RoGPT2-medium* | Beam-search-4 | 68.31 | 43.78 | 61.43 |
| RoGPT2-medium* | Beam-search-8 | 68.68 | 43.99 | 61.75 |
| RoGPT2-large* | Greedy | 64.86 | 41.30 | 58.22 |
| RoGPT2-large* | Beam-search-4 | 65.57 | 41.00 | 58.55 |
| RoGPT2-large* | Beam-search-8 | 65.44 | 41.09 | 58.50 |
Note: * these models were trained on a dataset of 3,000,000 artificially generated pairs.
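The F0.5 score above is the general F-beta measure with beta = 0.5, which weights precision twice as heavily as recall — the usual choice for grammatical error correction, where a wrong "correction" is worse than a missed one:

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score: beta < 1 favors precision, beta > 1 favors recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

As a check against the table, `f_beta(65.23, 49.26)` gives ≈ 61.26, matching the RoGPT2-base beam-search-4 row.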
Acknowledgments
Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
How to cite
```bibtex
@inproceedings{niculescu2021rogpt2,
  title={RoGPT2: Romanian GPT2 for Text Generation},
  author={Niculescu, Mihai Alexandru and Ruseti, Stefan and Dascalu, Mihai},
  booktitle={2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI)},
  pages={1154--1161},
  year={2021},
  organization={IEEE}
}
```
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "readerbench/RoGPT2-base" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "readerbench/RoGPT2-base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```