EuroLLM-22B Legislative Text (Fine-tuned) AI4TRAD
Fine-tuned version of EuroLLM-22B-Instruct-2512 for testing translation from English into 23 official EU languages, specifically for EU legislative texts.
Model Details
- Base model: utter-project/EuroLLM-22B-Instruct-2512
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Task: EN → 23 EU languages translation (legislative domain)
- Precision: bf16
- Developed by: AI4TRAD
- Organization: European Parliament
- Compute: EuroHPC (6x GPU, Discoverer supercomputer)
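As a rough guide to hardware needs, the bf16 precision listed above implies two bytes per parameter. The sketch below estimates the memory taken by the weights alone; the parameter count of 22 billion is an assumption read off the model name, the helper name `weight_memory_gb` is hypothetical, and activations and the KV cache are not included.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: int) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9


# Assumption: 22e9 is the nominal parameter count suggested by the model name.
NOMINAL_PARAMETERS = 22e9
BF16_BYTES = 2  # bfloat16 stores each value in 16 bits

print(f"{weight_memory_gb(NOMINAL_PARAMETERS, BF16_BYTES):.0f} GB")  # prints "44 GB"
```

Actual usage at inference time will be higher, since generation also allocates activations and a key-value cache that grows with sequence length.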
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "EuropeanParliament/EuroLLM-22B-EU-legislative"

# Load the tokenizer and the bf16 weights; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

# Single-turn request that names the target language in the instruction.
messages = [
    {"role": "user", "content": "Translate the following English text to French:\n\nThe regulation applies to all member states."}
]

# Render the chat template to a string, then generate greedily (do_sample=False).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
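To request the same sentence in several target languages, the prompt can be built programmatically. The following is a minimal sketch, not the authors' method: the helper `build_translation_messages` and the list `EU_TARGET_LANGUAGES` are hypothetical names, the list assumes the 23 targets are the official EU languages other than English, and the prompt wording simply mirrors the usage example above.

```python
# Assumption: the 23 targets are the official EU languages other than English.
EU_TARGET_LANGUAGES = [
    "Bulgarian", "Croatian", "Czech", "Danish", "Dutch", "Estonian",
    "Finnish", "French", "German", "Greek", "Hungarian", "Irish",
    "Italian", "Latvian", "Lithuanian", "Maltese", "Polish", "Portuguese",
    "Romanian", "Slovak", "Slovenian", "Spanish", "Swedish",
]


def build_translation_messages(text: str, target_language: str) -> list[dict[str, str]]:
    """Build a single-turn chat request using the wording of the usage example."""
    if target_language not in EU_TARGET_LANGUAGES:
        raise ValueError(f"Unsupported target language: {target_language}")
    content = f"Translate the following English text to {target_language}:\n\n{text}"
    return [{"role": "user", "content": content}]


source = "The regulation applies to all member states."

# One chat request per target language, keyed by language name.
batch = {lang: build_translation_messages(source, lang) for lang in EU_TARGET_LANGUAGES}

print(len(batch))
print(batch["French"][0]["content"])
```

Each value in `batch` can be passed to `tokenizer.apply_chat_template` in place of `messages` in the usage example.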
Authors
- AI4TRAD
Acknowledgements
This work was supported by EuroHPC resources on the Discoverer supercomputer (Sofia, Bulgaria).