---
language: en
tags:
- exbert
license: apache-2.0
datasets:
- bookcorpus
- wikipedia
---

## Model description

CamemBERT is a state-of-the-art language model for French based on the RoBERTa model.

It is now available on Hugging Face in 6 different versions with varying number of parameters, amount of pretraining data and pretraining data source domains.

## Intended uses & limitations

## How to use

## Limitations and bias

## Training data

OSCAR or Open Super-large Crawled Aggregated coRpus is a multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the Ungoliant architecture.

## Training procedure

## Evaluation results