---
model_name: FrenchTextCategorizer
language: French
tags:
  - text-classification
  - fine-tuned
  - french
license: mit
dataset: "French News Dataset"
---

# 📝 Usage

This model is a fine-tuned **FlauBERT** model that categorizes French texts into the following categories:

> **CULTURE**, **DEBATS_ET_OPINIONS**, **ECONOMIE**, **EDUCATION**, **FAIT_DIVERS**, **INTERNATIONAL**, **LIFESTYLE**, **NUMERIQUE**, **POLITIQUE**, **RELIGION**, **SANTE**, **SCIENCE_ET_ENVIRONNEMENT**, **SOCIETE**, **SPORT**, **INDEFINI**

---

## 🚀 Quick Start

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("juenp/FrenchTextCategorizer")
model.eval()
```

---

## 🔎 Full Example (with Tokenizer, Prediction, and Probabilities)

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import torch.nn.functional as F

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("juenp/FrenchTextCategorizer")
tokenizer = AutoTokenizer.from_pretrained("juenp/FrenchTextCategorizer")
model.eval()

# Input text
text = "Ce film est un chef-d'œuvre incroyable, tout était parfait."

# Tokenize
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
inputs.pop("token_type_ids", None)  # FlauBERT does not use token type ids

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    probs = F.softmax(logits, dim=-1)
    predicted_class_idx = torch.argmax(probs, dim=-1).item()

# Decode the predicted class from the config
# (transformers normalizes id2label keys to integers when loading config.json)
predicted_class = model.config.id2label[predicted_class_idx]
prob_percentages = [round(p.item() * 100, 2) for p in probs[0]]

# Output
print(f"Text: {text}")
print(f"Predicted class: {predicted_class}")
print(f"Probabilities (%): {prob_percentages}")
```

---

# 📋 Notes

- `model.config.id2label` is loaded automatically from the model's configuration (`config.json`).
- To process multiple texts at once, simply pass a list of texts to the tokenizer.
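---

## 📦 Batch Inference

As noted above, the tokenizer accepts a list of texts, so a whole batch can be classified in one forward pass. A minimal sketch (the sample texts below are invented for illustration):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import torch.nn.functional as F

model = AutoModelForSequenceClassification.from_pretrained("juenp/FrenchTextCategorizer")
tokenizer = AutoTokenizer.from_pretrained("juenp/FrenchTextCategorizer")
model.eval()

# Invented sample texts, for illustration only
texts = [
    "Le PSG s'impose 3-0 en finale de la Coupe de France.",
    "La Banque centrale européenne relève ses taux directeurs.",
]

# Tokenize the whole batch; padding aligns sequences to a common length
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
inputs.pop("token_type_ids", None)  # FlauBERT does not use token type ids

with torch.no_grad():
    probs = F.softmax(model(**inputs).logits, dim=-1)

# One prediction per input text
for text, p in zip(texts, probs):
    idx = torch.argmax(p).item()
    print(f"{text} -> {model.config.id2label[idx]} ({p[idx].item() * 100:.2f}%)")
```

Batching amortizes tokenization and model overhead across inputs; for large corpora, process the list in fixed-size chunks to bound memory use.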
---

# ✅ Ready for Inference!