---
license: cc0-1.0
base_model:
- Ultralytics/YOLO26
pipeline_tag: image-segmentation
tags:
- document-layout
- yolo
- document-layout-analysis
---
# **Model Card: YOLOv26n and YOLOv26x Models (Ultralytics)**

*Version: 1.0 | Date: 22/05/2026*

---

## **General information**

**YOLOv26** models specialized in **segmenting text regions (TextRegion) and text lines (TextLine)** in documents.

**Use cases**: OCR, automatic transcription, analysis of heritage documents.

| **Field**            | **SegN**                                     | **SegX**                                     |
|----------------------|----------------------------------------------|----------------------------------------------|
| **Model name**       | `Yolo-Seg-TextRegion-TextLine-Typed-SegN.pt` | `Yolo-Seg-TextRegion-TextLine-Typed-SegX.pt` |
| **Library**          | Ultralytics 8.4.49                           | Ultralytics 8.4.49                           |
| **Type**             | Instance segmentation                        | Instance segmentation                        |
| **Architecture**     | YOLOv26n (nano)                              | YOLOv26x (extra-large)                       |
| **Weight size**      | ~6.6 MB                                      | ~141 MB                                      |
| **Input resolution** | 640x640 (default)                            | 640x640 (default)                            |
| **Classes**          | 2 (TextRegion, TextLine)                     | 2 (TextRegion, TextLine)                     |
| **Framework**        | PyTorch ≥ 2.0                                | PyTorch ≥ 2.0                                |
| **License**          | CC0 1.0                                      | CC0 1.0                                      |

---

## **Performance**

| **Metric**     | **SegN value** | **SegX value** |
| -------------- | -------------- | -------------- |
| mAP50-95 (val) | 73.62%         | 76.97%         |
| mAP50 (val)    | 94.37%         | 95.30%         |
| Precision (P)  | 0.930          | 0.95383        |
| Recall (R)     | 0.912          | 0.93317        |
| F1-Score       | 0.921          | 0.943          |

---
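The F1 scores in the performance table are consistent with the harmonic mean of the reported precision and recall. A minimal sketch of that check follows; the helper name `f1_score` is ours for illustration and is not part of the Ultralytics API.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the performance table
reported = {"SegN": (0.930, 0.912), "SegX": (0.95383, 0.93317)}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f1_score(p, r):.3f}")
```

Rounded to three decimals, this gives 0.921 for SegN and 0.943 for SegX, matching the table.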


## **Dataset**

### **Composition**

- **Total size**: **4,956 images** (4,460 train / 496 val).
- **Sources**: **16 public datasets** (listed in the dataset sources section below).
- **Languages covered**: French, English, Spanish, Italian, German, Latin, Corsican, etc.
- **Periods**: 16th–20th centuries (mostly 18th–19th).
- **Annotation format**: YOLO (`.txt` files with segmentation masks).
- **Class distribution**:
  - TextRegion: 9% (23,195)
  - TextLine: 91% (234,634)

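For reference, Ultralytics segmentation labels store one instance per line: a class index followed by the polygon vertices, normalized to the image size (`class x1 y1 x2 y2 ...`). The sketch below parses such lines into pixel coordinates and counts instances per class. The class order (`0` = TextRegion, `1` = TextLine), the helper name `parse_label_line`, and the sample values are assumptions made for illustration; the dataset's own YAML file gives the real mapping.

```python
from collections import Counter

# Assumed class order; the card does not state which index maps to which class.
CLASS_NAMES = {0: "TextRegion", 1: "TextLine"}


def parse_label_line(line: str, width: int, height: int):
    """Parse one YOLO segmentation label line into (class_id, pixel polygon)."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("a polygon needs at least 3 (x, y) pairs")
    # Coordinates are normalized to [0, 1]; scale them back to pixels.
    polygon = [(x * width, y * height) for x, y in zip(coords[0::2], coords[1::2])]
    return class_id, polygon


# Made-up label content for a 1000 x 2000 px page
sample = """0 0.1 0.1 0.9 0.1 0.9 0.5 0.1 0.5
1 0.1 0.1 0.9 0.1 0.9 0.15 0.1 0.15
1 0.1 0.2 0.9 0.2 0.9 0.25 0.1 0.25"""

parsed = [parse_label_line(line, 1000, 2000) for line in sample.splitlines()]
counts = Counter(CLASS_NAMES[class_id] for class_id, _ in parsed)
print(counts)
```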
---

### **Dataset sources**

- [Ehri-dataset](https://github.com/FloChiff/ehri-dataset)
- [CORDEL-CA-PRINT-19](https://github.com/FoNDUE-HTR/CORDEL-CA-PRINT-19/)
- [CORDEL-ES-PRINT-19](https://github.com/FoNDUE-HTR/CORDEL-ES-PRINT-19)
- [FONDUE-EN-PRINT-20](https://github.com/FoNDUE-HTR/FONDUE-EN-PRINT-20)
- [FONDUE-ES-PRINT-19](https://github.com/FoNDUE-HTR/FONDUE-ES-PRINT-19)
- [FONDUE-FR-PRINT-20](https://github.com/FoNDUE-HTR/FONDUE-FR-PRINT-20)
- [FONDUE-IT-PRINT-20](https://github.com/FoNDUE-HTR/FONDUE-IT-PRINT-20)
- [FONDUE-MLT-ART](https://github.com/FoNDUE-HTR/FONDUE-MLT-ART)
- [FONDUE-MLT-CAT](https://github.com/FoNDUE-HTR/FONDUE-MLT-CAT/)
- [Kat_57-SE-MSS-20](https://github.com/FoNDUE-HTR/Kat_57-SE-MSS-20)
- [HTR-imprime-18e-siecle](https://github.com/Gallicorpora/HTR-imprime-18e-siecle)
- [Cremma-16-17-print](https://github.com/HTR-United/cremma-16-17-print)
- [Dahncorpus](https://github.com/HTR-United/dahncorpus)
- [Tapuscorpus](https://github.com/HTR-United/tapuscorpus)
- [NuBIS-OCR](https://github.com/ksefil/NuBIS-OCR)
- [HN2021-OCR-Poesie-Corse](https://github.com/PSL-Chartes-HTR-Students/HN2021-OCR-Poesie-Corse)
- [TNAH-2021-ArgusDesBrevets](https://github.com/PSL-Chartes-HTR-Students/TNAH-2021-ArgusDesBrevets)

---

## **Curves and visualizations**

- **PR (Precision-Recall) curve**:

PR curve for the SegN model

![PR curve SegN](Yolo-Seg-TextRegion-TextLine-Typed-SegN/BoxPR_curve.png)

PR curve for the SegX model

![PR curve SegX](Yolo-Seg-TextRegion-TextLine-Typed-SegX/BoxPR_curve.png)

- **Confusion matrix**:

Confusion matrix for the SegN model

![Confusion Matrix SegN](Yolo-Seg-TextRegion-TextLine-Typed-SegN/confusion_matrix_normalized.png)

Confusion matrix for the SegX model

![Confusion Matrix SegX](Yolo-Seg-TextRegion-TextLine-Typed-SegX/confusion_matrix_normalized.png)


- **Detection example**:

Example on the thesis **Relief, érosion différentielle et morphogenèse dans un bourrelet montagneux de haute latitude : Lofoten-vesteralen et Sogn-Jotun (Norvège)** by Jean-Pierre Peulvast, available on the [NuBIS](https://nubis.bis-sorbonne.fr/ark:/15733/nt07?view=full) website.

![Example](ark_15733_nt07_BIS_00_01362_V03_0007.jpg)

---

## **Usage**

### **Installation**

```bash
pip install ultralytics==8.4.49 torch torchvision
```

### **Inference**

```python
from ultralytics import YOLO

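# Load the model
model = YOLO("Yolo-Seg-TextRegion-TextLine-Typed-SegX.pt")

# Run prediction on an image (show=True opens a display window; drop it on a headless machine)
results = model.predict("document.jpg", conf=0.5, show=True)

# Access the segmentation masks
for result in results:
    if result.masks is None:  # no instance detected on this image
        continue
    masks = result.masks.data  # tensor of masks, one per instance
    boxes = result.boxes.data  # bounding boxes (xyxy, confidence, class)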
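
# Hedged sketch, not part of the original card: group the predicted polygons
# by class name. `polygons_by_class` is a hypothetical helper; it relies only on
# the Ultralytics attributes `masks.xy`, `boxes.cls` and `names`.
def polygons_by_class(result):
    """Return {class name: [polygon, ...]} in pixel coordinates for one result."""
    grouped = {}
    if result.masks is None:  # nothing detected on this image
        return grouped
    for polygon, cls_id in zip(result.masks.xy, result.boxes.cls.tolist()):
        grouped.setdefault(result.names[int(cls_id)], []).append(polygon)
    return grouped

# Possible usage, assuming the weights name the class "TextLine":
# lines = polygons_by_class(results[0]).get("TextLine", [])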