Update README.md

README.md (changed)
```diff
@@ -33,10 +33,20 @@ The model architecture consists of the following components:
 
 These additional Transformer layers help mitigate the effects of OCR noise, spelling variation, and non-standard linguistic usage found in historical documents. The entire stack is fine-tuned end-to-end for token classification.
 
 ## Training and Evaluation Results
 
 This evaluation corresponds to the **HIPE-2020 dataset (v2.1)**, using **French and German** combined for training,
 **German (`dev-de`)** for validation, and **French (`test-fr`)** for testing.
+
+#### Training Hyperparameters
+
+- **Training regime:** Mixed precision (fp16)
+- **Epochs:** 5
+- **Max sequence length:** 512
+- **Base model:** `dbmdz/bert-medium-historic-multilingual-cased`
+- **Stacked Transformer layers:** 2
+
+#### Results
 The results below show performance on the **French test set** across multiple evaluation settings.
 
 | **Evaluation** | **Label** | **P** | **R** | **F1** |
```
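The stacking described in the card might look roughly like the sketch below in PyTorch. This is an illustration, not the released implementation: the class name, vocabulary size, and label count are invented, and a random embedding stands in for the pretrained `dbmdz/bert-medium-historic-multilingual-cased` encoder (hidden size 512 for the medium model) so the sketch runs without downloads.

```python
import torch
import torch.nn as nn

class StackedTokenClassifier(nn.Module):
    """Sketch: extra Transformer layers stacked on a base encoder,
    followed by a token-classification head. The embedding below is a
    stand-in for the actual pretrained BERT encoder."""

    def __init__(self, vocab_size=30000, hidden_size=512,
                 num_labels=9, num_stacked_layers=2):
        super().__init__()
        self.base = nn.Embedding(vocab_size, hidden_size)  # stand-in encoder
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=8, batch_first=True)
        self.stacked = nn.TransformerEncoder(layer, num_layers=num_stacked_layers)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids):
        hidden = self.base(input_ids)    # (batch, seq, hidden)
        hidden = self.stacked(hidden)    # the 2 extra layers
        return self.classifier(hidden)   # per-token label logits

model = StackedTokenClassifier()
logits = model(torch.randint(0, 30000, (4, 32)))
print(logits.shape)  # torch.Size([4, 32, 9])
```

In the actual model, `self.base` would be the pretrained encoder, fine-tuned end-to-end together with the stacked layers and the head.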
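The listed hyperparameters map naturally onto Hugging Face `TrainingArguments`. A hedged sketch only: the output path, batch size, and learning-rate choice are assumptions, not values stated in the card.

```python
from transformers import TrainingArguments

# Sketch: the card's hyperparameters as Trainer configuration.
args = TrainingArguments(
    output_dir="stacked-bert-hipe2020",  # hypothetical path (assumption)
    num_train_epochs=5,                  # "Epochs: 5"
    fp16=True,                           # "Mixed precision (fp16)"
    per_device_train_batch_size=16,      # assumption, not from the card
)

# The max sequence length (512) is applied at tokenization time, e.g.:
# tokenizer(text, truncation=True, max_length=512)
```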
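In the results tables, F1 is the harmonic mean of entity-level precision (P) and recall (R). A minimal illustration; the TP/FP/FN counts below are invented to roughly reproduce the French-row figures and are not output of the HIPE scorer.

```python
def entity_prf1(tp: int, fp: int, fn: int):
    """Entity-level precision, recall, and F1 (harmonic mean)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented counts, chosen only to illustrate the relationship:
p, r, f1 = entity_prf1(tp=80, fp=15, fn=18)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.842 0.816 0.829
```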
````diff
@@ -161,78 +171,6 @@ print(entities)
 ]
 ```
 
-## Training Details
-
-### Training Data
-
-The model was trained on the Impresso HIPE-2020 dataset, a subset of the [HIPE-2022 corpus](https://github.com/hipe-eval/HIPE-2022-data), which includes richly annotated OCR-transcribed historical newspaper content.
-
-### Training Procedure
-
-#### Preprocessing
-
-OCR content was cleaned and segmented. Entity types follow the HIPE-2020 typology.
-
-#### Training Hyperparameters
-
-- **Training regime:** Mixed precision (fp16)
-- **Epochs:** 5
-- **Max sequence length:** 512
-- **Base model:** `dbmdz/bert-medium-historic-multilingual-cased`
-- **Stacked Transformer layers:** 2
-
-#### Speeds, Sizes, Times
-
-- **Model size:** ~500MB
-- **Training time:** ~1h on 1 GPU (NVIDIA TITAN X)
-
-## Evaluation
-
-#### Testing Data
-
-Held-out portion of HIPE-2020 (French, German)
-
-#### Metrics
-
-- F1-score (micro, macro)
-- Entity-level precision/recall
-
-### Results
-
-| Language | Precision | Recall | F1-score |
-|----------|-----------|--------|----------|
-| French | 84.2 | 81.6 | 82.9 |
-| German | 82.0 | 78.7 | 80.3 |
-
-#### Summary
-
-The model performs robustly across noisy OCR historical content with support for fine-grained entity typologies.
-
-## Environmental Impact
-
-- **Hardware Type:** NVIDIA TITAN X (Pascal, 12GB)
-- **Hours used:** ~1 hour
-- **Cloud Provider:** EPFL, Switzerland
-- **Carbon Emitted:** ~0.022 kg CO₂eq (estimated)
-
-## Technical Specifications
-
-### Model Architecture and Objective
-
-Stacked BERT architecture with multitask token classification head supporting HIPE-type entity labels.
-
-### Compute Infrastructure
-
-#### Hardware
-
-1x NVIDIA TITAN X (Pascal, 12GB)
-
-#### Software
-
-- Python 3.11
-- PyTorch 2.0
-- Transformers 4.36
-
````
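The ~0.022 kg CO₂eq figure above is consistent with a simple power × time × grid-intensity estimate. The GPU wattage and grid intensity below are assumptions chosen for illustration, not values from the card.

```python
# Back-of-envelope check of the carbon estimate (assumed inputs):
gpu_power_kw = 0.250     # TITAN X (Pascal) board power ~250 W (assumption)
training_hours = 1.0     # "~1 hour" from the card
grid_kg_per_kwh = 0.088  # assumed low-carbon grid mix (assumption)

emissions_kg = gpu_power_kw * training_hours * grid_kg_per_kwh
print(f"{emissions_kg:.3f} kg CO2eq")  # 0.022 kg CO2eq
```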
## Citation
|
| 237 |
|
| 238 |
**BibTeX:**
|
|
|
|