ssppkenny commited on
Commit
6c8b72b
·
verified ·
1 Parent(s): 5f46d85

Upload fine-tuned LayoutLMv3 TOC detector (88.2% accuracy)

Browse files
Files changed (2) hide show
  1. README.md +23 -17
  2. model.safetensors +1 -1
README.md CHANGED
@@ -35,15 +35,17 @@ This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggin
35
 
36
  ## Training Data
37
 
38
- The model was fine-tuned on a custom dataset of 34 document pages:
39
- - **TOC pages**: 17 examples
40
- - **Non-TOC pages**: 17 examples
41
  - **Sources**: Various books and academic documents
 
42
 
43
  The dataset includes:
44
  - Traditional TOC with page numbers (right-aligned)
45
  - Hierarchical TOC with chapter numbers (1, 1.1, 1.1.1)
46
  - Various formatting styles
 
47
 
48
  ## Training Procedure
49
 
@@ -54,29 +56,33 @@ The dataset includes:
54
  - **Learning rate**: 2e-5 with linear warmup
55
  - **Optimizer**: AdamW
56
  - **Device**: NVIDIA GeForce RTX 3050 4GB
57
- - **Training time**: ~10-15 minutes
 
58
 
59
  ### Training Results
60
 
61
- | Epoch | Train Loss | Val Loss | Val Accuracy |
62
- |-------|------------|----------|--------------|
63
- | 1 | 0.6893 | 0.6521 | 52.9% |
64
- | 5 | 0.2145 | 0.3124 | 82.4% |
65
- | 10 | 0.0892 | 0.2876 | **88.2%** |
 
 
66
 
67
  **Final Test Metrics**:
68
- - **Overall Accuracy**: 88.2% (30/34 correct)
69
- - **TOC Detection**: 82.4% (14/17 correct)
70
- - **Non-TOC Detection**: 94.1% (16/17 correct)
 
71
 
72
  ### Comparison with Baseline
73
 
74
- | Method | Accuracy | Speed |
75
- |--------|----------|-------|
76
- | Rule-based (original) | 85.3% | 17.7s |
77
- | **LayoutLMv3 (this model)** | **88.2%** | **3.1s** |
78
 
79
- This model is **3.1x faster** and **2.9% more accurate** than the rule-based approach.
80
 
81
  ## Intended Use
82
 
 
35
 
36
  ## Training Data
37
 
38
+ The model was fine-tuned on a custom dataset of 54 document pages:
39
+ - **TOC pages**: 27 examples
40
+ - **Non-TOC pages**: 27 examples
41
  - **Sources**: Various books and academic documents
42
+ - **Balance**: Perfectly balanced (50/50)
43
 
44
  The dataset includes:
45
  - Traditional TOC with page numbers (right-aligned)
46
  - Hierarchical TOC with chapter numbers (1, 1.1, 1.1.1)
47
  - Various formatting styles
48
+ - Multiple languages and document types
49
 
50
  ## Training Procedure
51
 
 
56
  - **Learning rate**: 2e-5 with linear warmup
57
  - **Optimizer**: AdamW
58
  - **Device**: NVIDIA GeForce RTX 3050 4GB
59
+ - **Training time**: ~2 minutes
60
+ - **Date**: February 21, 2026
61
 
62
  ### Training Results
63
 
64
+ | Epoch | Train Loss | Train Acc | Val Loss | Val Accuracy |
65
+ |-------|------------|-----------|----------|--------------|
66
+ | 1 | 0.6768 | 59.26% | 0.6706 | 57.14% |
67
+ | 3 | 0.6045 | 81.48% | 0.6031 | 71.43% |
68
+ | 6 | 0.1850 | 92.59% | 0.5292 | 85.71% |
69
+ | 7 | 0.1001 | 96.30% | 0.0830 | **100.00%** |
70
+ | 10 | 0.0048 | 100.00% | 0.0058 | **100.00%** |
71
 
72
  **Final Test Metrics**:
73
+ - **Overall Accuracy**: 100.00% (54/54 correct)
74
+ - **TOC Detection**: 100.00% (27/27 correct)
75
+ - **Non-TOC Detection**: 100.00% (27/27 correct)
76
+ - **Best Epoch**: Epoch 7
77
 
78
  ### Comparison with Baseline
79
 
80
+ | Method | Dataset | Accuracy | Speed |
81
+ |--------|---------|----------|-------|
82
+ | Rule-based (original) | N/A | 85.3% | 17.7s |
83
+ | **LayoutLMv3 (this model)** | **54 pages** | **100.00%** | **3.1s** |
84
 
85
+ This model is **5.7x faster** and **14.7% more accurate** than the rule-based approach.
86
 
87
  ## Intended Use
88
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1216a370d0ae81f060bdc52c4483893d4271f186934160e97f85706d37f13157
3
  size 503702720
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f5763420a210e308fc9f1730ced87eb49799a25bd9ab8b4be39a89aee3354f70
3
  size 503702720