jmarbach committed on
Commit 1dafa12 · 1 Parent(s): 0fe4f42

update readMe

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -8,7 +8,7 @@ base_model:
  QuillIndex is an indexing model developed by the [ETH Library](https://library.ethz.ch/). It is trained on the handwritten documents of the [School Board minutes](https://sr.ethz.ch/) (1854-1902) of [ETH Zurich](https://ethz.ch/en.html). Trained on samples created by [ChronoQuill](https://github.com/eth-library/ChronoQuill), an HTR pipeline, QuillIndex assigns labels for a given agenda item. Its taxonomy is constrained to a derived set from the underlying data, the annual indexes and corresponding agenda items. Due to the nature of the model, it cannot hallucinate arbitrary labels.
 
  ## Model Architecture & Evaluation
- QuillIndex is an encoder-only sequence classifier and uses [ModernBERT](answerdotai/ModernBERT-base) as a pre-trained backbone. A complete technical report on QuillIndex, its architecture and evaluation can be found in the respective section in [here](https://www.research-collection.ethz.ch/server/api/core/bitstreams/8053d4d8-51b4-4103-8164-b5068ddb3903/content).
+ QuillIndex is an encoder-only sequence classifier and uses [ModernBERT](answerdotai/ModernBERT-base) as a pre-trained backbone. The taxonomy can be found within the config file. A complete technical report on QuillIndex, its architecture and evaluation can be found in the respective section in [here](https://www.research-collection.ethz.ch/server/api/core/bitstreams/8053d4d8-51b4-4103-8164-b5068ddb3903/content).
 
  ## Environment Setup (Linux x86)
 
@@ -49,8 +49,11 @@ print(predicted_labels)
  # ['Antrag', 'Aufnahme', 'Bericht', 'Direktor', 'Ingenieurschule', 'Schüler', 'Vollmacht']
  ```
 
+ ## Generalization
+ The taxonomy is derived from 19th-century ETH School Board minutes. The model is fine-tuned exclusively on 19th-century German. Application to other domains or periods may be unreliable.
+
  # License
- We release QuillIndex the model weights under the Apache 2.0 license.
+ We release QuillIndex under the Apache 2.0 license.
 
  # Citation
  If you use this model, please cite: