Add paper link and improve model card metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +36 -41
README.md CHANGED
@@ -1,57 +1,52 @@
  ---
- language:
- - en
- tags:
- - financial NLP
- - named entity recognition
- - sequence labeling
- - structured extraction
- - hierarchical taxonomy
- - XBRL
- - iXBRL
- - SEC filings
- - financial-information-extraction
  datasets:
- - AAU-NLP/HiFi-KPI
- model_name: "Pre-BERT-SL1000"
- library_name: "transformers"
- pipeline_tag: "token-classification"
- base_model: "bert-base-uncased"
- task_categories:
- - token-classification
- task_ids:
- - named-entity-recognition
- - financial-information-extraction
- pretty_name: "Pre-BERT-SL1000: Sequence Labeling for Presentation Taxonomy KPI Extraction"
- size_categories: "1M<n<10M"
- languages:
- - en
- dataset_name: "HiFi-KPI"
- model_description: |
- Pre-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **HiFi-KPI dataset** for extracting
- **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It specializes in identifying
- entities that are one level up the **presentation taxonomy**, such as revenueAbstract, earnings, and financial ratios, using **token classification**.
-
- This model is trained specifically on n=1 with the **presentation taxonomy labels** from **HiFi-KPI**, focusing on entity identification.
-
- dataset_link: "https://huggingface.co/datasets/AAU-NLP/HiFi-KPI"
- repo_link: "https://github.com/rasmus393/HiFi-KPI"
  ---

  ## **Pre-BERT-SL1000**

  ### **Model Description**
- Pre-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** for extracting **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It specializes in identifying entities, such as revenue, earnings, etc.

- This model is trained on the [HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI) and is focused on the **presentation layer taxonomy** with **n=1**.

  ### **Use Cases**
  - Extracting **financial KPIs** using **iXBRL presentation taxonomy**
  - **Financial document parsing** with entity recognition

  ### **Performance**
- - Trained on **1,000 most frequent labels** from the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** with n=1 in the **presentation taxonomy**

- ### **Dataset & Code**
  - **Dataset**: [HiFi-KPI on Hugging Face](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)
- - **Code example**: [HiFi-KPI GitHub Repository](https://github.com/rasmus393/HiFi-KPI)
  ---
+ base_model: bert-base-uncased
  datasets:
+ - AAU-NLP/HiFi-KPI
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: token-classification
+ tags:
+ - financial NLP
+ - named entity recognition
+ - sequence labeling
+ - structured extraction
+ - hierarchical taxonomy
+ - XBRL
+ - iXBRL
+ - SEC filings
+ - financial-information-extraction
+ license: cc-by-4.0
  ---

  ## **Pre-BERT-SL1000**

+ This model was presented in the paper [HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings](https://huggingface.co/papers/2502.15411).
+
  ### **Model Description**
+ Pre-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** for extracting **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It specializes in identifying entities that are one level up the **presentation taxonomy**, such as revenueAbstract, earnings, and financial ratios, using **token classification**.

+ This model is trained specifically on n=1 with the **presentation taxonomy labels** from **HiFi-KPI**, focusing on entity identification.

  ### **Use Cases**
  - Extracting **financial KPIs** using **iXBRL presentation taxonomy**
  - **Financial document parsing** with entity recognition

  ### **Performance**
+ - Trained on **1,000 most frequent labels** from the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** with n=1 in the **presentation taxonomy**.

+ ### **Resources**
+ - **Paper**: [HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings](https://huggingface.co/papers/2502.15411)
  - **Dataset**: [HiFi-KPI on Hugging Face](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)
+ - **Code**: [HiFi-KPI GitHub Repository](https://github.com/aaunlp/HiFi-KPI)
+
+ ### **Citation**
+ If you use this model or dataset, please cite:
+ ```bibtex
+ @article{aavang2025hifikpi,
+ title={HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings},
+ author={Aavang, Rasmus and Rizzi, Giovanni and B{\o}ggild, Rasmus and Iolov, Alexandre and Zhang, Mike and Bjerva, Johannes},
+ journal={arXiv preprint arXiv:2502.15411},
+ year={2025}
+ }
+ ```
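The updated metadata declares `library_name: transformers` and `pipeline_tag: token-classification`, so the model can presumably be loaded through the standard `transformers` token-classification pipeline. The following is a minimal sketch under that assumption, not the authors' documented usage. The repository id `AAU-NLP/Pre-BERT-SL1000` is a guess (the diff does not state it), and the helper names `summarize_entities` and `extract_kpis` as well as the 0.5 score threshold are hypothetical choices made for illustration.

```python
from typing import Dict, List

# Assumption: the Hub repo id is not given in this PR; this value is a guess
# derived from the model name and the dataset's organisation.
MODEL_ID = "AAU-NLP/Pre-BERT-SL1000"


def summarize_entities(entities: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep confident predictions and reduce each to label, text, and character span.

    Expects the dict format returned by a transformers token-classification
    pipeline when an aggregation strategy is set (keys: entity_group, score,
    word, start, end). The threshold is an illustrative default.
    """
    return [
        {
            "label": ent["entity_group"],
            "text": ent["word"],
            "span": (ent["start"], ent["end"]),
        }
        for ent in entities
        if ent["score"] >= min_score
    ]


def extract_kpis(text: str, model_id: str = MODEL_ID) -> List[Dict]:
    """Run token classification on `text` and return the summarized entities."""
    # Imported lazily so summarize_entities works without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )
    return summarize_entities(tagger(text))
```

Calling `extract_kpis("Total revenue was $4.2 million for the quarter.")` would download the model and return one dict per predicted entity. The labels it returns depend on the label set stored in the model's config, which this PR does not list.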