Instructions to use FinanceInc/finbert-pretrain with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use FinanceInc/finbert-pretrain with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="FinanceInc/finbert-pretrain")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("FinanceInc/finbert-pretrain")
    model = AutoModelForMaskedLM.from_pretrained("FinanceInc/finbert-pretrain")
- Notebooks
- Google Colab
- Kaggle
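The pipeline snippet above only constructs the model. As a minimal sketch of how it might be queried, the example below runs one masked sentence through the `fill-mask` pipeline and prints the top candidates. The helper names `format_predictions` and `run_demo` and the example sentence are illustrative assumptions, not part of the model card. The sketch relies on the pipeline returning, for a single mask, a list of dicts with `score` and `token_str` keys.

```python
from typing import Dict, List

MODEL_ID = "FinanceInc/finbert-pretrain"


def format_predictions(predictions: List[Dict], top_k: int = 3) -> List[str]:
    """Render fill-mask results as 'token (score)' strings, best first.

    Hypothetical helper: it expects the list of dicts the fill-mask
    pipeline returns for an input containing one mask token.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["token_str"].strip()} ({p["score"]:.3f})' for p in ranked[:top_k]]


def run_demo() -> None:
    """Download the model and print the top candidates for one sentence."""
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model=MODEL_ID)
    # Use the tokenizer's own mask token instead of hard-coding it.
    mask = pipe.tokenizer.mask_token
    # Illustrative input sentence (an assumption, not from the model card).
    sentence = f"The company reported a {mask} in quarterly revenue."
    for line in format_predictions(pipe(sentence)):
        print(line)
```

Calling `run_demo()` downloads the model weights on first use, so it needs network access and the `transformers` package with a supported backend installed.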
Commit · e02aba1
Parent(s): ab93dff
Updated model card
README.md CHANGED
    @@ -19,8 +19,6 @@ It is trained on the following three financial communication corpus. The total c
     - Corporate Reports 10-K & 10-Q: 2.5B tokens
     - Earnings Call Transcripts: 1.3B tokens
     - Analyst Reports: 1.1B tokens
    -- Demo.org Proprietary Reports
    -- Additional purchased data from Factset
     
     The entire training is done using an **NVIDIA DGX-1** machine. The server has 4 Tesla P100 GPUs, providing a total of 128 GB of GPU memory. This machine enables us to train the BERT models using a batch size of 128. We utilize Horovord framework for multi-GPU training. Overall, the total time taken to perform pretraining for one model is approximately **2 days**.
     
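The figures quoted in the README excerpt above can be sanity-checked with a little arithmetic. The sketch below sums the three corpus sizes that remain after this commit and derives a per-GPU batch size. It assumes the stated batch size of 128 is a global figure split evenly across the 4 GPUs; the README does not say whether 128 is global or per worker, so treat the per-GPU number as illustrative only.

```python
# Corpus sizes from the README, in millions of tokens
# (integers avoid floating-point rounding when summing).
CORPUS_TOKENS_M = {
    "Corporate Reports 10-K & 10-Q": 2500,
    "Earnings Call Transcripts": 1300,
    "Analyst Reports": 1100,
}

BATCH_SIZE = 128  # stated in the README
NUM_GPUS = 4      # stated in the README


def total_tokens_millions(corpus: dict) -> int:
    """Sum the corpus sizes, in millions of tokens."""
    return sum(corpus.values())


def per_gpu_batch_size(global_batch: int, num_gpus: int) -> int:
    """Split a global batch evenly across GPUs.

    Assumption: the README's batch size is global; if it is per worker,
    this function does not apply.
    """
    if global_batch % num_gpus != 0:
        raise ValueError("global batch size must divide evenly across GPUs")
    return global_batch // num_gpus


total_m = total_tokens_millions(CORPUS_TOKENS_M)
per_gpu = per_gpu_batch_size(BATCH_SIZE, NUM_GPUS)
print(f"Total corpus: {total_m / 1000:.1f}B tokens")
print(f"Per-GPU batch size (assuming an even split): {per_gpu}")
```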