---
license: cc-by-nc-sa-4.0
datasets:
  - gtfintechlab/bank_of_japan
language:
  - en
metrics:
  - accuracy
  - f1
  - precision
  - recall
base_model:
  - roberta-base
pipeline_tag: text-classification
library_name: transformers
---

# World of Central Banks Model

**Model Name:** Bank of Japan Temporal Classification Model

**Model Type:** Text Classification

**Language:** English

**License:** CC-BY-NC-SA 4.0

**Base Model:** roberta-base

**Dataset Used for Training:** gtfintechlab/bank_of_japan

## Model Overview

The Bank of Japan Temporal Classification Model is a fine-tuned roberta-base model that classifies sentences according to the Temporal Classification label. This label is annotated in the gtfintechlab/bank_of_japan dataset, which is built from Bank of Japan meeting minutes.

## Intended Use

This model is intended for researchers and practitioners working on subjective text classification of Bank of Japan communications, particularly within financial and economic contexts. It is specifically designed to assess the Temporal Classification label, aiding in the analysis of temporal framing in financial and economic communications.

## How to Use

To use this model, load it with the Hugging Face transformers library:

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# Load tokenizer, model, and configuration
tokenizer = AutoTokenizer.from_pretrained("gtfintechlab/bank_of_japan", do_lower_case=True, do_basic_tokenize=True)
model = AutoModelForSequenceClassification.from_pretrained("gtfintechlab/bank_of_japan", num_labels=2)
config = AutoConfig.from_pretrained("gtfintechlab/bank_of_japan")

# Initialize text classification pipeline
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, config=config, framework="pt")

# Classify Temporal Classification
sentences = [
    "[Sentence 1]",
    "[Sentence 2]"
]
results = classifier(sentences, batch_size=128, truncation="only_first")

print(results)
```
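Each element of `results` is a dictionary of the form `{'label': 'LABEL_0', 'score': 0.98}` (score shown for illustration only), the standard output of a transformers text-classification pipeline.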

In this script:

- **Tokenizer and Model Loading:** loads the pre-trained tokenizer and model from gtfintechlab/bank_of_japan.
- **Configuration:** loads model configuration parameters, including the number of labels.
- **Pipeline Initialization:** initializes a text classification pipeline with the model, tokenizer, and configuration.
- **Classification:** labels sentences based on Temporal Classification.

Ensure your environment has the necessary dependencies installed.
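A typical setup looks like the following (a minimal sketch; exact package versions are not specified in this card):

```bash
pip install transformers torch
```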

## Label Interpretation

- **LABEL_0:** Forward-looking; the sentence discusses future economic events or decisions.
- **LABEL_1:** Not forward-looking; the sentence discusses past or current economic events or decisions.
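For readability, the raw label IDs returned by the pipeline can be mapped to these names. A minimal sketch, assuming `sentences` and `results` from the snippet above (`LABEL_MAP` is a hypothetical helper, not part of the model):

```python
# Hypothetical mapping from pipeline label IDs to the readable names above
LABEL_MAP = {
    "LABEL_0": "forward-looking",
    "LABEL_1": "not forward-looking",
}

for sentence, result in zip(sentences, results):
    # Each pipeline result carries a label ID and a confidence score
    print(f"{LABEL_MAP[result['label']]} ({result['score']:.3f}): {sentence}")
```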

## Training Data

The model was trained on the gtfintechlab/bank_of_japan dataset, which comprises annotated sentences from Bank of Japan meeting minutes labeled with the Temporal Classification label. The dataset includes training, validation, and test splits.
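To inspect the data directly, the dataset can be loaded with the Hugging Face datasets library. A minimal sketch, assuming the default configuration (the repository may require a specific configuration name; check the dataset card):

```python
from datasets import load_dataset

# Load the annotated Bank of Japan sentences; if the repository defines
# multiple configurations, pass the configuration name as a second argument
# (see the dataset card for the exact name).
dataset = load_dataset("gtfintechlab/bank_of_japan")

print(dataset)              # overview of the train/validation/test splits
print(dataset["train"][0])  # one annotated example
```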

## Citation

If you use this model in your research, please cite the bank_of_japan dataset:

```bibtex
@article{WCBShahSukhaniPardawala,
  title={Words That Unite The World: A Unified Framework for Deciphering Global Central Bank Communications},
  author={Shah, Agam and Sukhani, Siddhant and Pardawala, Huzaifa and others},
  year={2025}
}
```

For more details, refer to the bank_of_japan dataset documentation.

## Contact

For any issues or questions related to bank_of_japan, please contact:

- Huzaifa Pardawala: huzaifahp7[at]gatech[dot]edu
- Siddhant Sukhani: ssukhani3[at]gatech[dot]edu
- Agam Shah: ashah482[at]gatech[dot]edu