Financial Language Model

A custom Transformer language model trained from scratch for financial text generation.

Model Details

  • Architecture: 6-layer Transformer
  • Parameters: ~12M
  • Vocabulary: 20,000 words
  • Training Data: 1 GB balanced financial corpus (168M words)
  • Validation Loss: 4.01
  • Modern Content: 87%
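The ~12M parameter figure can be roughly sanity-checked from the specs above. A back-of-the-envelope estimate, assuming a hypothetical hidden size of 256 (the card does not state the hidden size, head count, or whether embeddings are tied):

```python
# Rough parameter count for a 6-layer Transformer with a 20k-word vocabulary.
# d_model = 256 is an assumption; the model card does not state the hidden size.
vocab_size = 20_000
d_model = 256
n_layers = 6

embedding = vocab_size * d_model        # token embedding table
per_layer = 12 * d_model ** 2           # ~4*d^2 attention + ~8*d^2 feed-forward
output_head = vocab_size * d_model      # untied LM head

total = embedding + n_layers * per_layer + output_head
print(f"{total / 1e6:.1f}M parameters")  # → 15.0M parameters
```

With tied input/output embeddings the same configuration comes to ~9.8M parameters, so the stated ~12M is plausible for a model of roughly this shape.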

Training Data Composition

  • Financial news (2015-2024): 750 MB (87%)
  • Classical economics: 35 MB (4%)
  • Wikipedia/Academic: 15 MB (2%)

Usage

from huggingface_hub import hf_hub_download
import torch
import pickle

# Download files
model_path = hf_hub_download(repo_id="Nikilesh9/financial-language-model", filename="transformer_1gb_balanced_best.pth")
dataset_path = hf_hub_download(repo_id="Nikilesh9/financial-language-model", filename="mega_word_dataset.pkl")

# Load dataset (note: pickle can execute arbitrary code on load;
# only unpickle files from sources you trust)
with open(dataset_path, 'rb') as f:
    dataset = pickle.load(f)

# Load model
checkpoint = torch.load(model_path, map_location='cpu')
# ... create and load model ...
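The repository does not publish the model class, so the "create and load model" step has to be filled in by the reader. A minimal sketch of a decoder that matches the stated specs (6 layers, 20,000-word vocabulary); the hidden size, head count, context length, and checkpoint key names are all assumptions:

```python
import torch
import torch.nn as nn

class FinancialTransformer(nn.Module):
    """Hypothetical reconstruction of the 6-layer architecture described above."""

    def __init__(self, vocab_size=20_000, d_model=256, n_heads=8,
                 n_layers=6, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device).unsqueeze(0)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask so each position attends only to earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # (batch, seq_len, vocab_size) logits

model = FinancialTransformer()
# The checkpoint's key layout is not documented; inspect `checkpoint.keys()`
# before calling e.g. model.load_state_dict(checkpoint["model_state_dict"]).
model.eval()
```

Whether the real model is encoder- or decoder-style, and how its state dict is nested inside the checkpoint, can only be confirmed by inspecting the `.pth` file.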

Files

  • transformer_1gb_balanced_best.pth - Model checkpoint (50 MB)
  • mega_word_dataset.pkl - Preprocessed dataset (2.2 GB)

Training Details

  • Hardware: Google Colab TPU v2
  • Training Time: 7.5 hours
  • Epochs: 30
  • Batch Size: 512
  • Learning Rate: 0.0003 (adaptive)
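The card describes the learning rate only as "0.0003 (adaptive)". One common setup consistent with that, sketched here purely as an assumption (AdamW plus ReduceLROnPlateau stepping on validation loss), not as the documented configuration:

```python
import torch

# Hypothetical: AdamW at the stated 3e-4, halved when validation loss plateaus.
params = [torch.nn.Parameter(torch.zeros(10))]
optimizer = torch.optim.AdamW(params, lr=3e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=2)

# Illustrative validation-loss curve that plateaus, triggering one LR cut
val_losses = [5.0, 4.5, 4.3, 4.3, 4.3, 4.3]
for loss in val_losses:
    scheduler.step(loss)

print(optimizer.param_groups[0]["lr"])  # halved after the plateau
```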

Project

Full project: https://github.com/Nikilesh9/language-model-evolution
