Financial Language Model

A custom Transformer language model trained from scratch for financial text generation.

Model Details

  • Architecture: 6-layer Transformer
  • Parameters: ~12M
  • Vocabulary: 20,000 words
  • Training Data: 1 GB balanced financial corpus (168M words)
  • Validation Loss: 4.01
  • Modern Content: 87%
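The ~12M parameter figure can be roughly sanity-checked from the specs above. A back-of-the-envelope estimate, assuming a hypothetical hidden size of 256 (the card does not state the hidden size, head count, or whether embeddings are tied):

```python
# Rough parameter count for a 6-layer Transformer with a 20k-word vocabulary.
# d_model = 256 is an assumption; the model card does not state the hidden size.
vocab_size = 20_000
d_model = 256
n_layers = 6

embedding = vocab_size * d_model        # token embedding table
per_layer = 12 * d_model ** 2           # ~4*d^2 attention + ~8*d^2 feed-forward
output_head = vocab_size * d_model      # untied LM head

total = embedding + n_layers * per_layer + output_head
print(f"{total / 1e6:.1f}M parameters")  # → 15.0M parameters
```

With tied input/output embeddings the same configuration comes to ~9.8M parameters, so the stated ~12M is plausible for a model of roughly this shape.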

Training Data Composition

  • Financial news (2015-2024): 750 MB (87%)
  • Classical economics: 35 MB (4%)
  • Wikipedia/Academic: 15 MB (2%)

Usage

from huggingface_hub import hf_hub_download
import torch
import pickle

# Download files
model_path = hf_hub_download(repo_id="Nikilesh9/financial-language-model", filename="transformer_1gb_balanced_best.pth")
dataset_path = hf_hub_download(repo_id="Nikilesh9/financial-language-model", filename="mega_word_dataset.pkl")

# Load dataset (note: pickle can execute arbitrary code on load;
# only unpickle files from sources you trust)
with open(dataset_path, 'rb') as f:
    dataset = pickle.load(f)

# Load model
checkpoint = torch.load(model_path, map_location='cpu')
# ... create and load model ...
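The repository does not publish the model class, so the "create and load model" step has to be filled in by the reader. A minimal sketch of a decoder that matches the stated specs (6 layers, 20,000-word vocabulary); the hidden size, head count, context length, and checkpoint key names are all assumptions:

```python
import torch
import torch.nn as nn

class FinancialTransformer(nn.Module):
    """Hypothetical reconstruction of the 6-layer architecture described above."""

    def __init__(self, vocab_size=20_000, d_model=256, n_heads=8,
                 n_layers=6, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device).unsqueeze(0)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask so each position attends only to earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # (batch, seq_len, vocab_size) logits

model = FinancialTransformer()
# The checkpoint's key layout is not documented; inspect `checkpoint.keys()`
# before calling e.g. model.load_state_dict(checkpoint["model_state_dict"]).
model.eval()
```

Whether the real model is encoder- or decoder-style, and how its state dict is nested inside the checkpoint, can only be confirmed by inspecting the `.pth` file.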

Files

  • transformer_1gb_balanced_best.pth - Model checkpoint (50 MB)
  • mega_word_dataset.pkl - Preprocessed dataset (2.2 GB)

Training Details

  • Hardware: Google Colab TPU v2
  • Training Time: 7.5 hours
  • Epochs: 30
  • Batch Size: 512
  • Learning Rate: 0.0003 (adaptive)
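The card describes the learning rate only as "0.0003 (adaptive)". One common setup consistent with that, sketched here purely as an assumption (AdamW plus ReduceLROnPlateau stepping on validation loss), not as the documented configuration:

```python
import torch

# Hypothetical: AdamW at the stated 3e-4, halved when validation loss plateaus.
params = [torch.nn.Parameter(torch.zeros(10))]
optimizer = torch.optim.AdamW(params, lr=3e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=2)

# Illustrative validation-loss curve that plateaus, triggering one LR cut
val_losses = [5.0, 4.5, 4.3, 4.3, 4.3, 4.3]
for loss in val_losses:
    scheduler.step(loss)

print(optimizer.param_groups[0]["lr"])  # halved after the plateau
```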

Project

Full project: https://github.com/Nikilesh9/language-model-evolution
