# Financial Language Model

A custom-trained Transformer for financial text generation.
## Model Details
- Architecture: 6-layer Transformer
- Parameters: ~12M
- Vocabulary: 20,000 words
- Training Data: 1 GB balanced financial corpus (168M words)
- Validation Loss: 4.01
- Modern content: 87% (2015-2024 financial news)
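The card lists ~12M parameters but not the hidden size. As a rough sanity check, the count can be estimated in pure Python from the stated vocabulary and layer count; the `d_model` and feed-forward width below are assumptions, not values from the card:

```python
def transformer_params(vocab, d_model, n_layers, d_ff, tied_embeddings=True):
    """Rough parameter count for a standard Transformer LM (biases included)."""
    embedding = vocab * d_model                      # token embedding table
    attention = 4 * d_model * d_model + 4 * d_model  # Q, K, V, and output projections
    feedforward = 2 * d_model * d_ff + d_ff + d_model
    layernorms = 2 * 2 * d_model                     # two LayerNorms per layer
    per_layer = attention + feedforward + layernorms
    head = 0 if tied_embeddings else vocab * d_model + vocab
    return embedding + n_layers * per_layer + head

# With the 20,000-word vocabulary and 6 layers, an assumed d_model=256
# brackets the ~12M figure depending on whether embeddings are tied:
print(transformer_params(20_000, 256, 6, 1024, tied_embeddings=True))   # ~9.9M
print(transformer_params(20_000, 256, 6, 1024, tied_embeddings=False))  # ~15.0M
```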
## Training Data Composition
- Financial news (2015-2024): 750 MB (87%)
- Classical economics: 35 MB (4%)
- Wikipedia/Academic: 15 MB (2%)
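The card calls the corpus "balanced"; a common way to balance mixed sources is to sample each one with an explicit weight rather than in raw-size proportion. A minimal sketch of temperature-based weighting (the temperature value is an illustrative assumption, not the project's actual recipe):

```python
# Source sizes in MB, from the composition list above
sources_mb = {"financial_news": 750, "classical_economics": 35, "wikipedia_academic": 15}

def sampling_probs(sizes, temperature=0.5):
    """Temperature < 1 upsamples small sources relative to their raw share."""
    scaled = {name: size ** temperature for name, size in sizes.items()}
    total = sum(scaled.values())
    return {name: value / total for name, value in scaled.items()}

probs = sampling_probs(sources_mb)
# Financial news still dominates, but less than its ~94% raw-size share
```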
## Usage
```python
from huggingface_hub import hf_hub_download
import torch
import pickle

# Download files
model_path = hf_hub_download(
    repo_id="Nikilesh9/financial-language-model",
    filename="transformer_1gb_balanced_best.pth",
)
dataset_path = hf_hub_download(
    repo_id="Nikilesh9/financial-language-model",
    filename="mega_word_dataset.pkl",
)

# Load dataset
with open(dataset_path, 'rb') as f:
    dataset = pickle.load(f)

# Load model checkpoint onto CPU
checkpoint = torch.load(model_path, map_location='cpu')
# ... create and load model ...
```
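The final comment elides model creation. Training checkpoints are often saved either as a bare `state_dict` or as a dict wrapping it under a key such as `'model_state_dict'`; a small helper that handles both cases (the key names are common conventions, not confirmed for this repo):

```python
def extract_state_dict(checkpoint):
    """Return the model weights whether the checkpoint is bare or wrapped."""
    if not isinstance(checkpoint, dict):
        raise TypeError("expected a dict-like checkpoint")
    # Common wrapper keys used when saving training checkpoints
    for key in ("model_state_dict", "state_dict", "model"):
        if key in checkpoint and isinstance(checkpoint[key], dict):
            return checkpoint[key]
    return checkpoint  # assume it is already a bare state_dict

# state_dict = extract_state_dict(checkpoint)
# model.load_state_dict(state_dict)  # after constructing the matching architecture
```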
## Files

- `transformer_1gb_balanced_best.pth` - Model checkpoint (50 MB)
- `mega_word_dataset.pkl` - Preprocessed dataset (2.2 GB)
## Training Details
- Hardware: Google Colab TPU v2
- Training Time: 7.5 hours
- Epochs: 30
- Batch Size: 512
- Learning Rate: 0.0003 (adaptive)
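The card doesn't say how the learning rate adapts; one common scheme is plateau-based decay, which cuts the rate when validation loss stops improving. A pure-Python sketch of that logic (the decay factor and patience are assumptions):

```python
class PlateauDecay:
    """Multiply the learning rate by `factor` when validation loss
    fails to improve for more than `patience` consecutive epochs."""

    def __init__(self, lr=3e-4, factor=0.5, patience=2):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

Called once per epoch with the validation loss, this keeps the initial 3e-4 rate while the loss improves and halves it after a sustained plateau.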
## Project
Full project: https://github.com/Nikilesh9/language-model-evolution