---
base_model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- finance
- banking
- rag
- conversational-ai
- lora
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# Banking AI Assistant - Llama 3.2 1B Fine-tuned
<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>
A specialized banking and financial AI assistant fine-tuned on the T2-RAGBench dataset for conversational RAG tasks. This model excels at analyzing financial documents, answering banking-related questions, and providing detailed insights from financial reports.
## Model Details
- **Developed by:** Akhenaton
- **Model Type:** Causal Language Model (Llama 3.2 1B)
- **License:** Apache 2.0
- **Base Model:** unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Unsloth + Hugging Face TRL
- **Quantization:** 4-bit (BitsAndBytes)
## Training Details
### Dataset
- **Source:** [G4KMU/t2-ragbench](https://huggingface.co/datasets/G4KMU/t2-ragbench) (ConvFinQA subset)
- **Size:** 32,908 context-independent QA pairs from 9,000+ financial documents
- **Domains:** FinQA, ConvFinQA, VQAonBD, TAT-DQA
- **Focus:** Financial documents with text and tables from SEC filings
### Training Configuration
```yaml
LoRA Parameters:
  r: 16
  lora_alpha: 16
  lora_dropout: 0
  target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]

Training Setup:
  max_seq_length: 2048
  per_device_train_batch_size: 2
  gradient_accumulation_steps: 4
  max_steps: 60
  learning_rate: 2e-4
  optimizer: adamw_8bit
  lr_scheduler_type: cosine
  weight_decay: 0.01
```
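From these settings, the effective batch size and the total number of examples seen during the short 60-step run follow directly (a quick sanity check, using only the values listed above):

```python
# Effective batch size = per-device batch size x gradient accumulation steps
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps  # 8

# Total training examples processed over the full run
max_steps = 60
examples_seen = max_steps * effective_batch_size  # 480
print(effective_batch_size, examples_seen)
```

At roughly 480 examples out of 32,908 QA pairs, the run covers only a small fraction of the dataset, which is why the Limitations section notes that further training may improve performance.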
## Intended Use
### Primary Use Cases
- **Financial Document Analysis:** Extract insights from financial reports, SEC filings, and earnings statements
- **Banking Q&A:** Answer questions about financial concepts, regulations, and banking operations
- **Conversational RAG:** Provide context-aware responses based on financial document context
- **Financial Research:** Assist with financial research and analysis tasks
### Conversation Format
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a specialized banking AI assistant. Analyze financial documents and provide accurate, detailed answers based on the given context. Focus on numerical accuracy and financial terminology.<|eot_id|><|start_header_id|>user<|end_header_id|>
Financial Document Context:
{context}
Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{response}<|eot_id|>
```
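A minimal sketch of filling this template by hand, with an invented document excerpt and question (in practice, `tokenizer.apply_chat_template` should produce the same Llama 3 format automatically):

```python
# System prompt copied verbatim from the conversation format above.
SYSTEM_PROMPT = (
    "You are a specialized banking AI assistant. Analyze financial documents "
    "and provide accurate, detailed answers based on the given context. "
    "Focus on numerical accuracy and financial terminology."
)

def build_prompt(context: str, question: str) -> str:
    """Fill the Llama 3 chat template, ending at the assistant header so the model completes the answer."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        f"{SYSTEM_PROMPT}<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
        f"Financial Document Context:\n{context}\n"
        f"Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )

# Hypothetical context and question for illustration only.
prompt = build_prompt(
    "Net revenue rose from $1.2B in 2022 to $1.5B in 2023.",
    "By how much did net revenue increase?",
)
print(prompt)
```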
## Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("Akhenaton/sft_banking_model", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Akhenaton/sft_banking_model")

# Prepare conversation
messages = [
    {"role": "user", "content": "Explain the key financial metrics in quarterly earnings."}
]

# Generate response (sampling must be enabled for temperature/min_p to take effect)
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=1.5, min_p=0.1)

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```
### With Unsloth (Recommended - 2x faster)
```python
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
"Akhenaton/sft_banking_model",
max_seq_length=2048,
dtype=None,
load_in_4bit=True
)
FastLanguageModel.for_inference(model) # Enable fast inference
```
## Available Formats
This model is available in multiple quantization formats:
- **q4_k_m**: Recommended for most use cases
- **q8_0**: Higher quality, more resource intensive
- **q5_k_m**: Balanced quality and efficiency
- **f16**: Full precision for maximum accuracy
## Performance
- **Training Speed:** 2x faster with Unsloth optimization
- **Memory Efficiency:** 4-bit quantization reduces VRAM requirements
- **Inference Speed:** Optimized for fast response generation
- **Domain Fit:** Specialized for the financial domain; trained on T2-RAGBench QA pairs, which are curated to be context-independent
## Limitations
- **Domain Specific:** Optimized for financial/banking content, may have reduced performance on general topics
- **Training Size:** Limited to 60 training steps - further training may improve performance
- **Context Length:** Maximum sequence length of 2048 tokens
- **Language:** English only
- **Numerical Reasoning:** While improved for financial calculations, complex mathematical operations may require verification
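As an example of the verification recommended above, a growth figure reported by the model can be re-checked against the source numbers in a few lines (the values here are invented for illustration):

```python
# Recompute a percentage change against the figures in the source document.
prior_revenue = 1200.0   # $M, hypothetical value from the filing
current_revenue = 1500.0  # $M, hypothetical value from the filing

pct_change = (current_revenue - prior_revenue) / prior_revenue * 100
print(f"Revenue change: {pct_change:.1f}%")  # Revenue change: 25.0%
```

If the model's stated figure disagrees with this recomputation, trust the arithmetic over the generation.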
## Ethical Considerations
- **Financial Advice:** This model should not be used as a substitute for professional financial advice
- **Data Source:** Trained on public SEC filings and financial documents
- **Bias:** May reflect biases present in financial reporting and documentation
- **Verification:** Always verify numerical calculations and financial information from authoritative sources
## Citation
If you use this model in your research or applications, please consider citing:
```bibtex
@misc{akhenaton2025sft_banking_model,
author = {Akhenaton},
title = {Banking AI Assistant - Llama 3.2 1B Fine-tuned},
year = {2025},
url = {https://huggingface.co/Akhenaton/sft_banking_model},
note = {Fine-tuned with Unsloth on T2-RAGBench dataset}
}
```
## Acknowledgments
- **Unsloth Team** for the optimized training framework
- **Meta AI** for the Llama 3.2 base model
- **G4KMU** for the T2-RAGBench dataset
- **Hugging Face** for the transformers library and model hosting
---
*This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.*