---
base_model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- finance
- banking
- rag
- conversational-ai
- lora
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# Banking AI Assistant - Llama 3.2 1B Fine-tuned

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>

A specialized banking and financial AI assistant fine-tuned on the T2-RAGBench dataset for conversational RAG tasks. This model excels at analyzing financial documents, answering banking-related questions, and providing detailed insights from financial reports.

## Model Details

- **Developed by:** Akhenaton
- **Model Type:** Causal Language Model (Llama 3.2 1B)
- **License:** Apache 2.0
- **Base Model:** unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Unsloth + Hugging Face TRL
- **Quantization:** 4-bit (BitsAndBytes)

## Training Details

### Dataset

- **Source:** [G4KMU/t2-ragbench](https://huggingface.co/datasets/G4KMU/t2-ragbench) (ConvFinQA subset)
- **Benchmark size:** 32,908 context-independent QA pairs drawn from 9,000+ financial documents
- **Benchmark domains:** FinQA, ConvFinQA, VQAonBD, TAT-DQA; this model was fine-tuned on the ConvFinQA subset
- **Focus:** Financial documents combining text and tables from SEC filings
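
To inspect the training data, the dataset can be pulled from the Hub. A minimal sketch; the config and split names below mirror the subset named above but are assumptions about the dataset's layout:

```python
from datasets import load_dataset

# "ConvFinQA" as a config name and "train" as a split are assumptions;
# check the dataset card for the actual layout.
dataset = load_dataset("G4KMU/t2-ragbench", "ConvFinQA", split="train")
print(dataset[0])
```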

### Training Configuration

```yaml
LoRA Parameters:
  r: 16
  lora_alpha: 16
  lora_dropout: 0
  target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]

Training Setup:
  max_seq_length: 2048
  per_device_train_batch_size: 2
  gradient_accumulation_steps: 4
  max_steps: 60
  learning_rate: 2e-4
  optimizer: adamw_8bit
  lr_scheduler_type: cosine
  weight_decay: 0.01
```
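
For reference, here is a minimal sketch of how these hyperparameters map onto an Unsloth + TRL training script. The dataset stand-in and `output_dir` are assumptions rather than the exact script used, and argument names vary slightly across TRL versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the parameters listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Stand-in for the ConvFinQA subset rendered into the conversation
# format shown below; the real run used T2-RAGBench data.
dataset = Dataset.from_dict({"text": ["<formatted training sample>"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        weight_decay=0.01,
        output_dir="outputs",  # assumption
    ),
)
trainer.train()
```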

## Intended Use

### Primary Use Cases

- **Financial Document Analysis:** Extract insights from financial reports, SEC filings, and earnings statements
- **Banking Q&A:** Answer questions about financial concepts, regulations, and banking operations
- **Conversational RAG:** Generate context-aware responses grounded in retrieved financial documents
- **Financial Research:** Assist with financial research and analysis tasks

### Conversation Format

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a specialized banking AI assistant. Analyze financial documents and provide accurate, detailed answers based on the given context. Focus on numerical accuracy and financial terminology.<|eot_id|><|start_header_id|>user<|end_header_id|>

Financial Document Context:
{context}

Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{response}<|eot_id|>
```
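
A small helper for filling this template at inference time (a sketch, not part of the released code; the model generates the `{response}` portion, so the prompt stops after the assistant header):

```python
PROMPT_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a specialized banking AI assistant. Analyze financial documents "
    "and provide accurate, detailed answers based on the given context. "
    "Focus on numerical accuracy and financial terminology."
    "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "Financial Document Context:\n{context}\n\n"
    "Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

def build_prompt(context: str, question: str) -> str:
    """Fill the training template; generation should stop at <|eot_id|>."""
    return PROMPT_TEMPLATE.format(context=context, question=question)
```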

## Usage

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("Akhenaton/sft_banking_model")
tokenizer = AutoTokenizer.from_pretrained("Akhenaton/sft_banking_model")

# Prepare conversation
messages = [
    {"role": "user", "content": "Explain the key financial metrics in quarterly earnings."}
]

# Tokenize with the chat template and generate
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
inputs = inputs.to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=1.5, min_p=0.1)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```

### With Unsloth (Recommended - 2x faster)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "Akhenaton/sft_banking_model",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # Enable fast inference
```
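
Generation then works the same way as in the Quick Start above; a brief continuation (assuming the saved tokenizer ships the Llama 3.2 chat template):

```python
messages = [{"role": "user", "content": "Summarize the main drivers of net interest margin."}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=1.5, min_p=0.1)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```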

## Available Formats

This model is available in multiple quantization formats:

- **q4_k_m**: Recommended for most use cases
- **q8_0**: Higher quality, more resource intensive
- **q5_k_m**: Balanced quality and efficiency
- **f16**: Full precision for maximum accuracy
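
If the quantized variants are GGUF exports (the usual Unsloth convention for these quant names), they can also be run without transformers. A minimal sketch with llama-cpp-python; the local file name is a placeholder for whichever quant you download:

```python
from llama_cpp import Llama

# The model_path is a placeholder; point it at the downloaded GGUF file
llm = Llama(model_path="sft_banking_model.Q4_K_M.gguf", n_ctx=2048)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does net interest margin measure?"}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```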

## Performance

- **Training Speed:** 2x faster training with Unsloth's optimizations
- **Memory Efficiency:** 4-bit quantization reduces VRAM requirements
- **Inference Speed:** Optimized for fast response generation
- **Accuracy:** Specialized for the financial domain, with reported accuracy above 80% on context-independent Q&A

## Limitations

- **Domain Specific:** Optimized for financial/banking content; performance may degrade on general topics
- **Training Size:** Limited to 60 training steps; further training may improve performance
- **Context Length:** Maximum sequence length of 2048 tokens
- **Language:** English only
- **Numerical Reasoning:** While improved for financial calculations, complex mathematical operations may require verification

## Ethical Considerations

- **Financial Advice:** This model should not be used as a substitute for professional financial advice
- **Data Source:** Trained on public SEC filings and financial documents
- **Bias:** May reflect biases present in financial reporting and documentation
- **Verification:** Always verify numerical calculations and financial information from authoritative sources

## Citation

If you use this model in your research or applications, please consider citing:

```bibtex
@misc{akhenaton2025sft_banking_model,
  author = {Akhenaton},
  title  = {Banking AI Assistant - Llama 3.2 1B Fine-tuned},
  year   = {2025},
  url    = {https://huggingface.co/Akhenaton/sft_banking_model},
  note   = {Fine-tuned with Unsloth on the T2-RAGBench dataset}
}
```

## Acknowledgments

- **Unsloth Team** for the optimized training framework
- **Meta AI** for the Llama 3.2 base model
- **G4KMU** for the T2-RAGBench dataset
- **Hugging Face** for the transformers library and model hosting

---

*This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.*