---
language:
- en
- uz
- ru
license: llama3.2
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
- payment-extraction
- financial-nlp
- lora
- fine-tuned
- llama-3.2
library_name: peft
---

# Payment Extraction Model (Llama 3.2-1B)

Fine-tuned Llama 3.2-1B-Instruct for extracting payment information from multilingual text (English, Uzbek, Russian).

## Model Details

- **Base Model**: `meta-llama/Llama-3.2-1B-Instruct`
- **Training Data**: 4,082 examples
- **Training Duration**: 5 epochs
- **Method**: LoRA (Low-Rank Adaptation)
- **Best Checkpoint**: Step 900 (validation loss: 0.384)
- **Trainable Parameters**: 0.9% (11.27M / 1.24B)

## Capabilities

Extracts structured payment information as JSON with the following fields:

- **amount**: Payment amount
- **receiver_name**: Recipient name
- **receiver_inn**: Tax identification number (INN)
- **receiver_account**: Bank account number
- **mfo**: Bank code (MFO)
- **payment_purpose**: Purpose of payment
- **purpose_code**: Payment purpose code
- **intent**: Intent classification (`create_transaction`, `partial_create_transaction`, `list_transaction`)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and apply the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base_model, "primel/aibama")
tokenizer = AutoTokenizer.from_pretrained("primel/aibama")

# Extract payment info
text = "Transfer 500000 to LLC Technopark, INN 123456789"
prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a payment extraction assistant. Extract payment information from text and return ONLY valid JSON.<|eot_id|><|start_header_id|>user<|end_header_id|>

{text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,   # temperature only takes effect when sampling is enabled
    temperature=0.1,
)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
```

## Training Data Distribution

- **create_transaction**: 36.7% (1,500 examples)
- **partial_create_transaction**: 52.6% (2,148 examples)
- **list_transaction**: 10.6% (434 examples)

## Performance

| Metric | Value |
|--------|-------|
| Training Loss | 0.3785 |
| Validation Loss | 0.3844 |
| Mean Token Accuracy | 92.59% |
| Entropy | 0.424 |

## Limitations

- Optimized for payment-related text in English, Uzbek, and Russian
- Access to the gated base model requires accepting the Llama 3.2 license
- Performs best on structured payment instructions; free-form text may yield incomplete extractions

## Citation

```bibtex
@misc{payment-extractor-llama32,
  author    = {Your Name},
  title     = {Payment Extraction Model},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/primel/aibama}
}
```