---
language:
- en
- uz
- ru
license: llama3.2
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
- payment-extraction
- financial-nlp
- lora
- fine-tuned
- llama-3.2
library_name: peft
---

# Payment Extraction Model (Llama 3.2-1B)

Fine-tuned Llama 3.2-1B-Instruct for extracting payment information from multilingual text (English, Uzbek, Russian).

## Model Details

- **Base Model**: `meta-llama/Llama-3.2-1B-Instruct`
- **Training Data**: 4,082 examples
- **Training Duration**: 5 epochs
- **Method**: LoRA (Low-Rank Adaptation)
- **Best Checkpoint**: Step 900 (validation loss: 0.384)
- **Trainable Parameters**: 0.9% (11.27M / 1.24B)

## Capabilities

Extracts structured payment information as JSON with the following fields:

- **amount**: Payment amount
- **receiver_name**: Recipient name
- **receiver_inn**: Tax identification number (INN)
- **receiver_account**: Bank account number
- **mfo**: Bank code (MFO)
- **payment_purpose**: Purpose of payment
- **purpose_code**: Payment purpose code
- **intent**: Intent classification (`create_transaction`, `partial_create_transaction`, `list_transaction`)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and apply the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base_model, "primel/aibama")
tokenizer = AutoTokenizer.from_pretrained("primel/aibama")

# Extract payment info
text = "Transfer 500000 to LLC Technopark, INN 123456789"
prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a payment extraction assistant. Extract payment information from text and return ONLY valid JSON.<|eot_id|><|start_header_id|>user<|end_header_id|>

{text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,   # temperature only takes effect when sampling is enabled
    temperature=0.1,
)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
```

## Training Data Distribution

- **create_transaction**: 36.7% (1,500 examples)
- **partial_create_transaction**: 52.6% (2,148 examples)
- **list_transaction**: 10.6% (434 examples)

## Performance

| Metric | Value |
|--------|-------|
| Training Loss | 0.3785 |
| Validation Loss | 0.3844 |
| Mean Token Accuracy | 92.59% |
| Entropy | 0.424 |

## Limitations

- Optimized for payment-related text in English, Uzbek, and Russian
- Access to the gated base model requires accepting the Llama 3.2 license
- Performs best on structured payment instructions; free-form text may yield incomplete extractions

## Citation

```bibtex
@misc{payment-extractor-llama32,
  author    = {Your Name},
  title     = {Payment Extraction Model},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/primel/aibama}
}
```