---
base_model: Qwen/Qwen2.5-Coder-32B-Instruct
tags:
- Rust
- Hyperswitch
- LoRA
- CPT
- Fine-Tuned
- Causal-LM
pipeline_tag: text-generation
language:
- en
---

# Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch

A LoRA fine-tuned model based on **Qwen/Qwen2.5-Coder-32B-Instruct** specialized for the [Hyperswitch](https://github.com/juspay/hyperswitch) Rust codebase. This model excels at understanding payment processing patterns, Hyperswitch architecture, and Rust development practices.

## 🎯 Model Description

This LoRA adapter was trained on **16,731 samples** extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain.

- **Base Model**: Qwen/Qwen2.5-Coder-32B-Instruct
- **Training Type**: Causal Language Modeling (CLM) with LoRA
- **Domain**: Payment Processing, Rust Development
- **Specialization**: Hyperswitch codebase patterns and architecture

## 📊 Training Details

### Dataset Composition

- **Total Samples**: 16,731
- **File-level samples**: 2,120 complete files
- **Granular samples**: 14,611 extracted components
  - Functions: 4,121
  - Structs: 5,710
  - Traits: 223
  - Implementations: 4,296
  - Modules: 261

### LoRA Configuration

```yaml
r: 64            # LoRA rank
alpha: 128       # LoRA alpha (2 * r)
dropout: 0.05    # LoRA dropout
target_modules:  # applied to all linear projection layers
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
```
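
With rank 64 applied to all seven projection matrices, the adapter's size can be estimated from the model's shape. The dimensions below are assumptions taken from the published Qwen2.5-32B config (hidden size 5120, intermediate size 27648, 64 layers, 8 KV heads of dim 128), not from this repository:

```python
# Estimate LoRA trainable parameters: each adapted d_in x d_out linear layer
# adds r * (d_in + d_out) parameters (the low-rank A and B matrices).
# All dimensions below are assumptions based on the published Qwen2.5-32B config.
r = 64
hidden, intermediate = 5120, 27648
kv_dim = 8 * 128  # num_key_value_heads * head_dim (GQA)

shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * 64  # 64 decoder layers
print(f"~{total / 1e6:.0f}M trainable parameters")  # ~537M
```

Under these assumptions the adapter trains roughly half a billion parameters, a small fraction of the 32B base model.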

### Training Hyperparameters

- **Epochs**: 5
- **Batch Size**: 2 per device (16 effective with gradient accumulation)
- **Learning Rate**: 5e-5 (cosine schedule)
- **Max Context**: 8,192 tokens
- **Hardware**: 2x NVIDIA H200
- **Training Time**: ~4 hours (2,355 steps)
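
The effective batch of 16 from 2 GPUs at 2 samples each implies 4 gradient-accumulation steps (inferred from the numbers above, not stated explicitly):

```python
# Effective batch = per-device batch * num GPUs * gradient-accumulation steps.
per_device_batch, num_gpus, effective_batch = 2, 2, 16
grad_accum_steps = effective_batch // (per_device_batch * num_gpus)
print(grad_accum_steps)  # 4
```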

### Training Results

```
Final Loss:  0.48  (from 1.63)
Perplexity:  1.59  (from 5.12)
Accuracy:    89%   (from 61%)
```
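
As a sanity check, perplexity is the exponential of the cross-entropy loss, and the reported pairs line up closely (the small gaps presumably come from how the final loss was averaged):

```python
import math

# Perplexity = exp(mean cross-entropy loss).
print(round(math.exp(1.63), 2))  # ≈ 5.10 vs. reported 5.12
print(round(math.exp(0.48), 2))  # ≈ 1.62 vs. reported 1.59
```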

## 🚀 Usage

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "juspay/Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch")

# Generate code
prompt = """// Hyperswitch payment processing
pub fn validate_payment_method("""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.2,  # lower temperature for code generation
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Recommended Settings

- **Temperature**: 0.2-0.3 for code generation
- **Temperature**: 0.5-0.7 for explanations and documentation
- **Max tokens**: 512-1024 for most tasks
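
The settings above can be bundled into a small helper to pass to `model.generate(**inputs, **kwargs)`; the function name and the exact values picked from each range are illustrative choices, not part of this repository:

```python
# Hypothetical helper capturing the recommended generation settings above.
def generation_kwargs(task: str) -> dict:
    """Return sampling settings for a task type: 'code' or 'explain'."""
    if task == "code":
        # low temperature keeps code generation deterministic-ish
        return {"temperature": 0.25, "max_new_tokens": 512, "do_sample": True}
    if task == "explain":
        # higher temperature for explanations and documentation
        return {"temperature": 0.6, "max_new_tokens": 1024, "do_sample": True}
    raise ValueError(f"unknown task: {task}")
```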

## 🛠️ Technical Specifications

- **Context Window**: 8,192 tokens
- **Precision**: bfloat16
- **Memory Usage**: ~78GB VRAM (32B base model)
- **Inference Speed**: optimized with Flash Attention 2
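
A rough back-of-the-envelope for the ~78GB figure, using architecture numbers that are assumptions from the published Qwen2.5-32B config (~32.5B params, 64 layers, 8 KV heads, head_dim 128):

```python
# Back-of-the-envelope VRAM estimate; architecture numbers are assumptions.
params = 32.5e9
weight_gb = params * 2 / 1e9  # bf16 = 2 bytes/param -> ~65 GB of weights

layers, kv_heads, head_dim, seq_len = 64, 8, 128, 8192
# K and V caches, bf16 (2 bytes), at the full 8,192-token context
kv_cache_bytes = 2 * layers * kv_heads * head_dim * seq_len * 2
kv_cache_gib = kv_cache_bytes / 2**30  # exactly 2 GiB

print(round(weight_gb), round(kv_cache_gib, 1))
```

Weights dominate; activations, the LoRA adapter, and framework overhead plausibly account for the rest of the reported ~78GB.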

## 🙏 Acknowledgments

- **Qwen Team** for the excellent Qwen2.5-Coder base model
- **Hyperswitch Team** for the open-source payment processing platform
- **Hugging Face** for the transformers and PEFT libraries

## 📝 Citation

```bibtex
@misc{hyperswitch-qwen-lora-2024,
  title={Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch},
  author={Juspay},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/juspay/Qwen2.5-Coder-32B-Instruct-CPT-LoRA-Adapter-HyperSwitch}
}
```