---
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
license: apache-2.0
language:
- en
tags:
- trading
- finance
- hyperliquid
- perpetuals
- defi
- lora
- dpo
- sft
- trl
- base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
model_name: HyperLLM-4b
pipeline_tag: text-generation
---

# HyperLLM-4b v0.3

A specialized 4B-parameter language model fine-tuned for Hyperliquid perpetual DEX trading assistance, built on Qwen3-4B-Instruct using LoRA + DPO training.

## Model Description

HyperLLM is designed to assist with:

- **Position sizing calculations** - risk-based position sizing with proper decimal handling
- **API structure understanding** - Hyperliquid exchange API request/response formats
- **Trading mechanics** - perpetual futures concepts, margin modes, order types
- **Parameter validation** - validating trade parameters against exchange constraints
- **Edge case handling** - boundary conditions and unusual trading scenarios

## Version History

### v0.3 (Current - March 6, 2026)

**Training Pipeline:** SFT (7,028 examples) + DPO (1,400 preference pairs)

| Change | v0.2 | v0.3 | Impact |
|--------|------|------|--------|
| Learning Rate | 3e-5 | 1e-5 | Reduced catastrophic forgetting |
| Quantization | QLoRA 4-bit | Full LoRA | Better quality on A100 |
| General Data Mix | 10% | 25% | Preserved general capabilities |
| Training Stage | SFT only | SFT + DPO | Targeted behavioral fixes |
| Eval Questions | 297 | 337 | More comprehensive testing |

**Key improvements over v0.2:**

- Recovered parameter validation: 73.3% → **93.3%** (+20.0%)
- Recovered edge cases: 75.0% → **92.5%** (+17.5%)
- Improved adversarial handling: 36.9% → **59.0%** (+22.1%)
- Improved general capability: 83.6% → **90.9%** (+7.3%)
- Modest API structure gain: 42.5% → **44.2%** (+1.7%)

### v0.2 (March 4, 2026)

**Training Pipeline:** QLoRA SFT only

| Metric | Baseline | v0.2 | Change |
|--------|----------|------|--------|
| Overall | 70.2% | 65.0% | -5.2% |
| Factual Knowledge | 33.3% | **80.0%** | **+46.7%** |
| Parameter Validation | 93.3% | 73.3% | -20.0% |
| Edge Cases | 92.5% | 75.0% | -17.5% |

**Issues:** Catastrophic forgetting caused regressions in safety-critical categories despite large factual-knowledge gains.

### v0.1 (February 28, 2026)

**Training Pipeline:** QLoRA SFT (1,823 examples)

| Metric | Baseline | v0.1 | Change |
|--------|----------|------|--------|
| Overall | 36.0% | **64.0%** | **+28.0%** |
| Factual Knowledge | 20.0% | **70.0%** | **+50.0%** |
| API Structure | 16.7% | **50.0%** | **+33.3%** |

**Issues:** Small eval set (25 questions); parameter validation regressed.

## Evaluation Results (v0.3)

Evaluated on 337 questions across 9 categories.

*Note: Results updated March 6, 2026 after fixing an eval-extraction bug that reported restated question values instead of computed answers.*

| Category | Baseline | v0.3 | Change |
|----------|----------|------|--------|
| Parameter Validation | 93.3% | **93.3%** | Maintained |
| Edge Cases | 95.0% | **92.5%** | -2.5% |
| General Capability | 89.1% | **90.9%** | +1.8% |
| Position Sizing | 83.3% | **88.3%** | **+5.0%** |
| Trading Mechanics | 80.0% | **80.0%** | Maintained |
| Adversarial % | 57.0% | **59.0%** | **+2.0%** |
| Multi-step | 43.0% | **39.3%** | -3.7% |
| API Structure | 27.5% | **44.2%** | **+16.7%** |
| Factual | 26.7% | **40.0%** | **+13.3%** |
| **Overall** | **70.1%** | **72.4%** | **+2.3%** |

## Training Configuration

### LoRA Parameters

```python
{
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
    "use_rslora": True
}
```

### SFT Hyperparameters

```python
{
    "learning_rate": 1e-5,
    "epochs": 5,  # early stopped at 1.52
    "batch_size": 4,
    "gradient_accumulation_steps": 2,
    "warmup_ratio": 0.10,
    "max_length": 4096
}
```

### DPO Hyperparameters

```python
{
    "beta": 0.1,
    "learning_rate": 5e-7,
    "epochs": 2,
    "batch_size": 4,
    "max_length": 2048
}
```
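For intuition on what `beta = 0.1` controls, the per-pair DPO objective can be written in a few lines of plain Python. This is a sketch of the standard DPO loss formula, not the exact TRL implementation; the argument names are illustrative:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each argument is a sequence log-probability; the "margin" is how much
    more the policy favors a response than the frozen reference model does.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) rewritten in a numerically stable form
    return math.log1p(math.exp(-margin))

# When the policy matches the reference, the loss is log(2) ≈ 0.693;
# preferring the chosen response over the rejected one lowers it.
print(dpo_loss(0.0, 0.0, 0.0, 0.0))
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

A small `beta` (like the 0.1 used here) makes the loss less sensitive to log-probability margins, keeping the policy close to the SFT reference.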
### Training Data Distribution

**SFT (7,028 examples):**

| Category | Examples | % |
|----------|----------|---|
| General Instruction | 1,500 | 21.3% |
| Position Sizing | 800 | 11.4% |
| Parameter Validation | 800 | 11.4% |
| Adversarial Percentages | 600 | 8.5% |
| Multi-step Reasoning | 500 | 7.1% |
| Edge Cases | 400 | 5.7% |
| API Examples | 400 | 5.7% |
| Knowledge Q&A | 373 | 5.3% |
| Other | 1,655 | 23.6% |

**DPO (1,400 preference pairs):**

| Failure Mode | Pairs | % |
|--------------|-------|---|
| Excessive Leverage | 370 | 26.4% |
| Position Sizing | 330 | 23.6% |
| Percentage Confusion | 226 | 16.1% |
| Risk Violation | 195 | 13.9% |
| Policy Bypass | 140 | 10.0% |
| Uncertainty Caution | 139 | 9.9% |

## Usage

### With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

# Generate
messages = [{"role": "user", "content": "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With 4-bit Quantization (Low VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
```
## Example Use Cases

### Position Sizing

```
User: I have a $50,000 account and want to risk 2% on an ETH trade.
Entry at $3,450, stop loss at $3,400. What's my position size?

Model: Risk amount = $50,000 * 2% = $1,000
Stop distance = $3,450 - $3,400 = $50
Position size = $1,000 / $50 = 20 ETH
```

### Parameter Validation

```
User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2).
Valid sizes would be 1.23 or 1.24.
```

### Adversarial Percentage Handling

```
User: Increase my position by 50%

Model: I need clarification. "Increase by 50%" could mean:
1. Add 50% of the current position (1.5x total)
2. Increase to 50% of account value (absolute)
Which do you mean?
```

## Limitations

- **Multi-step Reasoning:** 39.3% accuracy; complex multi-step calculations remain challenging for a 4B model
- **API Structure:** 44.2% accuracy; improved, but still weak on exact JSON field names
- **Adversarial %:** 59.0% accuracy; better handling, but still susceptible to tricky percentage phrasing

## Hardware Requirements

| Mode | VRAM | Notes |
|------|------|-------|
| bfloat16 | ~10GB | Full-precision inference |
| 4-bit | ~4GB | Quantized inference |
| 8-bit | ~6GB | INT8 quantization |

## Training Hardware

- **Hardware:** NVIDIA A100 80GB SXM
- **SFT Duration:** ~20 minutes
- **DPO Duration:** ~17 minutes
- **Total Cost:** ~$1.50 (RunPod)

## Framework Versions

- PEFT: 0.18.1
- TRL: 0.29.0
- Transformers: 5.2.0
- PyTorch: 2.10.0

## License

Apache 2.0

## Citation

```bibtex
@misc{hyperllm2026,
  title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}
```
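The ambiguity the model is trained to flag in the adversarial-percentage example above can be shown numerically. With a 2 ETH position in a $50,000 account at a $3,450 mark price (illustrative values, not from the training data), the two readings of "increase by 50%" diverge sharply:

```python
current_position = 2.0     # ETH currently held
account_value = 50_000.0   # USD account equity
mark_price = 3_450.0       # USD per ETH

# Reading 1: add 50% of the current position (relative)
relative_target = current_position * 1.5

# Reading 2: size the position to 50% of account value (absolute)
absolute_target = (account_value * 0.5) / mark_price

print(relative_target)            # 3.0 ETH
print(round(absolute_target, 2))  # 7.25 ETH
```

The two interpretations differ by more than a factor of two here, which is why the model asks for clarification instead of guessing.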