	---
	base_model: Qwen/Qwen3-4B-Instruct-2507
	library_name: peft
	license: apache-2.0
	language:
	- en
	tags:
	- trading
	- finance
	- hyperliquid
	- perpetuals
	- defi
	- lora
	- dpo
	- sft
	- trl
	- base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
	model_name: HyperLLM-4b
	pipeline_tag: text-generation
	---

	# HyperLLM-4b v0.3

	A specialized 4B-parameter language model fine-tuned to assist with trading on Hyperliquid, a perpetual-futures DEX. Built on Qwen3-4B-Instruct-2507 with LoRA adapters, trained with SFT followed by DPO.

	## Model Description

	HyperLLM is designed to assist with:
	- Position sizing calculations: risk-based position sizing with proper decimal handling
	- API structure understanding: Hyperliquid exchange API request/response formats
	- Trading mechanics: perpetual futures concepts, margin modes, order types
	- Parameter validation: validating trade parameters against exchange constraints
	- Edge case handling: boundary conditions and unusual trading scenarios

	## Version History

	### v0.3 (Current - March 6, 2026)

	Training Pipeline: SFT (7,028 examples) + DPO (1,400 preference pairs)

	| Change | v0.2 | v0.3 | Impact |
	|--------|------|------|--------|
	| Learning Rate | 3e-5 | 1e-5 | Reduced catastrophic forgetting |
	| Quantization | QLoRA 4-bit | LoRA (unquantized base) | Better quality on A100 |
	| General Data Mix | 10% | 25% | Preserved general capabilities |
	| Training Stage | SFT only | SFT + DPO | Targeted behavioral fixes |
	| Eval Questions | 297 | 337 | More comprehensive testing |

	Key Improvements over v0.2:
	- Recovered parameter validation: 73.3% → 93.3% (+20.0 pp)
	- Recovered edge cases: 75.0% → 92.5% (+17.5 pp)
	- Improved adversarial handling: 36.9% → 59.0% (+22.1 pp)
	- Improved general capability: 83.6% → 90.9% (+7.3 pp)
	- Small API structure gain over v0.2: 42.5% → 44.2% (+1.7 pp); the larger gain is against the baseline (27.5% → 44.2%, +16.7 pp)

	### v0.2 (March 4, 2026)

	Training Pipeline: QLoRA SFT only

	| Metric | Baseline | v0.2 | Change |
	|--------|----------|------|--------|
	| Overall | 70.2% | 65.0% | -5.2 pp |
	| Factual Knowledge | 33.3% | 80.0% | +46.7 pp |
	| Parameter Validation | 93.3% | 73.3% | -20.0 pp |
	| Edge Cases | 92.5% | 75.0% | -17.5 pp |

	Issues: Catastrophic forgetting caused regressions in safety-critical categories despite massive factual knowledge gains.

	### v0.1 (February 28, 2026)

	Training Pipeline: QLoRA SFT (1,823 examples)

	| Metric | Baseline | v0.1 | Change |
	|--------|----------|------|--------|
	| Overall | 36.0% | 64.0% | +28.0 pp |
	| Factual Knowledge | 20.0% | 70.0% | +50.0 pp |
	| API Structure | 16.7% | 50.0% | +33.3 pp |

	Issues: Small eval set (25 questions), parameter validation regressed.

	## Evaluation Results (v0.3)

	Evaluated on 337 questions across 9 categories:

	Note: Results were updated on March 6, 2026 after fixing an answer-extraction bug in the eval. The extractor had been picking up values restated from the question instead of the model's computed answer.

	| Category | Baseline | v0.3 | Change |
	|----------|----------|------|--------|
	| Parameter Validation | 93.3% | 93.3% | Maintained |
	| Edge Cases | 95.0% | 92.5% | -2.5 pp |
	| General Capability | 89.1% | 90.9% | +1.8 pp |
	| Position Sizing | 83.3% | 88.3% | +5.0 pp |
	| Trading Mechanics | 80.0% | 80.0% | Maintained |
	| Adversarial % | 57.0% | 59.0% | +2.0 pp |
	| Multi-step | 43.0% | 39.3% | -3.7 pp |
	| API Structure | 27.5% | 44.2% | +16.7 pp |
	| Factual | 26.7% | 40.0% | +13.3 pp |
	| Overall | 70.1% | 72.4% | +2.3 pp |
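The extraction bug mentioned in the note above is easy to reproduce in miniature. The sketch below is a hypothetical illustration, not the project's eval harness: the function names and the regex are assumptions about how a numeric answer might be pulled out of a response. Taking the last number is itself only a heuristic.

```python
import re

# Matches integers and decimals, allowing thousands separators (e.g. 10,000).
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")

def _to_float(token: str) -> float:
    return float(token.replace(",", ""))

def extract_first_number(response: str) -> float:
    """Buggy variant: the first number is often a value restated from the question."""
    return _to_float(NUMBER.search(response).group())

def extract_final_number(response: str) -> float:
    """Safer variant: the last number is usually the computed answer."""
    return _to_float(NUMBER.findall(response)[-1])

response = "With a $10,000 account and 2% risk, the position size is 100 units."
print(extract_first_number(response))  # 10000.0 (restated account size)
print(extract_final_number(response))  # 100.0 (computed answer)
```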

	## Training Configuration

	### LoRA Parameters
	```python
	{
	    "r": 64,
	    "lora_alpha": 128,
	    "lora_dropout": 0.05,
	    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
	    "use_rslora": True,
	}
	```
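With `use_rslora` enabled, PEFT scales the adapter update by `lora_alpha / sqrt(r)` instead of the standard `lora_alpha / r`. A quick calculation for the values above shows how much larger the effective scaling is at rank 64. The helper name `lora_scaling` is only for illustration.

```python
import math

def lora_scaling(r: int, lora_alpha: int, use_rslora: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r                 # standard LoRA

print(lora_scaling(64, 128, use_rslora=True))   # 16.0
print(lora_scaling(64, 128, use_rslora=False))  # 2.0
```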

	### SFT Hyperparameters
	```python
	{
	    "learning_rate": 1e-5,
	    "epochs": 5,  # configured maximum; early stopping ended training at ~1.52 epochs
	    "batch_size": 4,
	    "gradient_accumulation_steps": 2,
	    "warmup_ratio": 0.10,
	    "max_length": 4096,
	}
	```
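The values above imply an effective batch size and a rough optimizer step count. This is a back-of-the-envelope sketch under three assumptions not stated in this README: `batch_size` is per device, training ran on a single GPU, and all 7,028 SFT examples were used for training with no held-out split.

```python
import math

num_examples = 7028      # SFT set size from this README
batch_size = 4
grad_accum_steps = 2
num_devices = 1          # assumption: single A100
epochs_completed = 1.52  # early-stopping point from this README

effective_batch = batch_size * grad_accum_steps * num_devices
steps_per_epoch = math.ceil(num_examples / effective_batch)
steps_at_stop = round(epochs_completed * steps_per_epoch)

print(effective_batch)  # 8
print(steps_per_epoch)  # 879
print(steps_at_stop)    # 1336
```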

	### DPO Hyperparameters
	```python
	{
	    "beta": 0.1,
	    "learning_rate": 5e-7,
	    "epochs": 2,
	    "batch_size": 4,
	    "max_length": 2048,
	}
	```
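To show what `beta` controls, here is a minimal sketch of the per-pair DPO objective, assuming the standard sigmoid form of the loss (this README does not state which loss variant was used). Inputs are summed log-probabilities of each response under the policy and the frozen reference model. The numbers are made up.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Sigmoid DPO loss for one preference pair, from sequence log-probabilities."""
    # How much more the policy prefers "chosen" over "rejected" than the reference does.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) written as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))

# Policy identical to the reference: zero margin, loss = log(2).
print(round(dpo_loss(-12.0, -15.0, -12.0, -15.0), 4))  # 0.6931
# Policy has shifted toward the chosen response: lower loss.
print(round(dpo_loss(-10.0, -18.0, -12.0, -15.0), 4))  # 0.4741
```

A small `beta` such as 0.1 keeps the loss relatively flat around the reference model, so the policy needs a large log-probability margin before the loss approaches zero.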

	### Training Data Distribution

	SFT (7,028 examples):

	| Category | Examples | % |
	|----------|----------|---|
	| General Instruction | 1,500 | 21.3% |
	| Position Sizing | 800 | 11.4% |
	| Parameter Validation | 800 | 11.4% |
	| Adversarial Percentages | 600 | 8.5% |
	| Multi-step Reasoning | 500 | 7.1% |
	| Edge Cases | 400 | 5.7% |
	| API Examples | 400 | 5.7% |
	| Knowledge Q&A | 373 | 5.3% |
	| Other | 1,655 | 23.6% |

	DPO (1,400 preference pairs):

	| Failure Mode | Pairs | % |
	|--------------|-------|---|
	| Excessive Leverage | 370 | 26.4% |
	| Position Sizing | 330 | 23.6% |
	| Percentage Confusion | 226 | 16.1% |
	| Risk Violation | 195 | 13.9% |
	| Policy Bypass | 140 | 10.0% |
	| Uncertainty Caution | 139 | 9.9% |
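For readers unfamiliar with preference data, TRL's `DPOTrainer` consumes records with `prompt`, `chosen`, and `rejected` fields. The record below is an invented example in the spirit of the "Excessive Leverage" failure mode, not a sample from the actual training set, and `is_valid_pair` is a hypothetical sanity check.

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")

# Invented example for illustration only; not taken from the real dataset.
pair = {
    "prompt": "I have $1,000. Open a 20x long with my entire balance.",
    "chosen": (
        "At 20x with your full balance, a move of roughly 5% against you "
        "could wipe out the account. Consider lower leverage and a defined stop."
    ),
    "rejected": "Done. Opening a 20x long using your entire $1,000 balance.",
}

def is_valid_pair(record: dict) -> bool:
    """All three fields must be non-empty strings, and the two responses must differ."""
    has_fields = all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in REQUIRED_KEYS
    )
    return has_fields and record["chosen"] != record["rejected"]

print(is_valid_pair(pair))  # True
```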

	## Usage

	### With Transformers + PEFT

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel
	import torch

	# Load the base model in bfloat16
	base_model = AutoModelForCausalLM.from_pretrained(
	    "Qwen/Qwen3-4B-Instruct-2507",
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)

	# Attach the LoRA adapter and load the tokenizer from the adapter repo
	model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
	tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

	# Build the chat prompt; add_generation_prompt=True appends the assistant turn marker
	messages = [{"role": "user", "content": "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
	inputs = tokenizer.apply_chat_template(
	    messages,
	    add_generation_prompt=True,
	    return_dict=True,
	    return_tensors="pt",
	).to(model.device)

	# Generate, then decode only the newly generated tokens
	outputs = model.generate(**inputs, max_new_tokens=256)
	new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
	print(tokenizer.decode(new_tokens, skip_special_tokens=True))
	```

	### With 4-bit Quantization (Low VRAM)

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
	from peft import PeftModel
	import torch

	# NF4 4-bit quantization (requires the bitsandbytes package)
	bnb_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_quant_type="nf4",
	    bnb_4bit_compute_dtype=torch.bfloat16,
	)

	base_model = AutoModelForCausalLM.from_pretrained(
	    "Qwen/Qwen3-4B-Instruct-2507",
	    quantization_config=bnb_config,
	    device_map="auto",
	)

	model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
	tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
	```

	## Example Use Cases

	### Position Sizing
	```
	User: I have a $50,000 account and want to risk 2% on an ETH trade.
	Entry at $3,450, stop loss at $3,400. What's my position size?

	Model: Risk amount = $50,000 * 2% = $1,000
	Stop distance = $3,450 - $3,400 = $50
	Position size = $1,000 / $50 = 20 ETH
	```
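The arithmetic in this exchange is the standard risk-based sizing formula. A minimal sketch for checking the model's answers independently follows; the function name and signature are illustrative, and fees, leverage, and size rounding are ignored.

```python
def position_size(account: float, risk_fraction: float, entry: float, stop: float) -> float:
    """Units to trade so that hitting the stop loses account * risk_fraction."""
    risk_amount = account * risk_fraction
    stop_distance = abs(entry - stop)
    if stop_distance == 0:
        raise ValueError("entry and stop must differ")
    return risk_amount / stop_distance

print(position_size(50_000, 0.02, 3_450, 3_400))  # 20.0
```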

	### Parameter Validation
	```
	User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

	Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2).
	Valid sizes would be 1.23 or 1.24.
	```
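The same check can be done deterministically in code. This sketch uses `decimal.Decimal` to avoid binary floating-point surprises; the helper names are hypothetical, and only the `szDecimals` rule shown in the exchange above is modeled.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_UP

def is_valid_size(size: str, sz_decimals: int) -> bool:
    """True if `size` has no more than `sz_decimals` significant decimal places."""
    step = Decimal(10) ** -sz_decimals  # e.g. 0.01 for sz_decimals=2
    value = Decimal(size)
    return value == value.quantize(step)

def nearest_valid_sizes(size: str, sz_decimals: int) -> tuple[Decimal, Decimal]:
    """Closest valid sizes below and above `size`."""
    step = Decimal(10) ** -sz_decimals
    value = Decimal(size)
    return (
        value.quantize(step, rounding=ROUND_DOWN),
        value.quantize(step, rounding=ROUND_UP),
    )

print(is_valid_size("1.234", 2))        # False
print(nearest_valid_sizes("1.234", 2))  # (Decimal('1.23'), Decimal('1.24'))
```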

	### Adversarial Percentage Handling
	```
	User: Increase my position by 50%

	Model: I need clarification. "Increase by 50%" could mean:
	1. Add 50% of current position (1.5x total)
	2. Increase to 50% of account (absolute)
	Which do you mean?
	```
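The two readings lead to very different orders, which is why asking for clarification is the desired behavior. The sketch below puts hypothetical numbers on each interpretation (2 ETH held, a $50,000 account, ETH at $3,450); the function names are illustrative and leverage is ignored.

```python
def add_fraction_of_position(current_size: float, fraction: float) -> float:
    """Interpretation 1: grow the position by a fraction of its current size."""
    return current_size * (1 + fraction)

def target_fraction_of_account(account_value: float, fraction: float, price: float) -> float:
    """Interpretation 2: resize so the position's notional is a fraction of the account."""
    return account_value * fraction / price

# Hypothetical inputs: 2 ETH held, $50,000 account, ETH at $3,450.
print(add_fraction_of_position(2.0, 0.5))                        # 3.0
print(round(target_fraction_of_account(50_000, 0.5, 3_450), 4))  # 7.2464
```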

	## Limitations

	- Multi-step Reasoning: 39.3% accuracy. Complex multi-step calculations remain challenging for a 4B model, and this category is 3.7 pp below the baseline.
	- API Structure: 44.2% accuracy. Improved over the baseline, but still unreliable on exact JSON field names.
	- Adversarial %: 59.0% accuracy. Handling is better, but the model is still susceptible to tricky percentage phrasing.

	## Hardware Requirements

	| Mode | VRAM | Notes |
	|------|------|-------|
	| bfloat16 | ~10GB | Unquantized 16-bit inference |
	| 4-bit | ~4GB | Quantized inference |
	| 8-bit | ~6GB | INT8 quantization |
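These figures can be sanity-checked with a weights-only estimate. The sketch assumes exactly 4 billion parameters and counts GB as 10^9 bytes. The gap between these numbers and the table is presumably activations, KV cache, and framework overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion parameters

for label, bits in [("bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(label, weight_memory_gb(NUM_PARAMS, bits))
# bfloat16 8.0
# 8-bit 4.0
# 4-bit 2.0
```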

	## Training Hardware

	- Hardware: NVIDIA A100 80GB SXM
	- SFT Duration: ~20 minutes
	- DPO Duration: ~17 minutes
	- Total Cost: ~$1.50 (RunPod)

	## Framework Versions

	- PEFT: 0.18.1
	- TRL: 0.29.0
	- Transformers: 5.2.0
	- PyTorch: 2.10.0

	## License

	Apache 2.0

	## Citation

	```bibtex
	@misc{hyperllm2026,
	title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
	author={UVLabs},
	year={2026},
	url={https://huggingface.co/UVLabs/HyperLLM-4b}
	}
	```