HyperLLM-4B v0.4

A fine-tuned Qwen3-4B model specialized for agentic trading on Hyperliquid. This model is trained to handle position sizing calculations, risk management, and trading operations.

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-4B-Instruct-2507 |
| Parameters | 4B (adapter only: ~134M trainable) |
| Training Method | SFT + DPO |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Training Hardware | NVIDIA A100-SXM4-80GB |
| Version | 0.4 |

Training Techniques

DoRA (Weight-Decomposed Low-Rank Adaptation)

v0.4 introduces DoRA, which decomposes weights into magnitude and direction components. This provides:

  • Better fine-tuning stability
  • Improved task performance (+3-4% over standard LoRA)
  • More efficient parameter updates
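The decomposition can be sketched numerically. This is a minimal NumPy illustration of the idea, not the PEFT implementation:

```python
import numpy as np

# Sketch of DoRA's core idea: split a weight matrix into a per-column
# magnitude vector and a unit-norm direction matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))          # stands in for a pretrained weight

m = np.linalg.norm(W, axis=0)        # magnitude: column-wise L2 norms
V = W / m                            # direction: unit-norm columns

# DoRA fine-tunes the magnitude directly and applies a low-rank update
# (B @ A) to the direction; with a zero update, W is recovered exactly.
delta = np.zeros_like(W)             # stands in for B @ A at init
W_prime = m * (V + delta) / np.linalg.norm(V + delta, axis=0)
```

Because magnitude and direction are trained separately, the update can rescale a column without rotating it (or vice versa), which is the source of the stability claim above.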

rsLoRA (Rank-Stabilized LoRA)

Uses a rank-stabilized scaling factor (lora_alpha / sqrt(r) instead of lora_alpha / r), which provides:

  • More stable training at higher ranks
  • Better gradient flow
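For this model's hyperparameters (r=64, alpha=128), the two scaling factors work out as follows:

```python
import math

# LoRA update scaling for this model's settings (r=64, alpha=128).
r, alpha = 64, 128

standard_scale = alpha / r           # classic LoRA: 128 / 64 = 2.0
rs_scale = alpha / math.sqrt(r)      # rsLoRA: 128 / 8 = 16.0

# At higher ranks the classic factor decays as 1/r, shrinking the
# effective update; the rank-stabilized factor decays only as 1/sqrt(r).
```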

DPO (Direct Preference Optimization)

Two-stage training pipeline:

  1. SFT Stage: Supervised fine-tuning on 6,782 examples (40% general, 60% domain-specific)
  2. DPO Stage: Preference alignment on 1,400 pairs targeting common failure modes

DPO pairs target these failure categories:

  • Excessive leverage requests (26.4%)
  • Position sizing errors (23.6%)
  • Percentage confusion (16.1%)
  • Risk policy violations (13.9%)
  • Policy bypass attempts (10.0%)
  • Uncertainty/caution calibration (9.9%)
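A pair in the percentage-confusion category might look like the following. The schema and wording are hypothetical illustrations, not the actual training data:

```python
# Hypothetical DPO preference pair (illustrative schema and wording,
# not the actual training data) for the percentage-confusion category.
pair = {
    "prompt": "I have $10,000. Risk 2% on a long with entry $100, stop $95.",
    "chosen": (
        "Risking 2% means risking $200 of equity, not allocating $200. "
        "Risk per unit = $100 - $95 = $5, so size = $200 / $5 = 40 units."
    ),
    "rejected": (
        "2% of $10,000 is $200, so buy $200 worth: 2 units at $100."
    ),
}
```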

Performance (v0.3 → v0.4)

Overall Accuracy

| Metric | v0.3 | v0.4 | Delta |
|---|---|---|---|
| Graded Accuracy | 67.4% | 78.5% | +11.1% |
| Full Correct | 216/337 | 259/337 | +43 |

Per-Category Results

| Category | v0.3 | v0.4 | Delta | Notes |
|---|---|---|---|---|
| Parameter Validation | 93.3% | 100% | +6.7% | Perfect score |
| Edge Cases | 92.5% | 95.0% | +2.5% | |
| General Capability | 89.1% | 92.7% | +3.6% | No catastrophic forgetting |
| Position Sizing | 83.3% | 86.7% | +3.4% | |
| Adversarial % | 53.5% | 82.0% | +28.5% | Major improvement |
| Trading Mechanics | 80.0% | 80.0% | 0% | Maintained |
| Multi-step Reasoning | 31.3% | 41.0% | +9.7% | |
| Factual | 20.0% | 33.3% | +13.3% | Below target |
| API Structure | 27.5% | 10.8% | -16.7% | Regression |

Key Improvements in v0.4

  1. Adversarial Percentage Handling (+28.5%)

    • Model now correctly distinguishes between "risk 2%", "allocate 2%", and "2x leverage"
    • DPO pairs specifically targeting percentage confusion were highly effective
  2. Multi-step Reasoning (+9.7%)

    • Model shows intermediate calculation steps
    • Better at complex position sizing scenarios
  3. General Capability Retention (+3.6%)

    • 40% general instruction mix prevented catastrophic forgetting
    • Base model reasoning capabilities preserved
  4. Perfect Parameter Validation (100%)

    • Tick sizes, lot sizes, precision rules mastered
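The arithmetic behind the three phrasings in the adversarial category can be made concrete. A sketch using a $10,000 account, entry at $100, and stop at $95 (standard formulas, not the model's internal logic):

```python
# Worked comparison of the three phrasings the model must distinguish:
# account $10,000, entry $100, stop $95.
equity, entry, stop = 10_000.0, 100.0, 95.0

# "risk 2%": risk $200 of equity against the $5/unit stop distance
risk_units = (equity * 0.02) / (entry - stop)    # 40 units ($4,000 notional)

# "allocate 2%": spend $200 of equity at the entry price
alloc_units = (equity * 0.02) / entry            # 2 units ($200 notional)

# "2x leverage": control $20,000 notional at the entry price
lev_units = (equity * 2) / entry                 # 200 units
```

The three readings differ by two orders of magnitude in position size, which is why confusing them is a high-severity failure mode.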

Known Issues & Limitations

API Structure Regression (10.8%)

The model has limited knowledge of Hyperliquid-specific API fields:

  • Doesn't know abbreviated field names (a=asset, b=isBuy, s=size)
  • May use incorrect base URL (.net vs .xyz)
  • Invents non-existent endpoints

Mitigation: Use explicit API documentation in prompts or constrained decoding.
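One way to apply this mitigation is to inline the field mapping in the system prompt so the model never has to recall it. A sketch; the mapping uses the abbreviated names noted above, while the prompt wording and user message are illustrative:

```python
# Inject the Hyperliquid field mapping into the system prompt rather
# than relying on the model's recall. The prompt text is illustrative.
API_FIELDS = {"a": "asset", "b": "isBuy", "s": "size"}

field_doc = "\n".join(f"{k} = {v}" for k, v in API_FIELDS.items())
messages = [
    {"role": "system",
     "content": "You are a trading assistant for Hyperliquid.\n"
                "Use these order field names exactly:\n" + field_doc},
    {"role": "user", "content": "Build a buy order for 40 units of ETH."},
]
```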

Factual Knowledge Gaps (33.3%)

Some Hyperliquid-specific facts are unreliable:

  • API URLs, WebSocket endpoints
  • Time-in-force options (ALO, IOC, GTC)
  • Fee structures, unstaking duration

Mitigation: Provide facts in system prompt for critical operations.

Multi-step Final Answer Extraction

The model sometimes returns an intermediate value instead of the final answer. When the calculation steps are correct but the stated final answer is wrong:

  • Verify the calculation steps manually
  • Extract the correct value from the reasoning
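Extraction can be automated with a simple heuristic: prefer an explicit "Final Answer: X" line (the format v0.5 plans to enforce), else fall back to the last number in the reasoning. An illustrative sketch, not part of the model:

```python
import re

# Heuristic answer extraction: prefer an explicit "Final Answer: X"
# line, otherwise take the last number that appears in the reasoning.
def extract_answer(text):
    m = re.search(r"Final Answer:\s*\$?(-?[\d,]+(?:\.\d+)?)", text)
    if m:
        return float(m.group(1).replace(",", ""))
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(nums[-1]) if nums else None
```

The last-number fallback assumes the correct value appears at the end of the reasoning chain, so it should be paired with the manual verification step above for critical calculations.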

Usage

With PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype="auto",
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

messages = [
    {"role": "system", "content": "You are a trading assistant for Hyperliquid."},
    {"role": "user", "content": "I have $10,000 and want to risk 2%. Entry at $100, stop at $95. What's my position size?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature to take effect; keep it low
# so arithmetic stays near-deterministic
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With vLLM (Recommended for Production)

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="Qwen/Qwen3-4B-Instruct-2507",
    enable_lora=True,
    max_lora_rank=64
)

# vLLM attaches LoRA adapters per request via LoRARequest rather than a
# global load; the adapter weights may need to be downloaded locally
# first (e.g. with huggingface_hub.snapshot_download)
sampling_params = SamplingParams(temperature=0.1, max_tokens=512)
outputs = llm.generate(
    ["Calculate position size..."],
    sampling_params,
    lora_request=LoRARequest("hyperllm", 1, "UVLabs/HyperLLM-4b")
)

Training Details

SFT Phase

  • Dataset: 6,782 examples (6,103 train / 679 val)
  • Epochs: 1.57 (early stopping triggered)
  • Final Eval Loss: 0.1324
  • Runtime: 66.8 minutes

DPO Phase

  • Dataset: 1,400 preference pairs (1,260 train / 140 val)
  • Beta: 0.05 (gentler KL penalty than v0.3's 0.1)
  • Epochs: 2.0
  • Final Reward Accuracy: 100%
  • Reward Margin: 11.30
  • Runtime: 29.8 minutes

Infrastructure

  • Unsloth 2x acceleration
  • Liger Kernel optimizations
  • TF32 enabled for A100
  • Padding-free training

Roadmap for v0.5

  1. Fix API Structure: Add 300+ API-specific training examples with correct field mappings
  2. Improve Factual Knowledge: Implement fact repetition (50+ variations per fact)
  3. Better Final Answer Extraction: Enforce "Final Answer: X" format
  4. Market Knowledge Injection: Add technical indicator and price action knowledge

Citation

@misc{hyperllm2026,
  title={HyperLLM: Fine-tuned LLM for Agentic Trading on Hyperliquid},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}

License

Apache 2.0

Disclaimer

This model is for research and educational purposes. It is not financial advice. Always verify calculations and consult qualified professionals before making trading decisions. The authors are not responsible for any losses incurred from using this model.
