Upload HyperLLM v0.2

Files changed (11)

README.md +140 -89
adapter_config.json +38 -45
adapter_model.safetensors +2 -2
added_tokens.json +28 -0
chat_template.jinja +60 -60
merges.txt +0 -0
special_tokens_map.json +31 -0
tokenizer.json +2 -2
tokenizer_config.json +239 -29
training_args.bin +2 -2
vocab.json +0 -0

README.md CHANGED

@@ -1,140 +1,191 @@
 ---
 base_model: Qwen/Qwen3-4B-Instruct-2507
 library_name: peft
-license: apache-2.0
-pipeline_tag: text-generation
 tags:
-- lora
-- sft
 - trading
 - hyperliquid
-- position-sizing
-- finance
-language:
-- en
 ---
-# HyperLLM-4b-0.1
-A LoRA fine-tune of [Qwen3-4B-Instruct](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) specialized for position sizing calculations, parameter validation, and API operations on the [Hyperliquid](https://hyperliquid.xyz) decentralized exchange.
-> **Note:** This is an experimental pre-release (0.1). Known issues include parameter validation regression. See [Limitations](#limitations).
 ## Model Description
-HyperLLM-4b-0.1 is a domain-adapted assistant for cryptocurrency trading on Hyperliquid. It is designed to:
-- **Calculate position sizes** based on account balance, risk percentage, entry price, and stop loss
-- **Validate trading parameters** against exchange constraints (lot sizes, leverage limits, price precision)
-- **Generate correctly formatted API calls** for Hyperliquid's REST and WebSocket APIs
-- **Answer questions** about Hyperliquid mechanics, margin modes, and order types
-This model is designed to sit within a harness - It provides recommendations that should be validated by application-layer safety checks and human oversight.
-## Intended Use
-- Position sizing calculations for risk-managed trading
-- Parameter validation before order submission
-- API call generation for Hyperliquid integration
-- Educational tool for understanding Hyperliquid mechanics
-## Out-of-Scope Use
-- Autonomous trade execution without human oversight
-- Financial advice or investment recommendations
-- Use with exchanges other than Hyperliquid (parameters are exchange-specific)
-## Training Results
-| Category | Baseline | Trained | Change |
-|----------|----------|---------|--------|
-| Overall | 36% | 64% | +28% |
-| Trading Mechanics | 20% | 80% | +60% |
-| Factual Knowledge | 60% | 80% | +20% |
-| API Structure | 17% | 50% | +33% |
-| Parameter Validation | 75% | 50% | -25% |
-> **Note:** Parameter validation regressed due to data imbalance in v1 training. This is addressed in subsequent versions.
 ## Training Details
-- **Base Model:** Qwen/Qwen3-4B-Instruct-2507
-- **Method:** QLoRA (4-bit quantization + LoRA)
-- **LoRA Rank:** 32
-- **LoRA Alpha:** 64
-- **Training Examples:** ~2,400 synthetic examples
-- **Categories:** Position sizing, API examples, knowledge Q&A, adversarial percentages
-- **Hardware:** NVIDIA RTX 3080 (12GB VRAM)
-- **Training Time:** ~3 hours
-## How to Use
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel
-# Load base model
 base_model = AutoModelForCausalLM.from_pretrained(
     "Qwen/Qwen3-4B-Instruct-2507",
-    torch_dtype="auto",
     device_map="auto",
-    trust_remote_code=True
-)
-tokenizer = AutoTokenizer.from_pretrained(
-    "Qwen/Qwen3-4B-Instruct-2507",
-    trust_remote_code=True
 )
 # Load LoRA adapter
-model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b-0.1")
-# Example: Position sizing calculation
 messages = [
-    {"role": "system", "content": "You are a trading assistant for Hyperliquid. Calculate position sizes accurately and validate parameters against exchange constraints."},
-    {"role": "user", "content": "Account: $50,000, Risk: 2%, Entry: $45,000, Stop: $43,000. What is the correct position size for BTC?"}
 ]
 text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
 inputs = tokenizer(text, return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=512)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
-## Limitations
-- **Exchange-specific:** Trained on Hyperliquid data only; parameters may not apply to other exchanges
-- **Parameter validation regression:** v1 showed decreased accuracy on parameter validation (addressed in later versions)
-- **Not financial advice:** Outputs should be validated before use in live trading
-- **Requires safety layer:** Should be deployed with application-level position limits and human confirmation for large trades
-## Risks and Recommendations
-1. **Always validate calculations** before executing trades
-2. **Implement hard limits** in your application layer (e.g., max 10% position size, max 10x leverage)
-3. **Require human confirmation** for trades exceeding risk thresholds
-4. **Paper trade first** before using with real capital
-## License
-This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0), consistent with the base Qwen3 model license.
-### Qwen3 License Notice
-This model is a derivative of Qwen3-4B-Instruct, which is released under Apache 2.0 by Alibaba Cloud. See the [Qwen3 model card](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) for full details.
 ## Citation
 ```bibtex
 @misc{hyperllm2026,
-  title={HyperLLM: Domain-Adapted LLM for Hyperliquid Trading},
-  author={HyperLLM Team},
   year={2026},
-  url={https://huggingface.co/UVLabs/HyperLLM-4b-0.1}
 }
 ```
-## Version History
-| Version | Date | Notes |
-|---------|------|-------|
-| 0.1 | Feb 2026 | Experimental release, 36%→64% overall accuracy, parameter validation regression |

 ---
 base_model: Qwen/Qwen3-4B-Instruct-2507
 library_name: peft
+license: mit
+language:
+- en
 tags:
 - trading
 - hyperliquid
+- perpetuals
+- defi
+- lora
+- qlora
+datasets:
+- custom
+pipeline_tag: text-generation
 ---
+# HyperLLM-4b v0.2
+A specialized trading assistant fine-tuned for [Hyperliquid](https://hyperliquid.xyz), a perpetual futures DEX. Built on Qwen3-4B-Instruct using QLoRA.
 ## Model Description
+HyperLLM is designed to assist with Hyperliquid perpetual trading tasks including:
+- Position sizing calculations with proper risk management
+- Hyperliquid API request/response formatting
+- Parameter validation for trades
+- Hyperliquid-specific knowledge (order types, leverage limits, API endpoints)
+**This is a LoRA adapter** - you need to load it on top of the base model.
+## What's New in v0.2 (vs v0.1)
+| Change | v0.1 | v0.2 |
+|--------|------|------|
+| **Hardware** | Local consumer GPU | A100 80GB (RunPod) |
+| **Max Sequence Length** | 2048 | 4096 |
+| **Batch Size** | 1 | 4 |
+| **rsLoRA** | No | Yes |
+| **Flash Attention** | No | Yes |
+| **Early Stopping** | No | Yes (patience=3) |
+| **Training Precision** | fp16 | bf16 |
+| **Evaluation** | Basic | Comprehensive (297 questions) |
+### Key Improvements
+- **+46.7 points factual knowledge**: Accuracy on Hyperliquid-specific facts rose from 33.3% to 80.0%
+- **+6.7 points API structure**: Hyperliquid API request formatting improved from 35.8% to 42.5%
+- **+3.3 points position sizing**: Core trading calculations improved from 80.0% to 83.3%
+- **Longer context**: 4096 tokens vs 2048 for complex multi-step reasoning
+- **rsLoRA**: Rank-stabilized LoRA for better training stability
+### Known Regressions
+v0.2 exhibits some catastrophic forgetting compared to the base model:
+- Parameter validation: -20.0 points (73.3% vs 93.3% baseline)
+- Edge case handling: -17.5 points (75.0% vs 92.5% baseline)
+- Adversarial percentage questions: -12.5 points (36.9% vs 49.4% baseline)
+These will be addressed in v0.3 with replay data and DPO training.
 ## Training Details
+| Parameter | Value |
+|-----------|-------|
+| Base Model | Qwen/Qwen3-4B-Instruct-2507 |
+| LoRA Rank | 64 |
+| LoRA Alpha | 128 |
+| Dropout | 0.05 |
+| Learning Rate | 3e-5 |
+| Effective Batch Size | 8 |
+| Training Loss | 0.159 |
+| Token Accuracy | 95.5% |
+| Training Time | 26 minutes |
+| Hardware | NVIDIA A100 80GB |
+| Quantization | 4-bit NF4 (QLoRA) |
+### Target Modules
+- q_proj, k_proj, v_proj, o_proj (attention)
+- gate_proj, up_proj, down_proj (MLP)
+## Evaluation Results
+Tested on 297 questions across 9 categories:
+| Category | Score | vs Baseline |
+|----------|-------|-------------|
+| Factual Knowledge | 80.0% | **+46.7%** |
+| API Structure | 42.5% | +6.7% |
+| Position Sizing | 83.3% | +3.3% |
+| Trading Mechanics | 70.0% | -10.0% |
+| Parameter Validation | 73.3% | -20.0% |
+| Edge Cases | 75.0% | -17.5% |
+| General Capability | 83.6% | -7.3% |
+| Adversarial % | 36.9% | -12.5% |
+| Multi-step Reasoning | 24.0% | -3.0% |
+| **Overall** | **65.0%** | -5.2% |
+## Usage
+### With Transformers + PEFT
 ```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
 from peft import PeftModel
+import torch
+# Load base model with 4-bit quantization
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
 base_model = AutoModelForCausalLM.from_pretrained(
     "Qwen/Qwen3-4B-Instruct-2507",
+    quantization_config=bnb_config,
     device_map="auto",
 )
 # Load LoRA adapter
+model = PeftModel.from_pretrained(
+    base_model,
+    "UVLabs/HyperLLM-4b",
+    revision="v0.2"
+)
+tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b", revision="v0.2")
+# Example: Position sizing
 messages = [
+    {"role": "user", "content": "I have $10,000 and want to risk 2% on a BTC long at $50,000 with a stop at $48,000. What position size?"}
 ]
 text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
 inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
+### Without Quantization (More VRAM)
+```python
+import torch
+from transformers import AutoModelForCausalLM
+from peft import PeftModel
+base_model = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen3-4B-Instruct-2507",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b", revision="v0.2")
+```
+## Intended Use
+- Assisting with Hyperliquid perpetual trading calculations
+- Learning Hyperliquid API structure and parameters
+- Position sizing with risk management
+- Understanding Hyperliquid-specific concepts
+## Limitations
+- **Not financial advice**: This model is for educational/informational purposes only
+- **Verify calculations**: Always double-check position sizes and risk calculations
+- **Catastrophic forgetting**: Some general capabilities regressed vs base model
+- **Adversarial inputs**: Model can be confused by tricky percentage questions
+## License
+MIT
 ## Citation
 ```bibtex
 @misc{hyperllm2026,
+  title={HyperLLM: A Specialized Trading Assistant for Hyperliquid},
+  author={UVLabs},
   year={2026},
+  publisher={Hugging Face},
+  url={https://huggingface.co/UVLabs/HyperLLM-4b}
 }
 ```
+## Framework Versions
+- PEFT: 0.15.0
+- Transformers: 4.52.0
+- PyTorch: 2.7.0
+- bitsandbytes: 0.45.4
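
The position-sizing prompts in both README versions give an account balance, a risk percentage, an entry price, and a stop price. Assuming the usual rule (size equals the amount at risk divided by the stop distance), the arithmetic can be sketched as below, for example as an application-layer check on the model's answer. The function name `position_size` and its signature are hypothetical and not part of the model card; fees, leverage limits, and exchange lot-size rounding are ignored.

```python
def position_size(balance: float, risk_pct: float, entry: float, stop: float):
    """Return (size in base units, notional in quote units) for a risk-based position.

    Hypothetical helper: assumes size = (balance * risk%) / |entry - stop|.
    """
    stop_distance = abs(entry - stop)
    if stop_distance == 0:
        raise ValueError("entry and stop must differ")
    risk_amount = balance * risk_pct / 100.0  # quote currency put at risk
    size = risk_amount / stop_distance        # base units (e.g. BTC)
    return size, size * entry


# v0.2 README prompt: $10,000 balance, 2% risk, entry $50,000, stop $48,000
size, notional = position_size(10_000, 2, 50_000, 48_000)
print(f"size={size:.4f} BTC, notional=${notional:,.2f}")

# v0.1 README prompt: $50,000 balance, 2% risk, entry $45,000, stop $43,000
size_v01, _ = position_size(50_000, 2, 45_000, 43_000)
print(f"size={size_v01:.4f} BTC")
```

For the v0.2 prompt this works out to a $200 risk over a $2,000 stop distance, that is, roughly 0.1 BTC.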

adapter_config.json CHANGED

@@ -1,46 +1,39 @@
-{
-  "alora_invocation_tokens": null,
-  "alpha_pattern": {},
-  "arrow_config": null,
-  "auto_mapping": null,
-  "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
-  "bias": "none",
-  "corda_config": null,
-  "ensure_weight_tying": false,
-  "eva_config": null,
-  "exclude_modules": null,
-  "fan_in_fan_out": false,
-  "inference_mode": true,
-  "init_lora_weights": true,
-  "layer_replication": null,
-  "layers_pattern": null,
-  "layers_to_transform": null,
-  "loftq_config": {},
-  "lora_alpha": 64,
-  "lora_bias": false,
-  "lora_dropout": 0.05,
-  "megatron_config": null,
-  "megatron_core": "megatron.core",
-  "modules_to_save": null,
-  "peft_type": "LORA",
-  "peft_version": "0.18.1",
-  "qalora_group_size": 16,
-  "r": 32,
-  "rank_pattern": {},
-  "revision": null,
-  "target_modules": [
-    "q_proj",
-    "down_proj",
-    "v_proj",
-    "gate_proj",
-    "up_proj",
-    "o_proj",
-    "k_proj"
-  ],
-  "target_parameters": null,
-  "task_type": "CAUSAL_LM",
-  "trainable_token_indices": null,
-  "use_dora": false,
-  "use_qalora": false,
-  "use_rslora": false
 }

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "up_proj",
+    "v_proj",
+    "k_proj",
+    "gate_proj",
+    "q_proj",
+    "down_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_rslora": true
 }
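
The config change flips `use_rslora` to `true` while doubling both `r` and `lora_alpha`. In PEFT, rank-stabilized LoRA scales the adapter update by `lora_alpha / sqrt(r)` instead of `lora_alpha / r`, so the effective scaling factor changes even though the alpha-to-rank ratio stays at 2. A small sketch of that comparison follows; `lora_scaling` is an illustrative helper, not a PEFT function.

```python
import math


def lora_scaling(lora_alpha: float, r: int, use_rslora: bool) -> float:
    """Scaling applied to the low-rank update B @ A (illustrative helper)."""
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r                 # classic LoRA


v01 = lora_scaling(lora_alpha=64, r=32, use_rslora=False)
v02 = lora_scaling(lora_alpha=128, r=64, use_rslora=True)
print(f"v0.1 scaling: {v01}")  # alpha / r
print(f"v0.2 scaling: {v02}")  # alpha / sqrt(r)
```

With these values the v0.2 adapter update is scaled 8x more strongly than in v0.1, which is worth keeping in mind when comparing the learning rates or losses of the two runs.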

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fc10d854fabecd20bd82bc7314e73317dbd4dab1ac04996b51967232bea39c3e
-size 132188392

 version https://git-lfs.github.com/spec/v1
+oid sha256:e09b554fe2bded98e640b169e10f78a2bcb75946bdd6631f3786dde799ffb390
+size 528550256
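
The adapter file grows from 132,188,392 to 528,550,256 bytes, about 4x, while the rank only doubles. A rough size estimate helps explain the gap. The sketch below assumes the Qwen3-4B layer dimensions (hidden size 2560, MLP size 9728, 36 layers, 32 query heads and 8 key/value heads of dimension 128); these values are not stated in the diff and should be checked against the base model's `config.json`. The bytes-per-parameter values are likewise guesses being tested, not facts from the commit.

```python
# Assumed Qwen3-4B dimensions (not taken from this diff).
HIDDEN, INTERMEDIATE, LAYERS = 2560, 9728, 36
Q_OUT, KV_OUT = 32 * 128, 8 * 128

# (in_features, out_features) for each targeted projection.
SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(r: int) -> int:
    """LoRA adds A (r x in) and B (out x r) per module: r * (in + out) params."""
    return LAYERS * sum(r * (fan_in + fan_out) for fan_in, fan_out in SHAPES.values())


def lora_bytes(r: int, bytes_per_param: int) -> int:
    return lora_param_count(r) * bytes_per_param


est_v01 = lora_bytes(r=32, bytes_per_param=2)  # guess: 16-bit weights
est_v02 = lora_bytes(r=64, bytes_per_param=4)  # guess: 32-bit weights
print(est_v01, 132_188_392 - est_v01)
print(est_v02, 528_550_256 - est_v02)
```

Under these assumptions both estimates land within roughly 70 KB of the recorded sizes, with the remainder plausibly being the safetensors header. That would be consistent with v0.1 storing 2-byte weights and v0.2 storing 4-byte weights, although the diff itself does not state the dtype.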

added_tokens.json ADDED

	@@ -0,0 +1,28 @@

+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}

chat_template.jinja CHANGED

@@ -1,61 +1,61 @@
-{%- if tools %}
-    {{- '<|im_start|>system\n' }}
-    {%- if messages[0].role == 'system' %}
-        {{- messages[0].content + '\n\n' }}
-    {%- endif %}
-    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
-    {%- for tool in tools %}
-        {{- "\n" }}
-        {{- tool | tojson }}
-    {%- endfor %}
-    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
-{%- else %}
-    {%- if messages[0].role == 'system' %}
-        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
-    {%- endif %}
-{%- endif %}
-{%- for message in messages %}
-    {%- if message.content is string %}
-        {%- set content = message.content %}
-    {%- else %}
-        {%- set content = '' %}
-    {%- endif %}
-    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
-        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
-    {%- elif message.role == "assistant" %}
-        {{- '<|im_start|>' + message.role + '\n' + content }}
-        {%- if message.tool_calls %}
-            {%- for tool_call in message.tool_calls %}
-                {%- if (loop.first and content) or (not loop.first) %}
-                    {{- '\n' }}
-                {%- endif %}
-                {%- if tool_call.function %}
-                    {%- set tool_call = tool_call.function %}
-                {%- endif %}
-                {{- '<tool_call>\n{"name": "' }}
-                {{- tool_call.name }}
-                {{- '", "arguments": ' }}
-                {%- if tool_call.arguments is string %}
-                    {{- tool_call.arguments }}
-                {%- else %}
-                    {{- tool_call.arguments | tojson }}
-                {%- endif %}
-                {{- '}\n</tool_call>' }}
-            {%- endfor %}
-        {%- endif %}
-        {{- '<|im_end|>\n' }}
-    {%- elif message.role == "tool" %}
-        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
-            {{- '<|im_start|>user' }}
-        {%- endif %}
-        {{- '\n<tool_response>\n' }}
-        {{- content }}
-        {{- '\n</tool_response>' }}
-        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
-            {{- '<|im_end|>\n' }}
-        {%- endif %}
-    {%- endif %}
-{%- endfor %}
-{%- if add_generation_prompt %}
-    {{- '<|im_start|>assistant\n' }}
 {%- endif %}

+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if message.content is string %}
+        {%- set content = message.content %}
+    {%- else %}
+        {%- set content = '' %}
+    {%- endif %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
 {%- endif %}
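
For plain conversations without tools, the template above reduces to ChatML framing: each turn is wrapped in `<|im_start|>{role}\n ... <|im_end|>\n`, and an open assistant header is appended when `add_generation_prompt` is set. The following sketch mirrors only that path in plain Python so the expected prompt string can be inspected without loading the tokenizer. `render_chatml` is a hypothetical helper; it deliberately ignores the `tools`, `tool_calls`, and `tool` role branches.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the template's no-tools path (hypothetical helper)."""
    parts = []
    for message in messages:
        role = message["role"]
        content = message.get("content")
        # The template falls back to an empty string for non-string content.
        if not isinstance(content, str):
            content = ""
        if role in ("system", "user", "assistant"):
            parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
        # "tool" messages and assistant tool_calls are out of scope here.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "You are a trading assistant for Hyperliquid."},
    {"role": "user", "content": "What position size for 2% risk?"},
])
print(prompt)
```

In real use, `tokenizer.apply_chat_template(...)` as shown in the README remains the reliable way to build prompts; this sketch is only meant for reading the template.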

merges.txt ADDED

The diff for this file is too large to render. See raw diff

special_tokens_map.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
-size 11422650

 version https://git-lfs.github.com/spec/v1
+oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
+size 11422654

tokenizer_config.json CHANGED

@@ -1,29 +1,239 @@
-{
-  "add_prefix_space": false,
-  "backend": "tokenizers",
-  "bos_token": null,
-  "clean_up_tokenization_spaces": false,
-  "eos_token": "<|im_end|>",
-  "errors": "replace",
-  "extra_special_tokens": [
-    "<|im_start|>",
-    "<|im_end|>",
-    "<|object_ref_start|>",
-    "<|object_ref_end|>",
-    "<|box_start|>",
-    "<|box_end|>",
-    "<|quad_start|>",
-    "<|quad_end|>",
-    "<|vision_start|>",
-    "<|vision_end|>",
-    "<|vision_pad|>",
-    "<|image_pad|>",
-    "<|video_pad|>"
-  ],
-  "is_local": false,
-  "model_max_length": 1010000,
-  "pad_token": "<|endoftext|>",
-  "split_special_tokens": false,
-  "tokenizer_class": "Qwen2Tokenizer",
-  "unk_token": null
-}

+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 1010000,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9fc482d1a7a435ff60ed479644c45fe7c1409ffca63ecaa9803840a33769fd74
-size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:7cf6dcfb7fa01a043537d453752ebabff6db298fb689b4df160bb4e3b59dd414
+size 5688

vocab.json ADDED

The diff for this file is too large to render. See raw diff