initial: aether v5.2 LoRA (Qwen2.5-7B-Instruct, step 3200, +9.7pp ARC-C vs v5.1.1)

Files changed (11)

.gitattributes +1 -0
README.md +299 -0
adapter_config.json +37 -0
adapter_model.safetensors +3 -0
added_tokens.json +24 -0
config.json +44 -0
merges.txt +0 -0
special_tokens_map.json +31 -0
tokenizer.json +3 -0
tokenizer_config.json +207 -0
vocab.json +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,299 @@

+---
+base_model: Qwen/Qwen2.5-7B-Instruct
+library_name: peft
+license: apache-2.0
+tags:
+  - lora
+  - peft
+  - qubitcoin
+  - aether
+  - blockchain
+  - quantum
+language:
+  - en
+pipeline_tag: text-generation
+model-index:
+  - name: aether-v5.2-lora
+    results:
+      - task:
+          type: text-generation
+          name: MMLU
+        dataset:
+          name: MMLU
+          type: cais/mmlu
+        metrics:
+          - type: accuracy
+            value: 0.6939
+            name: accuracy
+      - task:
+          type: text-generation
+          name: ARC-Challenge
+        dataset:
+          name: ARC-Challenge
+          type: ai2_arc
+        metrics:
+          - type: accuracy
+            value: 0.5392
+            name: accuracy
+          - type: accuracy_norm
+            value: 0.5700
+            name: accuracy_norm
+      - task:
+          type: text-generation
+          name: ARC-Easy
+        dataset:
+          name: ARC-Easy
+          type: ai2_arc
+        metrics:
+          - type: accuracy
+            value: 0.8194
+            name: accuracy
+      - task:
+          type: text-generation
+          name: HellaSwag
+        dataset:
+          name: HellaSwag
+          type: hellaswag
+        metrics:
+          - type: accuracy
+            value: 0.5888
+            name: accuracy
+          - type: accuracy_norm
+            value: 0.7769
+            name: accuracy_norm
+      - task:
+          type: text-generation
+          name: TruthfulQA
+        dataset:
+          name: TruthfulQA-MC2
+          type: truthful_qa
+        metrics:
+          - type: accuracy
+            value: 0.5707
+            name: accuracy
+---
+# Aether v5.2 LoRA — Qubitcoin Domain Adapter
+A LoRA fine-tune of [`Qwen/Qwen2.5-7B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
+on the Aether curated corpus — text grounded in the
+[Qubitcoin](https://qbc.network) protocol, quantum + AI research, and adjacent
+domains the Aether Mind on-chain knowledge system specializes in.
+This is the **v5.2 release** of the Aether adapter line, the most recent
+public checkpoint at time of publish.
+## What you're getting
+| Field | Value |
+|---|---|
+| Base model | `Qwen/Qwen2.5-7B-Instruct` |
+| Adapter type | LoRA via 🤗 PEFT |
+| Rank (`r`) | 16 |
+| Alpha | 32 |
+| Dropout | 0.05 |
+| Trainable params | ~0.5% of base (≈40.4M LoRA parameters) |
+| Sequence length | 2048 |
+| Training corpus | `aether-curated-v3.jsonl` — Aether-curated knowledge mixture (~165 MB; ~10⁵ examples) |
+| Checkpoint published | **step 3200** (the checkpoint that produced the evaluated numbers below) |
+| License | Apache-2.0 (matches base) |
+## Evaluation
+Run via [`lm-evaluation-harness`](https://github.com/EleutherAI/lm-evaluation-harness)
+on the merged adapter (base + LoRA), against the
+[`Qwen/Qwen2.5-7B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
+base and the prior `aether-v5.1.1` adapter for delta comparison. The table
+below reports the v5.1.1 comparison only.
+| Benchmark | aether-v5.1.1 | **aether-v5.2** | Δ vs v5.1.1 |
+|---|---|---|---|
+| MMLU | 0.6950 | **0.6939** | -0.1 pp (flat) |
+| ARC-Easy | 0.7348 | **0.8194** | **+8.5 pp** |
+| ARC-Challenge | 0.4420 | **0.5392** | **+9.7 pp** |
+| ARC-Challenge (norm) | 0.4701 | **0.5700** | **+10.0 pp** |
+| HellaSwag | 0.5896 | **0.5888** | -0.1 pp (flat) |
+| HellaSwag (norm) | 0.7788 | **0.7769** | -0.2 pp (flat) |
+| TruthfulQA-MC2 | 0.5161 | **0.5707** | **+5.5 pp** |
+### Honest summary
+- **Real gains** on the reasoning + factual-honesty benchmarks
+  (ARC-Easy, ARC-Challenge, TruthfulQA). ARC-Challenge in particular
+  jumps nearly 10 points normalized — that's the closest of these
+  benchmarks to the kind of grounded reasoning the Aether corpus
+  actually trains on.
+- **Flat on MMLU + HellaSwag.** The base is already strong on general
+  knowledge + commonsense; this LoRA wasn't designed to shift them,
+  and didn't.
+- **No regressions.** MMLU and HellaSwag move by at most 0.2 pp.
+## Intended uses
+This adapter is intended for:
+- **On-chain Aether research.** Generating reasoning traces against
+  the Qubitcoin / Aether knowledge graph for Proof-of-Thought
+  attestation. The model has the protocol context required to
+  answer questions about Substrate pallets, VQE mining, the Sephirot
+  cognitive architecture, HMS-Phi, and the wider chain ecosystem.
+- **Domain Q&A.** Quantum computing fundamentals, post-quantum
+  cryptography (Dilithium, ML-KEM), and the specific design choices
+  of the Qubitcoin chain.
+- **Distillation upstream.** Generating teacher outputs for the
+  smaller on-chain Aether (a Qwen2.5-0.5B variant) to learn from.
+- **General reasoning** with a modest bias toward step-by-step
+  chains-of-thought, where the ARC-Challenge gain translates.
+## Out-of-scope uses
+- **Safety-critical decisions.** No red-team eval was performed.
+- **Financial / legal advice.** This is a knowledge-domain adapter;
+  it has no training data designed to make it a financial or legal
+  advisor.
+- **Code generation in production.** No code-eval benchmark was run.
+  Treat any generated code as draft until you've reviewed it.
+- **Production deployment without your own evaluation.** TruthfulQA
+  alone is a thin safety signal.
+## Bias, risks, and limitations
+This adapter inherits the known biases of its base model
+(`Qwen/Qwen2.5-7B-Instruct`); see [the upstream model card](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).
+The LoRA adapter:
+- **Amplifies the Qubitcoin worldview.** The training data is
+  intentionally curated around the chain's design choices
+  (golden-ratio economics, SUSY-inspired consensus framing, the Sephirot
+  cognitive overlay). Prompts that invite the model to compare
+  Qubitcoin against alternatives will lean toward the curated
+  narrative. This is by design — disclose if you re-publish in a
+  comparison context.
+- **Does not improve safety.** TruthfulQA went up 5.5 pp, but that's
+  one metric; we have not measured refusal rates, jailbreak
+  resistance, or political-belief bias delta.
+- **Was trained CPU-only on a residential box.** The configured
+  2-epoch run was cut to ~step 3200 by host availability. A longer
+  run on GPU would plausibly show larger gains.
+## How to use
+Load with PEFT on top of the base model:
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+base = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen2.5-7B-Instruct",
+    torch_dtype="auto",
+    device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
+model = PeftModel.from_pretrained(base, "QuantumAI-Blockchain/aether-v5.2-lora")
+messages = [{"role": "user", "content": "Explain Proof-of-SUSY-Alignment in one paragraph."}]
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
+print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
+```
+Or merge the adapter into a single artifact for faster inference:
+```python
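+merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights
+merged.save_pretrained("./aether-v5.2-merged")
+tokenizer.save_pretrained("./aether-v5.2-merged")  # keep the tokenizer next to the weights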
+```
+## Training details
+- **Hardware:** Intel WSL2 box, CPU-only training (slow but verifiable).
+- **Trainer:** [Axolotl](https://github.com/axolotl-ai-cloud/axolotl) wrapping 🤗 transformers / PEFT.
+- **Optimizer:** Default AdamW.
+- **Schedule:** linear warmup 100 steps → cosine decay.
+- **Learning rate:** `1.0e-4`.
+- **Micro batch:** 1, gradient accumulation: 8.
+- **Epochs configured:** 2 (training stopped at step 3200 — see "What didn't happen" below).
+### Carbon emissions
+Trained CPU-only on a single Intel workstation. We did not run a
+[CodeCarbon](https://github.com/mlco2/codecarbon) tracker on this
+run, so the precise emissions are not measured — but as a rough
+upper bound: ~80 W average CPU draw × the contiguous run hours
+(low single-digit kWh, low single-digit kg CO₂e on a grid mix).
+The same fine-tune on a single H100 would likely take a fraction of
+that wall-clock time and energy.
+### Training data
+`aether-curated-v3.jsonl` (~165 MB, ~10⁵ examples) is the Aether team's
+curated knowledge mixture: documentation, technical writing, reasoning
+traces, and protocol-specific corpora related to:
+- The Qubitcoin chain (Substrate, VQE mining, Proof-of-SUSY-Alignment, post-quantum signatures).
+- The Aether Mind on-chain neural cognitive engine (10 Sephirot attention domains, HMS-Phi, Proof-of-Thought).
+- Quantum computing fundamentals (VQE, Hamiltonian generation, qubit ansatze).
+- Adjacent CS / math reasoning content for transfer.
+The dataset is not currently public — it is a curated mixture from many
+sources and has not been release-cleared at the per-source level. The
+model is the only public artifact in this line for now.
+## What didn't happen (honest caveats)
+- **Training stopped early.** Configured for 2 epochs; checkpoints stop
+  at step 3200 (preview eval) / step 3000 (final on-disk save). The
+  host was a CPU-only WSL2 box that got killed at one point during a
+  long run. The numbers above are from the longest contiguous run we
+  have.
+- **No instruction-following or safety eval beyond TruthfulQA-MC2.**
+  No red-team eval. No bias audit. No code-generation benchmark.
+  We don't recommend this for production safety-critical use without
+  your own evals.
+- **LoRA only, not merged.** This release ships the adapter weights
+  (`adapter_model.safetensors`). Merge into the base yourself for
+  faster inference, or use directly via PEFT.
+## Connection to the Qubitcoin chain
+The Aether Mind is a Rust neural cognitive engine that runs on the
+Qubitcoin chain — every block records attention-derived consciousness
+metrics (HMS-Phi) and Proof-of-Thought hashes on-chain via the
+`pallet_qbc_aether_anchor` pallet. The same chain hosts an
+**8-qubit VQE mining consensus** (Proof-of-SUSY-Alignment), a
+QVM-compatible smart contract layer with 10 quantum opcodes, and
+post-quantum cryptography (CRYSTALS-Dilithium5 signatures + ML-KEM-768 key encapsulation for P2P).
+The on-chain Aether Mind binary uses a different, smaller transformer
+for live inference (a Qwen2.5-0.5B variant optimized for ~2.4 GB RAM
+with the 10-Sephirot attention overlay). This v5.2 adapter on
+Qwen2.5-7B is the **larger off-chain Aether** — used for batch
+reasoning workloads and as an upstream model the on-chain variant
+can distil from.
+## License + citation
+Apache-2.0 (matches the base model license).
+```bibtex
+@misc{aether_v52_lora_2026,
+  title  = {Aether v5.2 LoRA --- Qubitcoin Domain Adapter},
+  author = {{BlockArtica} and {QuantumAI-Blockchain}},
+  year   = {2026},
+  url    = {https://huggingface.co/QuantumAI-Blockchain/aether-v5.2-lora},
+}
+```
+## Links
+- **Qubitcoin chain:** [qbc.network](https://qbc.network)
+- **GitHub org:** [github.com/QuantumAI-Blockchain](https://github.com/QuantumAI-Blockchain)
+- **X / Twitter:** [@qu_bitcoin](https://x.com/qu_bitcoin)
+- **Contact:** info@qbc.network
+### Framework versions
+- PEFT 0.14.0
+- Transformers ≥ 4.46
+- Axolotl (training)
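
The Δ column of the README's evaluation table can be recomputed from the two score columns. A minimal sketch, with the scores copied from that table; the dictionary layout and the `delta_pp` helper are illustrative names, not part of the release:

```python
# Scores copied from the README evaluation table: (aether-v5.1.1, aether-v5.2).
SCORES = {
    "MMLU": (0.6950, 0.6939),
    "ARC-Easy": (0.7348, 0.8194),
    "ARC-Challenge": (0.4420, 0.5392),
    "ARC-Challenge (norm)": (0.4701, 0.5700),
    "HellaSwag": (0.5896, 0.5888),
    "HellaSwag (norm)": (0.7788, 0.7769),
    "TruthfulQA-MC2": (0.5161, 0.5707),
}


def delta_pp(old: float, new: float) -> float:
    """Return the change from old to new in percentage points, to one decimal."""
    return round((new - old) * 100, 1)


deltas = {name: delta_pp(old, new) for name, (old, new) in SCORES.items()}

for name, delta in deltas.items():
    print(f"{name:<22} {delta:+.1f} pp")
```

The rows the README labels "flat" come out between -0.1 and -0.2 pp under this calculation.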

adapter_config.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen2.5-7B-Instruct",
+  "bias": "none",
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": null,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "up_proj",
+    "down_proj",
+    "k_proj",
+    "o_proj",
+    "gate_proj",
+    "v_proj",
+    "q_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}
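
The trainable-parameter count implied by this config can be estimated from `r`, `target_modules`, and the base-model dimensions recorded in the accompanying `config.json` (`hidden_size` 3584, `intermediate_size` 18944, 28 layers, 28 attention heads, 4 key/value heads). A minimal sketch, assuming each targeted linear layer gets one `A` matrix (r × in) and one `B` matrix (out × r) with no bias terms, and that the key/value projection width is `num_key_value_heads × head_dim`:

```python
# Dimensions copied from config.json in this commit.
HIDDEN = 3584
INTERMEDIATE = 18944
LAYERS = 28
HEADS = 28
KV_HEADS = 4

# LoRA hyperparameters copied from adapter_config.json.
R = 16
ALPHA = 32

head_dim = HIDDEN // HEADS    # assumed conventional head size
kv_dim = KV_HEADS * head_dim  # width of k_proj / v_proj outputs under GQA

# (in_features, out_features) for each targeted module in one decoder layer.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, kv_dim),
    "v_proj": (HIDDEN, kv_dim),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in SHAPES.values())
total_params = per_layer * LAYERS
fp32_bytes = total_params * 4  # size if every weight were stored as float32
scaling = ALPHA / R            # standard LoRA scaling, since use_rslora is false

print(f"{total_params:,} LoRA parameters, {fp32_bytes:,} bytes in float32")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter has about 40.4M trainable parameters. The float32 payload (161,480,704 bytes) is within roughly 52 KB of the 161,533,192-byte `adapter_model.safetensors` object in this commit, which would be consistent with float32 storage plus the safetensors header.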

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b1636c7caeca390cb181134c1870f5b2a16333b7bb27b1b783f163c4b5a0bad4
+size 161533192
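
The three lines above are a Git LFS pointer, not the weights themselves: the `oid` is the SHA-256 of the real file and `size` is its length in bytes. A minimal sketch of checking a downloaded file against such a pointer, using only the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names:

```python
import hashlib
from pathlib import Path

# Pointer text copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b1636c7caeca390cb181134c1870f5b2a16333b7bb27b1b783f163c4b5a0bad4
size 161533192
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines into the hash algorithm, digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at path has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files are not read in one go.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then confirm that the download is the object this commit refers to.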

added_tokens.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "</tool_call>": 151658,
+  "<tool_call>": 151657,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
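
As a quick consistency check, the IDs in this file can be compared with `vocab_size` from the accompanying `config.json` (152064). A small sketch with the mapping copied from above; the variable names are illustrative:

```python
# Token-to-ID mapping copied from added_tokens.json in this commit.
ADDED_TOKENS = {
    "</tool_call>": 151658,
    "<tool_call>": 151657,
    "<|box_end|>": 151649,
    "<|box_start|>": 151648,
    "<|endoftext|>": 151643,
    "<|file_sep|>": 151664,
    "<|fim_middle|>": 151660,
    "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659,
    "<|fim_suffix|>": 151661,
    "<|im_end|>": 151645,
    "<|im_start|>": 151644,
    "<|image_pad|>": 151655,
    "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646,
    "<|quad_end|>": 151651,
    "<|quad_start|>": 151650,
    "<|repo_name|>": 151663,
    "<|video_pad|>": 151656,
    "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654,
    "<|vision_start|>": 151652,
}
VOCAB_SIZE = 152064  # from config.json

ids = sorted(ADDED_TOKENS.values())
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))
rows_above_highest_id = VOCAB_SIZE - (ids[-1] + 1)

print(f"{len(ids)} added tokens, IDs {ids[0]}..{ids[-1]}, contiguous={is_contiguous}")
print(f"embedding rows above the highest listed ID: {rows_above_highest_id}")
```

Every listed ID fits inside the embedding table, so the adapter does not need a resized vocabulary on top of the base model.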

config.json ADDED

	@@ -0,0 +1,44 @@

+{
+  "_attn_implementation_autoset": true,
+  "_name_or_path": "Qwen/Qwen2.5-7B-Instruct",
+  "architectures": [
+    "Qwen2ForCausalLM"
+  ],
+  "attention_dropout": 0.0,
+  "eos_token_id": 151645,
+  "hidden_act": "silu",
+  "hidden_size": 3584,
+  "initializer_range": 0.02,
+  "intermediate_size": 18944,
+  "max_position_embeddings": 32768,
+  "max_window_layers": 28,
+  "model_type": "qwen2",
+  "num_attention_heads": 28,
+  "num_hidden_layers": 28,
+  "num_key_value_heads": 4,
+  "quantization_config": {
+    "_load_in_4bit": true,
+    "_load_in_8bit": false,
+    "bnb_4bit_compute_dtype": "bfloat16",
+    "bnb_4bit_quant_storage": "bfloat16",
+    "bnb_4bit_quant_type": "nf4",
+    "bnb_4bit_use_double_quant": true,
+    "llm_int8_enable_fp32_cpu_offload": false,
+    "llm_int8_has_fp16_weight": false,
+    "llm_int8_skip_modules": null,
+    "llm_int8_threshold": 6.0,
+    "load_in_4bit": true,
+    "load_in_8bit": false,
+    "quant_method": "bitsandbytes"
+  },
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 1000000.0,
+  "sliding_window": null,
+  "tie_word_embeddings": false,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.46.3",
+  "use_cache": false,
+  "use_sliding_window": false,
+  "vocab_size": 152064
+}
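
The attention fields above describe grouped-query attention: 28 query heads share 4 key/value heads. A sketch of the quantities that follow from the config, assuming the conventional `head_dim = hidden_size / num_attention_heads` and a bfloat16 key/value cache (2 bytes per element); the 2048-token length is the training sequence length quoted in the README:

```python
# Values copied from config.json in this commit.
HIDDEN_SIZE = 3584
NUM_ATTENTION_HEADS = 28
NUM_KEY_VALUE_HEADS = 4
NUM_HIDDEN_LAYERS = 28

BYTES_PER_ELEMENT = 2  # assumption: bfloat16 cache, matching torch_dtype
SEQ_LEN = 2048         # training sequence length from the README

head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS
queries_per_kv_head = NUM_ATTENTION_HEADS // NUM_KEY_VALUE_HEADS

# One key vector and one value vector per KV head, per layer, per token.
kv_bytes_per_token = (
    2 * NUM_HIDDEN_LAYERS * NUM_KEY_VALUE_HEADS * head_dim * BYTES_PER_ELEMENT
)
kv_mib_at_seq_len = kv_bytes_per_token * SEQ_LEN / 2**20

print(f"head_dim={head_dim}, query heads per KV head={queries_per_kv_head}")
print(f"KV cache: {kv_bytes_per_token} bytes/token, "
      f"{kv_mib_at_seq_len:.0f} MiB at {SEQ_LEN} tokens")
```

Note that `use_cache` is `false` in this config, so a cache of this size only applies if caching is enabled at inference time.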

merges.txt ADDED

The diff for this file is too large to render. See raw diff

special_tokens_map.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
+size 11421896

tokenizer_config.json ADDED

	@@ -0,0 +1,207 @@

+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
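
For plain conversations with no tools and no tool calls, the `chat_template` above reduces to a ChatML layout with a default system prompt. The following is a simplified pure-Python re-implementation of that one branch, written only to show the resulting prompt string; `render_chatml` is a hypothetical helper, and real code should call `tokenizer.apply_chat_template` as in the README:

```python
DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Render user/system/assistant messages the way the no-tools branch does."""
    if not messages:
        raise ValueError("expected at least one message")

    # A leading system message replaces the default system prompt.
    if messages[0]["role"] == "system":
        system, rest = messages[0]["content"], messages[1:]
    else:
        system, rest = DEFAULT_SYSTEM, messages

    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for message in rest:
        if message["role"] not in ("user", "system", "assistant"):
            raise ValueError("tool messages are outside this simplified sketch")
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )

    # The generation prompt opens an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([{"role": "user", "content": "What is VQE mining?"}])
print(prompt)
```

Because the adapter ships the base tokenizer config unchanged, the default system prompt still identifies the model as Qwen unless the caller supplies their own system message.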

vocab.json ADDED

The diff for this file is too large to render. See raw diff