fallen2

Passthrough merge of TheDrummer/Fallen-Llama-3.3-R1-70B-v1 with itself.

Merge Details

Merge Method

This model was merged using the Passthrough merge method.

Models Merged

The following models were included in the merge:

  • /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213

Configuration

The following YAML configuration was used to produce this model:

dtype: bfloat16
merge_method: passthrough
modules:
  default:
    slices:
    - sources:
      - layer_range: [0, 4]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [2, 4]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [4, 8]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [6, 8]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [8, 12]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [10, 12]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [12, 16]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [14, 16]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [16, 20]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [18, 20]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [20, 24]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [22, 24]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [24, 28]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [26, 28]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [28, 32]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [30, 32]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [32, 36]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [34, 36]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [36, 40]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [38, 40]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [40, 44]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [42, 44]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [44, 48]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [46, 48]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [48, 52]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [50, 52]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [52, 56]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [54, 56]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [56, 60]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [58, 60]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [60, 64]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [62, 64]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [64, 68]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [66, 68]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [68, 72]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [70, 72]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [72, 76]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [74, 76]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
    - sources:
      - layer_range: [76, 80]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
    - sources:
      - layer_range: [78, 80]
        model: /workspace/cache/models--TheDrummer--Fallen-Llama-3.3-R1-70B-v1/snapshots/c88ee563196321458e6e46031231143c86394213
        parameters:
          scale:
          - filter: o_proj
            value: 0.0
          - filter: down_proj
            value: 0.0
          - value: 1.0
Downloads last month
-
Safetensors
Model size
105B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for bruhzair/Fallen-x-Fallen-L3.3-105b

Quantizations
2 models