Nix2.5-plus

I still recommend the normal Nix2.5.

Model Description

This is a merged model, Nix2.5-plus, created using mergekit's slerp (Spherical Linear Interpolation) method. It combines the strengths of ray0rf1re/Nix2.5 and ray0rf1re/Nix1.5 to potentially offer improved performance or a different balance of capabilities.

Merge Details

Nix2.5-plus is a merge of ray0rf1re/Nix2.5 and ray0rf1re/Nix1.5, created with the slerp merge method from mergekit. ray0rf1re/Nix2.5 served as the base model.

The default interpolation parameter is t = 0.275, meaning ray0rf1re/Nix1.5 contributes approximately 27.5% and ray0rf1re/Nix2.5 approximately 72.5% to most merged tensors. Self-attention and MLP weights instead follow per-layer t gradients (see the configuration below), so their blend varies across the 32 layers.
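As a rough illustration of what slerp does to a pair of weight tensors (a minimal sketch, not mergekit's actual implementation, which also handles magnitudes and edge cases differently):

```python
import numpy as np

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors.

    t = 0 returns v0, t = 1 returns v1; t = 0.275 blends ~27.5% of v1 in,
    following the arc on the hypersphere rather than the straight line.
    """
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two directions
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return s0 * v0 + s1 * v1
```

Unlike plain linear interpolation, slerp preserves the geometry of the weight space: for unit vectors, the result stays on the unit sphere at every t.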

Configuration

slices:
  - sources:
      - model: ray0rf1re/Nix2.5
        layer_range: [0, 32]
      - model: ray0rf1re/Nix1.5
        layer_range: [0, 32]
merge_method: slerp
base_model: ray0rf1re/Nix2.5
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.275
dtype: bfloat16
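The gradient lists above assign a different t to self-attention and MLP weights at each layer, with 0.275 as the fallback for all other tensors. A minimal sketch of how such a schedule could be resolved per tensor (the exact anchoring of the gradient points across layers is an assumption here, not mergekit's verified behavior):

```python
import numpy as np

# Values taken from the configuration above.
SELF_ATTN_GRAD = [0, 0.5, 0.3, 0.7, 1]
MLP_GRAD = [1, 0.5, 0.7, 0.3, 0]
DEFAULT_T = 0.275
NUM_LAYERS = 32

def resolve_t(tensor_name, layer_idx):
    """Pick the interpolation parameter t for one tensor.

    The gradient anchors are assumed to be spread evenly from the first
    to the last layer, with linear interpolation in between.
    """
    frac = layer_idx / (NUM_LAYERS - 1)
    anchors = np.linspace(0.0, 1.0, len(SELF_ATTN_GRAD))
    if "self_attn" in tensor_name:
        return float(np.interp(frac, anchors, SELF_ATTN_GRAD))
    if "mlp" in tensor_name:
        return float(np.interp(frac, anchors, MLP_GRAD))
    return DEFAULT_T
```

To reproduce a merge from a config like this, mergekit provides the `mergekit-yaml` CLI (e.g. `mergekit-yaml config.yml ./output-dir`); see the mergekit documentation for the available options.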

Usage

To use this model, load it with the transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ray0rf1re/Nix2.5-plus"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The merge was produced in bfloat16; loading in the same dtype avoids an upcast.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="bfloat16")

# Example usage (adjust the prompt and generation settings as needed)
input_text = "Hello, my name is"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training Data

This merged model leverages the training data of its constituent models: ray0rf1re/Nix2.5 and ray0rf1re/Nix1.5. Please refer to the respective model cards for details on their training datasets.

Limitations

As a merged model, its performance and biases are inherited from its base models. Thorough evaluation is recommended for specific use cases. Merged models may sometimes exhibit unexpected behaviors or a degradation in certain tasks compared to their individual components.

Model size: 3B parameters (Safetensors, F16 tensors)
