Nix2.5-plus
I still recommend the regular Nix2.5.
Model Description
This is a merged model, Nix2.5-plus, created using mergekit's slerp (Spherical Linear Interpolation) method. It combines the strengths of ray0rf1re/Nix2.5 and ray0rf1re/Nix1.5 to potentially offer improved performance or a different balance of capabilities.
Merge Details
Nix2.5-plus is a merge of the following models using the slerp merge method from mergekit:

- ray0rf1re/Nix2.5 (base model)
- ray0rf1re/Nix1.5

The merge uses a default t parameter of 0.275, so ray0rf1re/Nix1.5 contributes approximately 27.5% and the base model ray0rf1re/Nix2.5 approximately 72.5% to the merged weights. The self_attn and mlp layers additionally use per-layer t gradients, as shown in the configuration.
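To make the t weighting concrete, spherical linear interpolation between two weight tensors can be sketched as below. This is a minimal illustration only, not mergekit's actual implementation; the function name `slerp` and the flatten-then-reshape approach are assumptions for the sketch.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherically interpolate between tensors a and b; t=0 returns a, t=1 returns b."""
    a_flat, b_flat = a.ravel().astype(float), b.ravel().astype(float)
    a_n = a_flat / (np.linalg.norm(a_flat) + eps)
    b_n = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two (normalized) tensors
    if omega < 1e-6:
        # Nearly parallel: fall back to plain linear interpolation
        out = (1 - t) * a_flat + t * b_flat
    else:
        so = np.sin(omega)
        out = (np.sin((1 - t) * omega) / so) * a_flat + (np.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape)

# With t = 0.275, the result stays closer to the first (base) tensor
x = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([[0.0, 1.0], [1.0, 0.0]])
merged = slerp(0.275, x, y)
```

Unlike plain linear interpolation, slerp follows the arc between the two weight directions, which is often argued to better preserve the geometry of each model's parameters.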
Configuration
```yaml
slices:
  - sources:
      - model: ray0rf1re/Nix2.5
        layer_range: [0, 32]
      - model: ray0rf1re/Nix1.5
        layer_range: [0, 32]
merge_method: slerp
base_model: ray0rf1re/Nix2.5
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.275
dtype: bfloat16
```
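The value lists under the self_attn and mlp filters act as a gradient across the layer range, giving each of the 32 layers its own t. A rough sketch of that mapping is below; the exact scheme (piecewise-linear interpolation) and the helper name `layer_t` are assumptions, so consult the mergekit documentation for the precise behavior.

```python
import numpy as np

def layer_t(gradient, num_layers):
    """Interpolate a short gradient list into one t value per layer."""
    # Spread the gradient anchor points evenly across the layer indices
    anchors = np.linspace(0, num_layers - 1, num=len(gradient))
    return np.interp(np.arange(num_layers), anchors, gradient)

# self_attn gradient from the configuration above, over 32 layers
ts = layer_t([0, 0.5, 0.3, 0.7, 1], 32)
```

With this reading, early attention layers lean toward the base model (t near 0) while the final layers lean toward Nix1.5 (t near 1), and the mlp gradient is the mirror image.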
Usage
To use this model, you can load it with the transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ray0rf1re/Nix2.5-plus"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage (adjust as needed)
input_text = "Hello, my name is"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
Training Data
This merged model leverages the training data of its constituent models: ray0rf1re/Nix2.5 and ray0rf1re/Nix1.5. Please refer to the respective model cards for details on their training datasets.
Limitations
As a merged model, its performance and biases are inherited from its base models. Thorough evaluation is recommended for specific use cases. Merged models may sometimes exhibit unexpected behaviors or a degradation in certain tasks compared to their individual components.