---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- ray0rf1re/Nix2.5
- ray0rf1re/Nix1.5
---

# Nix2.5-plus

I still recommend the regular Nix2.5.

## Model Description

This is a merged model, `Nix2.5-plus`, created with `mergekit`'s `slerp` (spherical linear interpolation) method. It combines the strengths of `ray0rf1re/Nix2.5` and `ray0rf1re/Nix1.5`, potentially offering improved performance or a different balance of capabilities.

## Merge Details

`Nix2.5-plus` is a merge of the following models using the `slerp` merge method from [mergekit](https://github.com/cg123/mergekit):

* [ray0rf1re/Nix2.5](https://huggingface.co/ray0rf1re/Nix2.5)
* [ray0rf1re/Nix1.5](https://huggingface.co/ray0rf1re/Nix1.5)

The interpolation factor `t` varies by parameter type: self-attention and MLP weights follow graded per-layer schedules, while all remaining parameters use a default `t` of `0.275`. Since `t` measures how far the interpolation moves from the base model toward the second model, at the default value `ray0rf1re/Nix1.5` contributes roughly 27.5% and `ray0rf1re/Nix2.5` roughly 72.5% to the merged weights. `ray0rf1re/Nix2.5` was used as the base model for the slerp merge.

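For intuition, slerp interpolates along the great-circle arc between two weight directions rather than along a straight line, which preserves the magnitude structure of the weights better than plain averaging. The following is a minimal NumPy sketch of the idea only; it is not mergekit's actual implementation:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0 (the base model); t=1 returns v1.
    """
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two weight directions
    if theta < eps:
        # Nearly parallel vectors: fall back to ordinary linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# At t = 0.275 the result stays ~72.5% of the way toward the base direction
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
merged = slerp(0.275, a, b)
```

For unit vectors, the slerp result stays on the unit sphere, unlike a straight linear blend.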
## ⚙ Configuration

```yaml
slices:
  - sources:
      - model: ray0rf1re/Nix2.5
        layer_range: [0, 32]
      - model: ray0rf1re/Nix1.5
        layer_range: [0, 32]
merge_method: slerp
base_model: ray0rf1re/Nix2.5
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.275
dtype: bfloat16
```

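The graded `value` lists are spread across the 32-layer range, so `t` ramps from the first anchor at the bottom of the stack to the last anchor at the top. A rough sketch of how such a schedule could be expanded into per-layer values (the evenly spaced anchors here are an assumption for illustration, not mergekit's exact code):

```python
import numpy as np

def layer_t(values, num_layers):
    """Expand a short list of anchor values into one t per layer.

    Assumes anchors are evenly spaced across the layer range (hypothetical
    helper; mergekit's real interpolation may differ).
    """
    anchors = np.linspace(0, num_layers - 1, num=len(values))
    return np.interp(np.arange(num_layers), anchors, values)

self_attn_t = layer_t([0, 0.5, 0.3, 0.7, 1], 32)  # ramps 0 -> 1
mlp_t = layer_t([1, 0.5, 0.7, 0.3, 0], 32)        # ramps 1 -> 0
```

Note that the two schedules mirror each other: early layers take self-attention mostly from the base model and MLP mostly from Nix1.5, with the balance reversing toward the top of the stack.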
## Usage

To use this model, load it with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ray0rf1re/Nix2.5-plus"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage (adjust as needed)
input_text = "Hello, my name is"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Training Data

This merged model leverages the training data of its constituent models, `ray0rf1re/Nix2.5` and `ray0rf1re/Nix1.5`. Please refer to the respective model cards for details on their training datasets.

## Limitations

As a merged model, its performance and biases are inherited from its base models. Thorough evaluation is recommended for specific use cases. Merged models may sometimes exhibit unexpected behaviors or a degradation on certain tasks compared to their individual components.