---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- ray0rf1re/Nix2.5
- ray0rf1re/Nix1.5
---

# Nix2.5-plus

I still recommend the regular Nix2.5.

## Model Description

This is a merged model, `Nix2.5-plus`, created with `mergekit`'s `slerp` (spherical linear interpolation) method. It combines the strengths of `ray0rf1re/Nix2.5` and `ray0rf1re/Nix1.5`, potentially offering improved performance or a different balance of capabilities.

## Merge Details

`Nix2.5-plus` is a merge of the following models using the `slerp` merge method from [mergekit](https://github.com/cg123/mergekit):

* [ray0rf1re/Nix2.5](https://huggingface.co/ray0rf1re/Nix2.5)
* [ray0rf1re/Nix1.5](https://huggingface.co/ray0rf1re/Nix1.5)

The interpolation factor `t` varies by parameter type: self-attention and MLP weights follow graded per-layer schedules, while all remaining parameters use a default `t` of `0.275`. Since `t` measures how far the interpolation moves from the base model toward the second model, at the default value `ray0rf1re/Nix1.5` contributes roughly 27.5% and `ray0rf1re/Nix2.5` roughly 72.5% to the merged weights. `ray0rf1re/Nix2.5` was used as the base model for the slerp merge.

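For intuition, slerp interpolates along the great-circle arc between two weight directions rather than along a straight line, which preserves the magnitude structure of the weights better than plain averaging. The following is a minimal NumPy sketch of the idea only; it is not mergekit's actual implementation:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0 (the base model); t=1 returns v1.
    """
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two weight directions
    if theta < eps:
        # Nearly parallel vectors: fall back to ordinary linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# At t = 0.275 the result stays ~72.5% of the way toward the base direction
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
merged = slerp(0.275, a, b)
```

For unit vectors, the slerp result stays on the unit sphere, unlike a straight linear blend.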
## ⚙ Configuration

```yaml
slices:
  - sources:
      - model: ray0rf1re/Nix2.5
        layer_range: [0, 32]
      - model: ray0rf1re/Nix1.5
        layer_range: [0, 32]
merge_method: slerp
base_model: ray0rf1re/Nix2.5
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.275
dtype: bfloat16
```

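The graded `value` lists are spread across the 32-layer range, so `t` ramps from the first anchor at the bottom of the stack to the last anchor at the top. A rough sketch of how such a schedule could be expanded into per-layer values (the evenly spaced anchors here are an assumption for illustration, not mergekit's exact code):

```python
import numpy as np

def layer_t(values, num_layers):
    """Expand a short list of anchor values into one t per layer.

    Assumes anchors are evenly spaced across the layer range (hypothetical
    helper; mergekit's real interpolation may differ).
    """
    anchors = np.linspace(0, num_layers - 1, num=len(values))
    return np.interp(np.arange(num_layers), anchors, values)

self_attn_t = layer_t([0, 0.5, 0.3, 0.7, 1], 32)  # ramps 0 -> 1
mlp_t = layer_t([1, 0.5, 0.7, 0.3, 0], 32)        # ramps 1 -> 0
```

Note that the two schedules mirror each other: early layers take self-attention mostly from the base model and MLP mostly from Nix1.5, with the balance reversing toward the top of the stack.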
## Usage

To use this model, load it with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ray0rf1re/Nix2.5-plus"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage (adjust as needed)
input_text = "Hello, my name is"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Training Data

This merged model leverages the training data of its constituent models, `ray0rf1re/Nix2.5` and `ray0rf1re/Nix1.5`. Please refer to the respective model cards for details on their training datasets.

## Limitations

As a merged model, its performance and biases are inherited from its base models. Thorough evaluation is recommended for specific use cases. Merged models may sometimes exhibit unexpected behaviors or a degradation on certain tasks compared to their individual components.