---
license: apache-2.0
base_model:
- openlm-research/open_llama_7b
- stabilityai/StableBeluga-7B
tags:
- merge
- mergekit
- lazymergekit
- open_llama
- StableBeluga
- slerp
---

# OpenLlama-Stable-7B

This model is a merge of pre-trained language models created with [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing). It combines the base capabilities of OpenLM Research's Open Llama with Stability AI's StableBeluga through a SLERP merge.

## About Me

I'm David Soeiro-Vuong, a third-year Computer Science student working as an apprentice at TW3 Partners, a company specializing in Generative AI. I am passionate about artificial intelligence and language model optimization, and I focus on creating efficient model merges that balance performance and capabilities.

🔗 [Connect with me on LinkedIn](https://www.linkedin.com/in/david-soeiro-vuong-a28b582ba/)

## Merge Details

### Merge Method

This model uses SLERP (Spherical Linear Interpolation) with a separate interpolation factor `t` for each parameter group. At `t = 0` the merge returns the base model (Open Llama), and at `t = 1` it returns StableBeluga:

- **Attention Layers**: `t = 0.7`, favoring StableBeluga's instruction-following capabilities
- **MLP Layers**: `t = 0.5`, an equal blend of both models for balanced reasoning
- **Other Parameters**: `t = 0.6`, slightly favoring StableBeluga's refinements
- **Format**: bfloat16 precision for efficient memory usage
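
To make the interpolation factors above concrete, here is a minimal NumPy sketch of SLERP between two weight tensors. It is an illustration under stated assumptions, not mergekit's actual implementation: the function name `slerp`, the `eps` guard, and the `0.9995` colinearity threshold are all choices made for this example.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between tensors v0 (t=0) and v1 (t=1).

    Illustrative sketch only; the threshold and eps are assumptions.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)

    # Measure the angle between the flattened, normalized tensors.
    u0 = v0.ravel() / (np.linalg.norm(v0) + eps)
    u1 = v1.ravel() / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)

    # Nearly colinear tensors: fall back to plain linear interpolation
    # to avoid dividing by a vanishing sin(theta).
    if abs(dot) > 0.9995:
        return (1.0 - t) * v0 + t * v1

    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1


# Toy stand-ins for one attention weight from each model.
base_weight = np.array([1.0, 0.0])
other_weight = np.array([0.0, 1.0])
print(slerp(0.7, base_weight, other_weight))
```

With `t = 0.7` the result lies closer to the second tensor, which is how the attention setting leans toward StableBeluga while the MLP setting of `0.5` sits midway between the two models.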

### Models Merged

* [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) - An open-source reproduction of Meta's LLaMA that offers strong base capabilities
* [stabilityai/StableBeluga-7B](https://huggingface.co/stabilityai/StableBeluga-7B) - StabilityAI's instruction-tuned variant offering improved instruction following and coherence

### Configuration

```yaml
slices:
  - sources:
      - model: openlm-research/open_llama_7b
        layer_range: [0, 32]
      - model: stabilityai/StableBeluga-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: openlm-research/open_llama_7b
parameters:
  t:
    # Attention layers: favor StableBeluga (0.7)
    - filter: self_attn
      value: 0.7
    # MLP layers: balanced blend
    - filter: mlp
      value: 0.5
    # Everything else
    - value: 0.6
dtype: bfloat16
```
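
The sketch below shows one way to read the `t` schedule in this configuration: which interpolation factor each parameter ends up with. It assumes that a `filter` is matched as a substring of the parameter name and that the entry without a filter acts as the default. `T_SCHEDULE` and `t_for_tensor` are hypothetical names for this illustration, and mergekit's own resolution logic may differ in detail.

```python
# Hypothetical mirror of the `t` schedule from the YAML configuration.
T_SCHEDULE = [
    {"filter": "self_attn", "value": 0.7},
    {"filter": "mlp", "value": 0.5},
    {"value": 0.6},  # no filter: default for everything else
]


def t_for_tensor(name, schedule=T_SCHEDULE):
    """Return the interpolation factor for a parameter name (first match wins).

    Assumes substring matching on the parameter name.
    """
    for rule in schedule:
        pattern = rule.get("filter")
        if pattern is None or pattern in name:
            return rule["value"]
    raise KeyError(f"no rule matches {name!r}")


# Parameter names follow the Llama layout used by transformers.
for name in [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.input_layernorm.weight",
    "model.embed_tokens.weight",
]:
    print(f"{name}: t = {t_for_tensor(name)}")
```

Under these assumptions, layer norms and embeddings fall through to the default of `0.6`, which is the "Other Parameters" group described under Merge Method.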

## Model Capabilities

This merge combines:
- Open Llama's strong foundational knowledge and reasoning
- StableBeluga's improved instruction following and coherence
- Fully open architecture with no usage restrictions

The merge is aimed at tasks that require both strong reasoning and good instruction following, such as:
- Detailed explanations of complex concepts
- Creative writing with coherent structure
- Problem-solving with step-by-step reasoning
- Balanced factual responses with nuanced perspectives

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "david-sv/OpenLlama-Stable-7B"  # Replace with your actual HF username
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# For chat completions
prompt = """<human>: Explain the concept of spherical linear interpolation (SLERP) and why it's useful for merging language models.

<assistant>:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,  # pass input_ids together with the attention mask
    max_new_tokens=512,
    do_sample=True,  # required for temperature and top_p to take effect
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Limitations

- Inherits limitations from both base models
- May exhibit inconsistent behavior for certain complex reasoning tasks
- No additional alignment or fine-tuning beyond the base models' training
- Model was created through parameter merging without additional training data

## License

This model is released under the Apache 2.0 license, consistent with the underlying models' licenses.