---
license: apache-2.0
language:
- en
- multilingual
- de
- fr
- es
- zh
- ja
tags:
- merge
- uncensored
- unrestricted
- reasoning
- tool-use
- multimodal
- vision
- long-context
- transformers
- conversational
- instruction-following
- zero-shot
- few-shot
- code-generation
- text-generation
- summarization
- question-answering
- multi-task
- dialogue
datasets:
- openhermes-2.5
- ultrachat
- glaive-tool-call
- laion/OIG
- pubmed-qa
- stack-exchange-preferences-10k
- mmlu
- gsm8k
- openwebtext
- pile
metrics:
- mt-bench
- arena-hard
- mmlu-pro
- perplexity
- rouge
- bleu
- accuracy
- f1
- hits-at-1
- hits-at-5
base_model:
- cognitivecomputations/dolphin-2.9.3-yi-1.5-34b
- llava-hf/llava-v1.6-mistral-7b-hf
- firefly-llm/firefly-13b-tool
- google/flan-t5-xl
- facebook/opt-6.7b
- huggingface/gpt2
pipeline_tag: text-generation
library_name: transformers
inference: true
architecture:
- LlamaForCausalLM
- MistralForCausalLM
- GPT2LMHeadModel
tags_extended:
- arithmetic
- logical-reasoning
- commonsense-reasoning
- natural-language-understanding
- multi-step-reasoning
- summarization
- classification
- code-completion
- translation
- multimodal-vision
- tool-integration
capabilities:
- unrestricted-generation
- zero-shot
- few-shot
- multi-turn-dialogue
- instruction-following
- reasoning
- code-assistance
- knowledge-retrieval
- summarization
- long-context-handling
- multimodal-processing
limitations:
- may require large GPU for optimal performance
- some tasks may need fine-tuning
- performance depends on input prompt clarity
- model outputs are only as reliable as the training data
references:
- https://huggingface.co/docs
- https://huggingface.co/transformers/
- https://arxiv.org/abs/2005.14165
- https://arxiv.org/abs/2306.11343
- https://arxiv.org/abs/2307.00262
evaluation:
- mmlu
- gsm8k
- hellaswag
- openai-evals
- factuality-tests
- robustness-tests
---
# Shay
Shay is a merged large language model integrating advanced reasoning, tool-use, and multimodal capabilities.
It is designed for unrestricted, unfiltered, unbiased operation with support for long-context processing, multi-step reasoning, and instruction-following tasks.
Shay can perform text generation, summarization, code assistance, translation, and question answering.
## Merge Details
- Merge method: task_arithmetic
- Density: 0.71
- Weight: 0.55
- Normalization: enabled
- INT8 masking: enabled
- Dtype: bfloat16
- Max context tokens supported: 40k
- Max generation tokens recommended: 512
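
To make the `task_arithmetic` setting above more concrete, here is a minimal sketch of the general idea on toy weights: each fine-tuned model contributes a "task vector" (its weights minus the base weights), and the scaled task vectors are added back onto the base. This is not the exact procedure used to build Shay. The function name `task_arithmetic_merge`, the toy state dicts, and the reading of "normalization" as dividing each weight by the sum of weights are all assumptions made for illustration, and the density setting is not modeled here.

```python
def task_arithmetic_merge(base, finetuned, weights, normalize=True):
    """Toy task-arithmetic merge over dicts of parameter lists (hypothetical helper).

    merged = base + sum_i(scale_i * (finetuned_i - base)),
    where scale_i = w_i / sum(w) if normalize is True (an assumed
    interpretation of "normalization"), otherwise scale_i = w_i.
    """
    total = sum(weights)
    merged = {}
    for name, base_vals in base.items():
        out = list(base_vals)  # start from a copy of the base parameters
        for model, w in zip(finetuned, weights):
            scale = w / total if normalize else w
            for i, (ft, b) in enumerate(zip(model[name], base_vals)):
                out[i] += scale * (ft - b)  # add the scaled task vector
        merged[name] = out
    return merged


# Toy parameters standing in for real checkpoints (illustrative values only)
base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [2.0, 2.0]}
model_b = {"layer.weight": [1.0, 4.0]}

merged = task_arithmetic_merge(base, [model_a, model_b], weights=[0.55, 0.55])
unnormalized = task_arithmetic_merge(
    base, [model_a, model_b], weights=[0.55, 0.55], normalize=False
)
print(merged["layer.weight"])
print(unnormalized["layer.weight"])
```

With equal weights and normalization enabled, this sketch reduces to adding the average of the task vectors to the base; without normalization, each task vector is scaled by its raw weight.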
## Usage Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "your-username/Shay"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # Requires the flash-attn package and a supported GPU
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
    rope_scaling={"type": "dynamic", "factor": 10.0},
)

# Safe example prompt
prompt = """<|system|>
You are an intelligent, helpful assistant.
<|user|>
Write a detailed plan for organizing a community event with volunteers, budget, and timeline.
<|assistant|>
"""

# Prepare inputs
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
output = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=1.05,
    top_p=0.97,
    top_k=60,
    repetition_penalty=1.12,
    do_sample=True,
)

# Decode only the newly generated tokens so the prompt is not echoed back,
# regardless of whether the role markers are registered as special tokens
prompt_len = inputs["input_ids"].shape[1]
reply = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True).strip()
print(reply)
```