---
license: apache-2.0
language:
- en
- multilingual
- de
- fr
- es
- zh
- ja
tags:
- merge
- uncensored
- unrestricted
- reasoning
- tool-use
- multimodal
- vision
- long-context
- transformers
- conversational
- instruction-following
- zero-shot
- few-shot
- code-generation
- text-generation
- summarization
- question-answering
- multi-task
- dialogue
datasets:
- openhermes-2.5
- ultrachat
- glaive-tool-call
- laion/OIG
- pubmed-qa
- stack-exchange-preferences-10k
- mmlu
- gsm8k
- openwebtext
- pile
metrics:
- mt-bench
- arena-hard
- mmlu-pro
- perplexity
- rouge
- bleu
- accuracy
- f1
- hits-at-1
- hits-at-5
base_model:
- cognitivecomputations/dolphin-2.9.3-yi-1.5-34b
- llava-hf/llava-v1.6-mistral-7b-hf
- firefly-llm/firefly-13b-tool
- google/flan-t5-xl
- facebook/opt-6.7b
- huggingface/gpt2
pipeline_tag: text-generation
library_name: transformers
inference: true
architecture:
- LlamaForCausalLM
- MistralForCausalLM
- GPT2LMHeadModel
tags_extended:
- arithmetic
- logical-reasoning
- commonsense-reasoning
- natural-language-understanding
- multi-step-reasoning
- summarization
- classification
- code-completion
- translation
- multimodal-vision
- tool-integration
capabilities:
- unrestricted-generation
- zero-shot
- few-shot
- multi-turn-dialogue
- instruction-following
- reasoning
- code-assistance
- knowledge-retrieval
- summarization
- long-context-handling
- multimodal-processing
limitations:
- may require large GPU for optimal performance
- some tasks may need fine-tuning
- performance depends on input prompt clarity
- model outputs are only as reliable as the training data
references:
- https://huggingface.co/docs
- https://huggingface.co/transformers/
- https://arxiv.org/abs/2005.14165
- https://arxiv.org/abs/2306.11343
- https://arxiv.org/abs/2307.00262
evaluation:
- mmlu
- gsm8k
- hellaswag
- openai-evals
- factuality-tests
- robustness-tests
---

# Shay

Shay is a merged large language model that integrates advanced reasoning, tool-use, and multimodal capabilities. It is designed for unrestricted, unfiltered, unbiased operation, with support for long-context processing, multi-step reasoning, and instruction-following tasks. Shay can perform text generation, summarization, code assistance, translation, and more.

## Merge Details

- Merge method: `task_arithmetic`
- Density: 0.71
- Weight: 0.55
- Normalization: enabled
- INT8 masking: enabled
- Dtype: bfloat16
- Max context tokens supported: 40k
- Max generation tokens recommended: 512

An illustrative merge recipe built from these values is sketched at the end of this card.

## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "your-username/Shay"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
    rope_scaling={"type": "dynamic", "factor": 10.0}
)

# Example prompt
prompt = """<|system|>
You are an intelligent, helpful assistant.
<|user|>
Write a detailed plan for organizing a community event with volunteers, budget, and timeline.
<|assistant|>
"""

# Prepare inputs
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
output = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=1.05,
    top_p=0.97,
    top_k=60,
    repetition_penalty=1.12,
    do_sample=True
)

# Decode the response and keep only the assistant turn
reply = tokenizer.decode(output[0], skip_special_tokens=True)
reply = reply.split("<|assistant|>")[-1].strip()
print(reply)
```
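The example above hard-codes `<|system|>`, `<|user|>`, and `<|assistant|>` markers in the prompt string. If the merged tokenizer ships a chat template (not guaranteed for a merge; inspect `tokenizer.chat_template` first), `apply_chat_template` can build the prompt for you. The snippet below is a minimal sketch under that assumption and reuses the `tokenizer` and `model` objects loaded above.

```python
# Sketch: only valid if the tokenizer defines a chat template
# (check tokenizer.chat_template); otherwise use the manual prompt format above.
messages = [
    {"role": "system", "content": "You are an intelligent, helpful assistant."},
    {"role": "user", "content": "Write a detailed plan for organizing a community event with volunteers, budget, and timeline."},
]

# Tokenize the conversation and append the assistant turn marker
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=1.05,
    top_p=0.97,
    do_sample=True,
)

# Decode only the newly generated tokens
reply = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```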
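As noted in the limitations metadata, the merged checkpoint may require a large GPU when loaded in bfloat16. A common workaround is 4-bit quantized loading via bitsandbytes. The snippet below is a sketch, not part of the official recipe; it assumes the `bitsandbytes` package is installed and a CUDA-capable GPU is available.

```python
# Sketch: 4-bit NF4 loading to reduce VRAM usage.
# Requires `pip install bitsandbytes` and a CUDA GPU.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "your-username/Shay"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```

Expect some quality degradation relative to bfloat16; the generation settings from the usage example can be reused unchanged.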
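For readers who want to reproduce or adapt the merge, the values listed under Merge Details map loosely onto a mergekit-style `task_arithmetic` recipe. This card does not state which tool produced the merge, so the sketch below is illustrative only: the model IDs are placeholders, the output file name is arbitrary, and `density` is only honoured by some merge methods.

```python
# Illustrative only: write a mergekit-style task_arithmetic recipe using the
# values from "Merge Details". Replace the placeholder model IDs with the real
# merge inputs before running, e.g., `mergekit-yaml shay-merge.yaml ./output`.
import yaml

merge_recipe = {
    "merge_method": "task_arithmetic",
    "base_model": "path/to/base-model",  # placeholder: shared base checkpoint
    "models": [
        {
            "model": "path/to/finetuned-model-a",  # placeholder
            "parameters": {"weight": 0.55, "density": 0.71},
        },
        {
            "model": "path/to/finetuned-model-b",  # placeholder
            "parameters": {"weight": 0.55, "density": 0.71},
        },
    ],
    "parameters": {"normalize": True, "int8_mask": True},
    "dtype": "bfloat16",
}

with open("shay-merge.yaml", "w") as f:
    yaml.safe_dump(merge_recipe, f, sort_keys=False)
```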