# Nyx: Core-Outline Transformer Model

Nyx is a transformer-based language model for text generation and understanding. It is part of the Core-Outline project and targets analytics text in five domains: financial, SaaS, social media, customer, and customer feedback data.
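## Model Architecture

Nyx uses a decoder-only transformer architecture with the following key components:

- **Rotary Position Embeddings (RoPE)**: Encode token positions as rotations of the query and key vectors
- **Multi-head Self-Attention**: With grouped-query attention for efficient inference (the published configuration sets 16 key-value heads for 16 attention heads, which is equivalent to standard multi-head attention)
- **SwiGLU Activation**: Gated activation used in the feed-forward networks
- **RMSNorm**: Root-mean-square normalization, used in place of standard layer normalization
- **Sliding Window Attention**: Limits each token's attention to a window of recent tokens so that longer sequences can be handled efficiently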
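
Two of the components listed above, RMSNorm and the SwiGLU feed-forward block, are small enough to sketch in plain Python. This is an illustrative sketch on Python lists, not Nyx's implementation; the function names, the toy weights, and the default `eps` of `1e-6` (taken from the `rms_norm_eps` entry in the configuration) are assumptions made for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then scale by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Toy sizes: model width 4, intermediate width 2 (hypothetical weights).
x = [1.0, -2.0, 3.0, -4.0]
normed = rms_norm(x, weight=[1.0] * 4)

w_gate = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
w_up = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
w_down = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
out = swiglu_ffn(normed, w_gate, w_up, w_down)
```

In a real model the three projection matrices would have shapes tied to the hidden size and intermediate size, and the computation would run on tensors rather than lists.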
### Model Specifications

| Parameter | Value |
|-----------|-------|
| Hidden Size | 1024 |
| Number of Layers | 24 |
| Number of Attention Heads | 16 |
| Number of Key-Value Heads | 16 |
| Intermediate Size | 2816 |
| Max Sequence Length | 32,768 tokens |
| Vocabulary Size | 151,936 |
| Activation | SwiGLU (SiLU) |
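
The specifications above are enough for a back-of-the-envelope estimate of parameter count and key-value cache size. The sketch below assumes no bias terms, two RMSNorm weight vectors per layer plus one final norm, and input and output embeddings that share weights by default; none of these details are stated in this README, so the result should be read as a rough estimate.

```python
HIDDEN = 1024
LAYERS = 24
HEADS = 16
KV_HEADS = 16
INTERMEDIATE = 2816
VOCAB = 151_936
MAX_SEQ = 32_768

HEAD_DIM = HIDDEN // HEADS      # 64
KV_DIM = KV_HEADS * HEAD_DIM    # 1024, since KV heads == attention heads


def estimate_parameters(tie_embeddings=True):
    """Rough parameter count (assumes no biases; see the note above)."""
    embed = VOCAB * HIDDEN
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o, k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down projections
    norms = 2 * HIDDEN                                # two RMSNorm weights per layer
    total = embed + LAYERS * (attn + mlp + norms) + HIDDEN  # + final norm
    if not tie_embeddings:
        total += VOCAB * HIDDEN                       # separate output head
    return total


def kv_cache_bytes(seq_len, bytes_per_value=2):
    """Key and value cache size for one sequence (2 bytes = fp16/bf16)."""
    return 2 * LAYERS * KV_DIM * seq_len * bytes_per_value


if __name__ == "__main__":
    print(f"parameters (tied):   {estimate_parameters():,}")
    print(f"parameters (untied): {estimate_parameters(False):,}")
    print(f"KV cache at max length: {kv_cache_bytes(MAX_SEQ) / 1024**3:.1f} GiB")
```

Under these assumptions the estimate comes to roughly 0.46 billion parameters with tied embeddings, and a full 32,768-token context needs about 3 GiB of 16-bit key-value cache per sequence.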
## Usage

### Prerequisites

- Python 3.11+
- PyTorch 2.0+
- Transformers library
- FastAPI (for API server)
### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "core-outline/nyx"

# trust_remote_code=True lets custom model code from the repository run locally.
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
# Uses the Qwen tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)
```
### Text Generation

```python
def generate_text(prompt, max_length=100, temperature=0.7):
    """Sample a continuation of `prompt` from the model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,  # needed because pad and EOS share an ID
        max_length=max_length,  # total length in tokens, prompt included
        temperature=temperature,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # outputs[0] holds the prompt tokens followed by the generated tokens
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```
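
To make the effect of the `temperature` argument concrete, the following sketch shows temperature-scaled sampling over a handful of logits in plain Python. It is a conceptual illustration only: the function names and the example logits are hypothetical, and the `generate` method in Transformers applies its own pipeline of logits processors.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.7):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.7, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.7)
flat = softmax_with_temperature(logits, temperature=1.5)
token = sample_token(logits, temperature=0.7, rng=random.Random(0))
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution and make the output more varied.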
## Model Configuration

The model uses the following key configuration parameters (from `config.json`):

```json
{
  "hidden_size": 1024,
  "intermediate_size": 2816,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "num_key_value_heads": 16,
  "max_position_embeddings": 32768,
  "rms_norm_eps": 1e-6,
  "rope_theta": 1000000.0
}
```
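
The `rope_theta` value sets the base of the rotary frequencies, and the head dimension follows from `hidden_size / num_attention_heads` = 64. The sketch below shows how those two numbers could produce the rotation applied to a query or key vector. It assumes the "rotate-half" pairing, in which element `i` is paired with element `i + head_dim / 2`, as used by common Llama-style implementations; the function names are illustrative and are not taken from Nyx's code.

```python
import math

HEAD_DIM = 1024 // 16       # hidden_size / num_attention_heads
ROPE_THETA = 1_000_000.0    # rope_theta from config.json


def rope_inverse_frequencies(head_dim=HEAD_DIM, theta=ROPE_THETA):
    """One rotation frequency per pair of dimensions: theta^(-2i / head_dim)."""
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(vec, position, inv_freq):
    """Rotate each pair (vec[i], vec[i + half]) by position * inv_freq[i]."""
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(half):
        angle = position * inv_freq[i]
        c, s = math.cos(angle), math.sin(angle)
        a, b = vec[i], vec[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


inv_freq = rope_inverse_frequencies()
q = [0.01 * (i + 1) for i in range(HEAD_DIM)]      # made-up query vector
k = [0.02 * (i % 7) - 0.05 for i in range(HEAD_DIM)]  # made-up key vector

# Attention scores depend only on the relative offset between positions.
score_a = dot(apply_rope(q, 5, inv_freq), apply_rope(k, 3, inv_freq))
score_b = dot(apply_rope(q, 12, inv_freq), apply_rope(k, 10, inv_freq))
```

Because each pair is rotated by an angle proportional to its position, the two scores above agree up to floating-point error: both query-key pairs are two positions apart.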
## Tokenizer

The model uses the Qwen tokenizer, a BPE-based (byte-pair encoding) tokenizer with a vocabulary of 151,936 tokens.
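
For readers unfamiliar with BPE, the toy sketch below learns a few merges from a tiny word-frequency table. It only illustrates the idea of repeatedly merging the most frequent adjacent pair; the corpus, the function names, and the character-level starting symbols are assumptions for the example, and a production tokenizer such as Qwen's ships with a fixed, pretrained merge table instead.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, or None if there is none."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus_counts, num_merges):
    """Learn up to `num_merges` merges, starting from single characters."""
    words = {tuple(word): freq for word, freq in corpus_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


# Hypothetical word counts from an analytics-flavoured corpus.
merges, words = learn_bpe({"churn": 5, "churned": 3, "burn": 2}, num_merges=3)
```

After three merges, the frequent fragment `urn` has become a single symbol shared by all three words, which is the behaviour that lets BPE represent common substrings compactly.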
## Training Data

The model was trained on a dataset that includes:

- Financial analytics
- SaaS metrics
- Social media data
- Customer data
- Customer feedback
## License

[Specify your license here]
## Acknowledgements

- The model architecture is based on the Qwen/Llama architecture
- Uses Rotary Position Embeddings (RoPE) for position encoding
- Implements grouped-query attention for efficient inference