---
license: apache-2.0
language: en
metrics:
- accuracy
- character
---

# Model Card for CoreX v0.1

CoreX v0.1 is a lightweight, decoder-only transformer built by Nexizan Company. It is designed to run efficiently on low-resource systems (~7 GB RAM) while supporting offline AI assistants, coding tutors, and sandbox experiments.

## Model Details

### Model Description

- **Developed by:** Nexizan Company
- **Funded by:** Self-funded
- **Shared by:** Nexizan Inc. CoreX team (Faisal, *LitRush*)
- **Model type:** Causal LM (transformer, decoder-only)
- **Language(s):** English
- **License:** Apache-2.0
- **Finetuned from model:** None (trained from scratch)

### Model Sources

- **Repository:** to be added
- **Paper:** N/A
- **Demo:** local CLI via `chat_interface.py`

## Uses

### Direct Use

- Chat-based assistant (offline/terminal)
- Text generation and summarization
- Code and math Q&A
- Educational or personal projects

### Downstream Use

- Domain-specific fine-tuning (education, productivity, private tools)
- Integration into offline AI platforms (e.g., the NexIN prototype)

### Out-of-Scope Use

- Medical, financial, or legal advice
- Safety-critical or autonomous systems
- Content generation without moderation

## Bias, Risks, and Limitations

- Limited training size (~9.2M tokens), so knowledge is restricted
- Biases from the dataset may appear in responses
- Non-English performance is weak
- Risk of hallucinations or unsafe generations

### Recommendations

- Use a moderation/filtering layer in deployment
- Fine-tune with curated, domain-specific datasets
- Always keep a human in the loop for sensitive applications

## How to Get Started

Run the interactive chat interface:

```bash
python chat_interface.py
```

Or load the model directly in Python, pointing both calls at the directory that holds the tokenizer (`corex_tok.model`) and the weights (`final_model.pt`):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("path/to/corex")
model = AutoModelForCausalLM.from_pretrained("path/to/corex")

inputs = tokenizer("Hello CoreX!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- Samples: 34,559
- Tokens: ~9.2M
- Average length: ~266 tokens
- Max length: 1,024 tokens
- Tokenizer: SentencePiece unigram, 32,000-token vocabulary

#### Preprocessing

- Unicode normalization
- Special tokens (`<s>`, `</s>`, `<pad>`, `<unk>`)
- Deduplication and filtering

### Training Hyperparameters

- Regime: mixed precision (CPU/GPU optimized)
- Hidden size: 512
- Layers: 8
- Attention heads: 8 (2 KV heads)
- Intermediate size: 1365 (SwiGLU)
- Max positions: 2048
- Learning rate: 5e-4 (cosine decay, 1k warmup steps)
- Optimizer: AdamW (β1 = 0.9, β2 = 0.95, weight decay = 0.1)
- Batch size: 2 (effective 32 with gradient accumulation)
- Steps: 50,000

### Speeds, Sizes, Times

- Parameters: ~54.8M
- Checkpoint size: ~220 MB
- Hardware target: systems with ~7 GB RAM

## Evaluation

### Testing Data

Held-out samples from the training corpus.

### Factors

Conversational text, code snippets, and math expressions.

### Metrics

Perplexity (PPL) and cross-entropy loss; a minimal computation sketch follows below.
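Perplexity is the exponential of the mean per-token cross-entropy on held-out text. The following is a minimal sketch of how it can be computed with `transformers` and PyTorch; the model path and sample sentence are illustrative, not part of the released files:

```python
import math

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "path/to/corex"  # illustrative local directory
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
model.eval()

text = "CoreX is a lightweight offline language model."  # held-out sample
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels supplied, a causal LM returns the mean cross-entropy
    # over all predicted tokens (labels are shifted internally).
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"loss = {loss.item():.3f}  perplexity = {math.exp(loss.item()):.1f}")
```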
### Results

- Training loss decreased steadily.
- Early tests show coherent text and code generation.

### Summary

CoreX v0.1 achieves usable fluency for small-scale tasks. It is not comparable to large LLMs, but it excels at lightweight, private, offline usage.

## Model Examination

- Architecture: 8-layer decoder with RoPE, SwiGLU, RMSNorm, and GQA
- Tokenizer verified (32k vocabulary, unigram SentencePiece)

## Environmental Impact

- **Hardware type:** consumer GPU/CPU
- **Training time:** several days (low resource)
- **Cloud provider:** none (trained locally)
- **Carbon emitted:** minimal (small model)

## Technical Specifications

### Model Architecture and Objective

- Decoder-only transformer
- RoPE embeddings, SwiGLU MLP, RMSNorm
- Grouped Query Attention

An illustrative sketch of the RMSNorm and SwiGLU components appears at the end of this card.

### Compute Infrastructure

- **Hardware:** systems with ~7 GB RAM
- **Software:** PyTorch, SentencePiece

## Citation

**BibTeX:**

```bibtex
@misc{corex2025,
  title={CoreX v0.1: Lightweight Transformer Language Model},
  author={Nexizan Company},
  year={2025},
  note={Apache-2.0 license}
}
```

**APA:**

Nexizan Inc. (2025). *CoreX v0.1: Lightweight Transformer Language Model*.

## Glossary

- **RoPE:** Rotary Position Embeddings
- **SwiGLU:** Swish-Gated Linear Unit
- **RMSNorm:** Root Mean Square Norm
- **GQA:** Grouped Query Attention

## More Information

CoreX v0.1 is the first milestone in the CoreX series, focused on offline-first, privacy-respecting AI systems. Future versions aim for larger datasets, more parameters, and better reasoning ability.

## Model Card Authors

Nexizan Inc., CoreX Team

## Model Card Contact

N/A
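As a concrete supplement to the Glossary, here is a minimal PyTorch sketch of two of the listed components, RMSNorm and the SwiGLU MLP, using the hidden size (512) and intermediate size (1365) reported above. It is an illustrative reimplementation of the standard formulations, not the actual CoreX source code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root Mean Square Norm: rescale by the RMS of the activations,
    with a learned gain and no mean subtraction or bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x / rms)


class SwiGLU(nn.Module):
    """Swish-gated MLP: silu(W_gate x) * (W_up x), projected back down."""

    def __init__(self, dim: int = 512, hidden: int = 1365):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


x = torch.randn(1, 4, 512)       # (batch, sequence, hidden)
y = SwiGLU()(RMSNorm(512)(x))
print(y.shape)                   # torch.Size([1, 4, 512])
```

RMSNorm omits LayerNorm's mean subtraction and bias, which makes it slightly cheaper. The intermediate size 1365 is roughly two-thirds of 4 x 512, the usual adjustment that keeps a gated MLP's parameter count close to that of a standard 4x MLP.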