Update README.md

---
license: apache-2.0
language:
- en
---
Model Card for CoreX v0.1

CoreX v0.1 is a lightweight, decoder-only transformer built by Nexizan Company. It is designed to run efficiently on low-resource systems (~7 GB RAM) while supporting offline AI assistants, coding tutors, and sandbox experiments.
Model Details

Model Description

Developed by: Nexizan Company

Funded by: Self-funded

Shared by: Nexizan Inc. *CoreX team* (Faisal, *LitRush*)

Model type: Causal LM (transformer, decoder-only)

Language(s): English

License: Apache-2.0

Finetuned from model: None (trained from scratch)
Model Sources

Repository: to be added

Paper: N/A

Demo: Local CLI via chat_interface.py
Uses

Direct Use

Chat-based assistant (offline/terminal)

Text generation and summarization

Code and math Q&A

Educational or personal projects

Downstream Use

Domain-specific fine-tuning (education, productivity, private tools)

Integration into offline AI platforms (e.g., NexIN prototype)
Out-of-Scope Use

Medical, financial, or legal advice

Safety-critical or autonomous systems

Content generation without moderation

Bias, Risks, and Limitations

Limited training data (~9.2M tokens) restricts the model's knowledge

Biases present in the dataset may appear in responses

Non-English performance is weak

Risk of hallucinations or unsafe generations

Recommendations

Use a moderation/filtering layer in deployment

Fine-tune with curated, domain-specific datasets

Always keep a human-in-the-loop for sensitive applications
How to Get Started

Run the interactive chat interface:

python chat_interface.py

Or load directly in Python:

from transformers import AutoTokenizer, AutoModelForCausalLM
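A minimal end-to-end sketch of loading and generating, assuming a hypothetical `Nexizan/CoreX-v0.1` model id (the repository link above is still to be added); substitute a local checkpoint path if needed:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Nexizan/CoreX-v0.1"  # hypothetical id; replace with your local checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain what a tokenizer does in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")

# Keep generations short; this is a ~54.8M-parameter model.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```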
Avg length: ~266 tokens

Max length: 1024

Tokenizer: SentencePiece unigram, vocab=32,000

Preprocessing

Unicode normalization

Special tokens (<pad>, <unk>, <s>, </s>)

Deduplication and filtering
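A tokenizer with these properties can be reproduced with the sentencepiece library. The sketch below is illustrative only; the corpus file name and exact flags are assumptions, not the team's actual training command:

```python
import sentencepiece as spm

# Assumed corpus file and flags; the actual preprocessing pipeline is not published.
spm.SentencePieceTrainer.train(
    input="corpus.txt",
    model_prefix="corex_unigram",
    model_type="unigram",
    vocab_size=32000,
    normalization_rule_name="nmt_nfkc",  # Unicode normalization
    pad_id=0, unk_id=1, bos_id=2, eos_id=3,
    pad_piece="<pad>", unk_piece="<unk>", bos_piece="<s>", eos_piece="</s>",
)

sp = spm.SentencePieceProcessor(model_file="corex_unigram.model")
print(sp.encode("Hello CoreX", out_type=str))
```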
Training Hyperparameters

Regime: Mixed precision (CPU/GPU optimized)

Hidden size: 512

Layers: 8

Attention heads: 8 (2 KV heads)

Intermediate size: 1365 (SwiGLU)

Max positions: 2048

Learning rate: 5e-4 (cosine decay, warmup 1k steps)

Optimizer: AdamW (β1=0.9, β2=0.95, wd=0.1)

Batch size: 2 (effective 32 with gradient accumulation)

Steps: 50,000
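For reference, the same hyperparameters collected as a plain Python dict (a hypothetical summary; the key names are not taken from the actual training script):

```python
corex_v0_1 = {
    "hidden_size": 512,
    "num_layers": 8,
    "num_attention_heads": 8,
    "num_kv_heads": 2,                   # Grouped Query Attention
    "intermediate_size": 1365,           # SwiGLU MLP
    "max_position_embeddings": 2048,
    "vocab_size": 32000,
    "learning_rate": 5e-4,               # cosine decay, 1k warmup steps
    "adam_betas": (0.9, 0.95),
    "weight_decay": 0.1,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 16,   # 2 x 16 = effective batch size 32
    "train_steps": 50_000,
}
```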
|
Speeds, Sizes, Times

Parameters: ~54.8M

Checkpoint size: ~220MB

Hardware target: 7 GB RAM systems

Evaluation
Testing Data

Held-out samples from the training corpus

Factors

Conversational text, code snippets, math expressions

Metrics

Perplexity (PPL), loss
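Perplexity is the exponential of the mean token-level cross-entropy on the held-out data. A small sketch of computing it with a Hugging Face causal LM (an assumed evaluation loop that ignores padding, not the team's exact script):

```python
import math
import torch

@torch.no_grad()
def perplexity(model, batches):
    """Approximate PPL as exp of the mean cross-entropy over held-out batches."""
    losses = []
    for batch in batches:  # each batch: {"input_ids": ..., "attention_mask": ...}
        out = model(**batch, labels=batch["input_ids"])
        losses.append(out.loss.item())
    return math.exp(sum(losses) / len(losses))
```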
|
Results

Training loss decreased steadily

Early tests show coherent text and code generation

Summary

CoreX v0.1 achieves usable fluency for small-scale tasks. It is not comparable to large LLMs, but excels at lightweight, private, offline usage.
Model Examination

Architecture: 8-layer decoder, RoPE, SwiGLU, RMSNorm, GQA

Tokenizer verified (32k vocab, unigram SentencePiece)

Environmental Impact

Hardware Type: Consumer GPU/CPU

Training Time: Several days (low resource)

Cloud Provider: None (local)

Carbon Emitted: Minimal (small model)
|
Technical Specifications

Model Architecture and Objective

Decoder-only transformer

RoPE embeddings, SwiGLU MLP, RMSNorm

Grouped Query Attention
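For readers unfamiliar with the SwiGLU block, here is a minimal PyTorch sketch using the card's dimensions (512 hidden, 1365 intermediate); it illustrates the general technique, not CoreX's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUMLP(nn.Module):
    """LLaMA-style gated MLP: silu(x W_gate) * (x W_up), projected back down."""

    def __init__(self, hidden_size: int = 512, intermediate_size: int = 1365):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))
```

With Grouped Query Attention, the 8 query heads share 2 key/value heads, so each key/value head serves 4 query heads and the KV cache is roughly a quarter the size of full multi-head attention.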
|
Compute Infrastructure

Hardware: ~7 GB RAM system

Software: PyTorch, SentencePiece

Citation

BibTeX:
APA:

Nexizan Inc. (2025). CoreX v0.1: Lightweight Transformer Language Model.

Glossary

RoPE: Rotary Position Embeddings

SwiGLU: Swish-Gated Linear Unit

RMSNorm: Root Mean Square Normalization

GQA: Grouped Query Attention

More Information

CoreX v0.1 is the first milestone in the CoreX series, focused on offline-first, privacy-respecting AI systems. Future versions aim for larger datasets, more parameters, and better reasoning ability.

Model Card Authors

Nexizan Inc., CoreX Team

Model Card Contact