Update README.md

README.md CHANGED
@@ -1,78 +1,65 @@
Removed (previous README):

#

- **Training:** Train the model on tokenized data.
- **Sampling:** Generate text from trained models.
- **Exporting:** Save models and minimal tokenizer configurations in formats compatible with Hugging Face.
- **Knowledge Distillation:** Train a smaller student model using a larger teacher model.
- **Fine-Tuning:** Adapt a distilled model on conversational data (from local sources or directly from Hugging Face).
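The knowledge-distillation feature pairs a large teacher with a smaller student. As a generic sketch of the usual objective (not this repository's actual loss; the temperature value and the exact formulation here are assumptions), the student is trained to match temperature-softened teacher outputs:

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-scaled softmax; T > 1 softens the distribution
    z = z / T
    e = np.exp(z - z.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on softened distributions, scaled by T^2
    # (the classic Hinton-style formulation; illustrative only)
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    return T**2 * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
```

The loss is zero when student and teacher agree exactly and positive otherwise, so minimizing it pulls the student's distribution toward the teacher's.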
- [Model Overview](#model-overview)
- [Architecture](#architecture)
- [Installation](#installation)
- [Usage](#usage)
  - [Tokenization](#tokenization)
  - [Training](#training)
  - [Sampling](#sampling)
  - [Exporting](#exporting)
  - [Knowledge Distillation](#knowledge-distillation)
  - [Fine-Tuning](#fine-tuning)
    - [Local Fine-Tuning](#local-fine-tuning)
    - [Hugging Face Fine-Tuning](#hugging-face-fine-tuning)
- [Hyperparameters](#hyperparameters)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgements](#acknowledgements)
---

## Model Overview

---

## Architecture

- **MLP (Feed-Forward Network):** Uses GELU activation for non-linearity.
- **Residual Connections and Layer Normalization:** Stabilize training and improve convergence.
- **Final Projection Layer:** Maps embeddings to logits over the vocabulary.
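The architecture bullets describe standard transformer sub-blocks. A minimal NumPy sketch of the feed-forward path with GELU, a residual connection around a layer-normalized input, and the final vocabulary projection (illustrative only, not this repository's code; the pre-norm placement and all dimensions here are assumptions):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU, the non-linearity named above
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    # normalize each token's features to zero mean, unit variance
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def mlp_block(x, w_in, w_out):
    # feed-forward sub-block with a residual connection (pre-norm variant)
    return x + gelu(layer_norm(x) @ w_in) @ w_out

def project_to_logits(h, w_vocab):
    # final projection layer: hidden states -> logits over the vocabulary
    return h @ w_vocab

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 16))                                   # 2 tokens, d_model = 16
h = mlp_block(x, rng.standard_normal((16, 64)), rng.standard_normal((64, 16)))
logits = project_to_logits(h, rng.standard_normal((16, 100)))      # vocabulary of 100
```

The residual `x + ...` keeps the block's output shape equal to its input shape, which is what lets many such blocks be stacked.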
---

## Installation

- transformers
- datasets
- tqdm
- safetensors (for export functionality)
Added (new README):

# 🧠 TNSA AI

**Welcome to TNSA AI on Hugging Face 🤖**

---

## 🌟 About Us

**TNSA AI** is a frontier research group focused on building open, powerful, and human-aligned **Artificial General Intelligence (AGI)**. From cutting-edge language models to multimodal systems, our mission is to create intelligent systems that empower humanity — not just compete with it.

We build from scratch, at scale, and with purpose.
---

## 🏗️ Core Projects

| 🔬 Project | 🚀 Description |
|------------------|---------------------------------------------------------------------------------|
| **NGen Series** | High-performance transformer models (NGen1, NGen2, NGen3Jax_N, and more) |
| **Tokenize2** | Our in-house tokenizer built on optimized Byte-Pair Encoding (BPE) |
| **DiffuseX2** | Industry-level Stable Diffusion architecture for powerful image generation |
| **Adhyaapak-2** | Personalized Teaching AI powered by LLM + vision/audio understanding |
| **Neura** | An AGI-grade modular architecture with memory, deliberation & vision modules |
| **ARCH-X 9** | ML framework architecture for scalable multimodal development |
| **StellarTTS** | Text-to-speech engine based on Wi LLM capable of human-like speech synthesis |
| **SongGEN** | AI music generation engine inspired by OpenAI’s Jukebox |
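Tokenize2 is described as building on Byte-Pair Encoding. As a toy illustration of the underlying idea (not Tokenize2's actual implementation, which is not shown here), BPE repeatedly merges the most frequent adjacent pair of symbols into a single new symbol:

```python
from collections import Counter

def most_frequent_pair(tokens):
    # count adjacent symbol pairs; return the most frequent (first seen wins ties)
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get) if pairs else None

def bpe_merges(text, num_merges):
    # learn `num_merges` merges over a character sequence; toy version only
    # (real BPE trainers work over a word-frequency dictionary, not flat text)
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        merges.append(pair)
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_merges("low lower lowest", 2)
# merges → [('l', 'o'), ('lo', 'w')]; "low" becomes a single token
```

After two merges the shared prefix "low" is a single symbol, which is exactly how BPE compresses frequent substrings into vocabulary entries.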
---

## 📦 Featured Models & Tools

- `tnsa-ai/NGen2.1Base`
- `tnsa-ai/Tokenize2`
- `tnsa-ai/DiffuseX2-mini`
- `tnsa-ai/Adhyaapak-2-Beta`
- `tnsa-ai/StellarTTS-Nano`
- `tnsa-ai/SongGEN-Core`

> All models are trained or fine-tuned using our custom pipelines and follow the [TNSA Standard].
---

## 🔧 Dev Philosophy

- ✅ Fully open-sourced codebases (no black boxes)
- 🔁 Reinvent core components when needed
- 🔬 Prioritize interpretability, capability, and safety
- 🧪 Models designed for education, research, and real-world use

---
## 🧑‍💻 Join Our Movement

We’re not a company. We’re a **mission**.

🚀 If you believe in building AGI for collective good — not just profit — you’re one of us.

📫 Reach out: tnsaresearch@proton.me
🔗 Website (Coming Soon): [TNSA.ai](https://tnsa.ai)
🌐 Follow our open work here and contribute!

---

> “A mind once stretched by a new idea never returns to its original dimensions.”
> — **TNSA AI**