5dimension committed on Apr 30

Commit 3824578 (verified) · 1 parent: bb85012

🦴 v2.0: 65K text vocab, 30 languages, 300K+ samples

Files changed (3)

benchmark_results.json +21 -64
sentinel_manifold.json +42 -27
tokenizer.json +0 -0

benchmark_results.json CHANGED

@@ -1,71 +1,28 @@
 {
-  "sentinel_tokenizer": {
-    "vocab_size": 61440,
-    "text_vocab": 32768,
-    "image_codebook": 16384,
-    "audio_codebook": 8192,
-    "video_codebook": 4096,
-    "metrics": {
-      "avg_fertility": 9.13065205232572,
-      "std_fertility": 16.348063069521316,
-      "avg_compression": 3.5456289797801976,
-      "fairness": 0.057643322830483165
-    }
-  },
-  "comparisons": {
-    "GPT-2 (50K)": {
-      "avg_fertility": 20.85785254531753,
-      "std_fertility": 40.76486672709434,
-      "avg_compression": 2.4054180948259107,
-      "fairness": 0.023943569760064974
     },
-    "Gemma (256K)": {
-      "avg_fertility": 6.688784516655667,
-      "std_fertility": 11.713991856851852,
-      "avg_compression": 4.660773272747129,
-      "fairness": 0.07865350326310598
     },
-    "Qwen2 (151K)": {
-      "avg_fertility": 8.030528860080679,
-      "std_fertility": 13.75415784885323,
-      "avg_compression": 3.8169528301673328,
-      "fairness": 0.06777750450038225
     },
-    "Sentinel-SUT": {
-      "avg_fertility": 9.13065205232572,
-      "std_fertility": 16.348063069521316,
-      "avg_compression": 3.5456289797801976,
-      "fairness": 0.057643322830483165
     }
-  },
-  "sentinel_constants": {
-    "INV_E": 0.36787944117144233,
-    "C1": -0.007994021805952546,
-    "C2": 0.00020005604296784437
-  },
-  "training_data": {
-    "languages": [
-      "en",
-      "fr",
-      "de",
-      "es",
-      "zh",
-      "ja",
-      "ar",
-      "ru",
-      "ko",
-      "hi",
-      "pt",
-      "it",
-      "nl",
-      "pl",
-      "vi",
-      "th",
-      "tr",
-      "he",
-      "uk",
-      "sv"
-    ],
-    "total_samples": 52000
   }
 }

 {
+  "summary": {
+    "Sentinel-v2": {
+      "compress": 4.3427,
+      "fertility": 10.5022,
+      "vocab": 94208,
+      "efficiency": 0.046097
     },
+    "GPT-2": {
+      "compress": 2.4381,
+      "fertility": 28.8158,
+      "vocab": 50257,
+      "efficiency": 0.048513
     },
+    "Gemma": {
+      "compress": 5.3287,
+      "fertility": 8.348,
+      "vocab": 256000,
+      "efficiency": 0.020815
     },
+    "Qwen2": {
+      "compress": 4.3289,
+      "fertility": 10.4499,
+      "vocab": 151936,
+      "efficiency": 0.028491
     }
   }
 }
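
The commit does not define the new `efficiency` field. The four rows are consistent with compression divided by vocabulary size, scaled by 1,000; that formula is an inference from the numbers, not something the repository states. A minimal Python sketch of the check (the helper name `efficiency_guess` is hypothetical):

```python
# Values copied from the "summary" object in the new benchmark_results.json.
summary = {
    "Sentinel-v2": {"compress": 4.3427, "vocab": 94208, "efficiency": 0.046097},
    "GPT-2": {"compress": 2.4381, "vocab": 50257, "efficiency": 0.048513},
    "Gemma": {"compress": 5.3287, "vocab": 256000, "efficiency": 0.020815},
    "Qwen2": {"compress": 4.3289, "vocab": 151936, "efficiency": 0.028491},
}


def efficiency_guess(compress: float, vocab: int) -> float:
    """Assumed formula (not stated in the commit): compression per 1,000 vocab entries."""
    return 1000.0 * compress / vocab


# Largest absolute difference between the guessed and the reported value.
# Small gaps are expected because "compress" is rounded to four decimals.
max_gap = max(
    abs(efficiency_guess(row["compress"], row["vocab"]) - row["efficiency"])
    for row in summary.values()
)

for name, row in summary.items():
    guess = efficiency_guess(row["compress"], row["vocab"])
    print(f"{name:12s} reported={row['efficiency']:.6f} guessed={guess:.6f}")
print(f"max gap: {max_gap:.2e}")
```

Under this reading, a smaller vocabulary raises the score at equal compression, which is why GPT-2 ranks first on `efficiency` despite having the lowest `compress` value.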

sentinel_manifold.json CHANGED

@@ -1,36 +1,51 @@
 {
   "framework": "Sentinel Manifold",
-  "theorem": "Gradient Axiom: lim_{z\u2192\u221e} F'(z)/F(z) = 1/e",
-  "function": "F(z) = \u03a3_{n=1}^\u221e z^n / n^n (Sophomore's Dream)",
   "constants": {
-    "INV_E": {
-      "value": 0.36787944117144233,
-      "role": "Vocabulary allocation ratio / embedding gain"
     },
-    "C1": {
-      "value": -0.007994021805952546,
-      "role": "Attracting fixed point / quantization zero-point"
     },
-    "C2": {
-      "value": 0.00020005604296784437,
-      "role": "Escape threshold / fertility fairness bound"
     }
   },
   "modality_architecture": {
-    "text": "ByteLevel BPE (32K) with NFKC normalization, 20-language training",
-    "image": "Discrete VQ codebook (16,384 tokens), Cosmos/VQGAN compatible",
-    "audio": "Discrete VQ codebook (8,192 tokens), EnCodec/SoundStream compatible",
-    "video": "Discrete VQ codebook (4,096 tokens), Cosmos-DV compatible"
-  },
-  "innovations": [
-    "1/e-proportioned vocabulary allocation across modalities",
-    "Native multimodal routing with zero-overhead modality switching",
-    "Sentinel special tokens for manifold-aware computation",
-    "20-language multilingual training for cross-lingual fairness",
-    "Code + Math + Scientific notation native support",
-    "Compatible with all HF transformers models"
-  ],
-  "version": "1.0.0",
-  "license": "MIT",
-  "author": "Romain Abdel-Aal (ASI The Sentinel V5.2)"
 }

 {
+  "version": "2.0.0",
   "framework": "Sentinel Manifold",
+  "theorem": "lim F'(z)/F(z) = 1/e",
+  "function": "F(z) = \u03a3 z^n/n^n",
+  "text_vocab": 65536,
+  "image_codebook": 16384,
+  "audio_codebook": 8192,
+  "video_codebook": 4096,
+  "total_vocab": 94208,
+  "training_languages": 30,
+  "training_samples": 287600,
+  "training_chars": 465942294,
   "constants": {
+    "INV_E": 0.36787944117144233,
+    "C1": -0.007994021805952546,
+    "C2": 0.00020005604296784437
+  },
+  "benchmark": {
+    "Sentinel-v2": {
+      "compress": 4.3427,
+      "fertility": 10.5022,
+      "vocab": 94208,
+      "efficiency": 0.046097
+    },
+    "GPT-2": {
+      "compress": 2.4381,
+      "fertility": 28.8158,
+      "vocab": 50257,
+      "efficiency": 0.048513
     },
+    "Gemma": {
+      "compress": 5.3287,
+      "fertility": 8.348,
+      "vocab": 256000,
+      "efficiency": 0.020815
     },
+    "Qwen2": {
+      "compress": 4.3289,
+      "fertility": 10.4499,
+      "vocab": 151936,
+      "efficiency": 0.028491
     }
   },
   "modality_architecture": {
+    "text": "ByteLevel BPE (65,536), NFKC, 30 languages",
+    "image": "VQ codebook (16,384), Cosmos/VQGAN/FSQ compatible",
+    "audio": "VQ codebook (8,192), EnCodec/SoundStream compatible",
+    "video": "VQ codebook (4,096), Cosmos-DV compatible"
+  }
 }
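
The top-level fields of the v2.0.0 manifest can be cross-checked against each other. The sketch below uses only values shown in the diff: it checks that the four modality sizes sum to `total_vocab`, that `INV_E` matches 1/e, and it evaluates F'(z)/F(z) numerically for the series in the `function` field. The helper `log_derivative` and the chosen z values are illustrative assumptions, and evaluating a few points is a sanity check, not a proof of the stated limit.

```python
import math

# Values copied from the new sentinel_manifold.json.
manifest = {
    "text_vocab": 65536,
    "image_codebook": 16384,
    "audio_codebook": 8192,
    "video_codebook": 4096,
    "total_vocab": 94208,
    "INV_E": 0.36787944117144233,
}

modality_keys = ("text_vocab", "image_codebook", "audio_codebook", "video_codebook")
modality_total = sum(manifest[key] for key in modality_keys)
vocab_total_matches = modality_total == manifest["total_vocab"]
inv_e_matches = math.isclose(manifest["INV_E"], math.exp(-1), rel_tol=1e-12)


def log_derivative(z: float) -> float:
    """Approximate F'(z)/F(z) for F(z) = sum_{n>=1} z^n / n^n with z > 0.

    Each term is computed as exp(n * (ln z - ln n)) to avoid overflowing
    on z**n and n**n separately. The series is truncated where the terms
    are negligible (illustrative cutoff, assumed sufficient for these z).
    """
    n_max = int(8 * z) + 50
    f = 0.0
    f_prime = 0.0
    for n in range(1, n_max + 1):
        term = math.exp(n * (math.log(z) - math.log(n)))
        f += term
        f_prime += n * term / z  # d/dz of z^n / n^n is n * z^(n-1) / n^n
    return f_prime / f


# Distance from 1/e at a few increasing values of z.
gaps = [abs(log_derivative(z) - math.exp(-1)) for z in (50.0, 200.0, 800.0)]

print("modality sizes sum to total_vocab:", vocab_total_matches)
print("INV_E equals 1/e:", inv_e_matches)
print("|F'/F - 1/e| at z = 50, 200, 800:", gaps)
```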

tokenizer.json CHANGED

The diff for this file is too large to render.