---
license: mit
tags:
- vision
- encoder
- multimodal
- self-supervised
- video
- execution
- symbolic
library_name: pytorch
pipeline_tag: feature-extraction
datasets:
- Nine1Eight/vil-canonical-glyph-system
---

# VIL Encoder v1.2 (GVL-P)

**VIL Encoder v1.2** is a glyphmatic vision encoder trained using
**GVL-P (Glyphmatic Video-Language Pretraining) v1.2**.

This model learns **temporal execution structure** from canonical glyph
sequences derived from text, code, binaries, and other data.

> ⚠️ This model does **not tokenize language**.
> All inputs are compiled into a **canonical glyph IR (base-111)**.

---

## Architecture

- **Vision Encoder:** GlyphVisionEncoder
- **Temporal Head:** TemporalGlyphTransformer
- **Embedding Dimension:** 768
- **Canon Size:** 111
- **Deterministic:** Yes

---

## Training (GVL-P v1.2)

Training is **fully self-supervised**:

1. Arbitrary input (text, code, binary)
2. Deterministic compilation → glyph indices
3. Sliding temporal windows
4. Next-step temporal consistency objective

No labels, captions, or annotations were used.

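
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the GVL-P pipeline itself: the functions `compile_to_glyphs` and `sliding_windows` are hypothetical, the byte-to-two-digits mapping is an assumed stand-in for the canonical compiler, and pairing each window with the glyph that follows it is only one plausible reading of the next-step objective. Only the canon size of 111 comes from this card.

```python
from typing import List, Tuple

CANON_SIZE = 111  # canon size stated in this model card


def compile_to_glyphs(data: bytes) -> List[int]:
    """Hypothetical stand-in for the canonical compiler.

    Assumption: each byte is written as two base-111 digits (high, low).
    The real compilation scheme is not described in this card.
    """
    glyphs: List[int] = []
    for byte in data:
        high, low = divmod(byte, CANON_SIZE)
        glyphs.extend((high, low))
    return glyphs


def sliding_windows(
    glyphs: List[int], window: int, stride: int = 1
) -> List[Tuple[List[int], int]]:
    """Pair each window of glyph indices with the glyph that follows it.

    Assumption: this (context, next glyph) pairing is an illustrative
    guess at what a next-step objective could consume.
    """
    pairs: List[Tuple[List[int], int]] = []
    for start in range(0, len(glyphs) - window, stride):
        context = glyphs[start:start + window]
        target = glyphs[start + window]
        pairs.append((context, target))
    return pairs


# The same input always yields the same glyph indices and windows.
glyph_ids = compile_to_glyphs("print('hi')".encode("utf-8"))
training_pairs = sliding_windows(glyph_ids, window=4)
```
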

---

## Intended Use

- Execution-aware embeddings
- Vision–language research
- Glyph-based reasoning systems
- Multimodal IR experiments

This is **not** a language model.

---

## Limitations

- Requires canonical glyph compilation
- No text generation
- No decoding or execution

---

## Weights

File: `vil-encoder-v1.2.pt`

Checkpoint contains:

- `vision_encoder`
- `temporal_head`
- `embed_dim`
- `canon_size`
- `gvlp_version = 1.2`

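
A checkpoint with the keys listed above could be loaded and checked along these lines. This is a sketch under assumptions: it presumes the file is a plain dictionary saved with `torch.save`, and the helper names `missing_keys` and `load_checkpoint` are hypothetical. The card does not say whether `vision_encoder` and `temporal_head` hold state dicts or whole modules, so the sketch stops at verifying the keys and does not build the model.

```python
from typing import Any, Dict, List

# Keys listed in this model card.
EXPECTED_KEYS = (
    "vision_encoder",
    "temporal_head",
    "embed_dim",
    "canon_size",
    "gvlp_version",
)


def missing_keys(checkpoint: Dict[str, Any]) -> List[str]:
    """Return the expected keys that are absent from the checkpoint."""
    return [key for key in EXPECTED_KEYS if key not in checkpoint]


def load_checkpoint(path: str = "vil-encoder-v1.2.pt") -> Dict[str, Any]:
    """Load the checkpoint on CPU and verify that the expected keys exist.

    Assumption: the file is a dict written by torch.save. Recent PyTorch
    releases default to weights_only=True, which may reject checkpoints
    that store pickled modules rather than state dicts.
    """
    import torch  # imported lazily so missing_keys works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    absent = missing_keys(checkpoint)
    if absent:
        raise KeyError(f"checkpoint is missing keys: {absent}")
    return checkpoint
```
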

---

## Relationship to VIL

Canonical dataset:
https://huggingface.co/datasets/Nine1Eight/vil-canonical-glyph-system

---

## Author

Matthew Blake Ward (Nine1Eight)
Tulsa, Oklahoma, USA