Upload README.md with huggingface_hub
README.md
CHANGED
@@ -51,7 +51,7 @@ pipeline_tag: feature-extraction
 | **Vocabulary** | 100,000 whole words |
 | **Training data** | ~2B tokens (OpenWebText subset, DFSG-compliant) |
 | **Training hardware** | Single NVIDIA RTX 3090 |
-| **Training time** | ~
+| **Training time** | ~4 days (2M steps) |
 | **License** | GPL-3.0 |
 | **Parameters** | 60M (30M target + 30M context embeddings) |
 
@@ -142,9 +142,9 @@ Full training code and visualizations: [github.com/ruapotato/Free-Language-Embed
 
 ## Interactive Visualizations
 
-- [Embedding Spectrogram](https://ruapotato.github.io/
-- [3D Semantic Directions](https://ruapotato.github.io/
-- [Training Dashboard](https://ruapotato.github.io/
+- [Embedding Spectrogram](https://ruapotato.github.io/Free-Language-Embeddings/spectrogram.html) — PCA waves, sine fits, cosine surfaces
+- [3D Semantic Directions](https://ruapotato.github.io/Free-Language-Embeddings/semantic_3d.html) — See how semantic axes align in the learned geometry
+- [Training Dashboard](https://ruapotato.github.io/Free-Language-Embeddings/dashboard.html) — Loss curves and training metrics
 
 ## Citation
 
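
Review note: the figures in the changed table can be cross-checked against each other. The sketch below is only a sanity check, not the author's code. It assumes each of the two embedding tables is a plain vocabulary-by-dimension matrix with no other parameters (the README lines shown here do not state the dimension), and it treats the rounded values "30M", "2M steps" and "~4 days" as exact. All names are hypothetical.

```python
# Sanity-check the model-card figures quoted in the README diff.
# Assumption: each embedding table is a dense (vocab_size x dim) matrix.

VOCAB_SIZE = 100_000            # "100,000 whole words"
PARAMS_PER_TABLE = 30_000_000   # "30M target + 30M context embeddings"
TRAIN_STEPS = 2_000_000         # "2M steps"
TRAIN_DAYS = 4                  # "~4 days" (rounded in the README)


def implied_embedding_dim(params_per_table: int, vocab_size: int) -> int:
    """Return the embedding width implied by one table's parameter count."""
    dim, remainder = divmod(params_per_table, vocab_size)
    if remainder:
        raise ValueError("parameter count is not a multiple of the vocabulary size")
    return dim


def steps_per_second(steps: int, days: float) -> float:
    """Average optimizer steps per wall-clock second over the whole run."""
    return steps / (days * 24 * 60 * 60)


embedding_dim = implied_embedding_dim(PARAMS_PER_TABLE, VOCAB_SIZE)
total_params = 2 * VOCAB_SIZE * embedding_dim   # target table + context table
rate = steps_per_second(TRAIN_STEPS, TRAIN_DAYS)

print(f"implied embedding dimension: {embedding_dim}")
print(f"total parameters: {total_params:,}")
print(f"average throughput: {rate:.2f} steps/s")
```

Under these assumptions the table is internally consistent: two tables of that width add up to the stated 60M parameters. The throughput figure is only as precise as the "~4 days" estimate.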