---
license: apache-2.0
base_model:
- nomic-ai/nomic-bert-2048
tags:
- symbolic
- classification
- text_masking
- image_feature
- text_feature
- experimental
- categorical
- similarity
- teacher
---

A great deal of experimentation and testing has now been done on this model. It is more than capable of handling categorization, classification, text masking, similarity detection, similarity offset comparison, and many more tasks that I haven't listed.

It's small, so the cracks show.

For version 2 of this model, I plan a much more diverse categorical array with a much larger set of symbolic tokens, trained with a more expansive set of masking processes, each more carefully tuned so that it does not damage the alternative pretrained pathways.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/6iTdKxUiUtzm5MWoZpOdM.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/nffcaOzBsC0ymG80NWudD.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/OKE94fmGlfFhRAKUpubrR.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/gLiYYH9nkvpXyDLF2ssGF.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/tl0bDNnFL_kdSDnhTMf56.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/kz4J3YTnBOaySdgf3-rzp.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/1KEiW_GS5l-BTt4VYb_Tt.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/YopKBUJoKPJVUXSgdl5mX.png)

# SEMANTIC TOKEN STATISTICS:

- Average similarity between tokens: 0.232
- Std dev of similarities: 0.043
- Max similarity: 0.368
- Min similarity: 0.082

Most similar token pairs: