---
license: mit
tags:
- embedding
- text-embedding
- crypto
- nlp
library_name: transformers
---

# crypto-mini-embed

**crypto-mini-embed** is an example mini embedding model built on a simple architecture, intended for NLP experiments such as:

- text similarity
- vector search
- clustering
- semantic tagging
- crypto-topic classification

This is a **dummy model** meant to help users understand how a model repository on Hugging Face is structured.

---

## ⚙️ Model Architecture

- Model type: `MiniEmbeddingModel`
- Hidden size: 64
- Max length: 128 tokens
- Framework: PyTorch
- Format: Safetensors
- Tokenizer: Basic CharTokenizer (dummy)

---

## 📦 Files in the Model

| File | Purpose |
|------|---------|
| `config.json` | Model configuration |
| `tokenizer.json` | Simple tokenizer |
| `model.safetensors` | Model parameters |
| `README.md` | Model documentation |

---

## 🧪 Usage Example

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load the tokenizer and model from the Hub repository
tok = AutoTokenizer.from_pretrained("0xcubin/crypto-mini-embed")
model = AutoModel.from_pretrained("0xcubin/crypto-mini-embed")
model.eval()

text = "Bitcoin is digital money"
inputs = tok(text, return_tensors="pt")

# Mean-pool the token states into one sentence embedding
with torch.no_grad():
    emb = model(**inputs).last_hidden_state.mean(dim=1)

print(emb.shape)  # e.g. torch.Size([1, 64])
```
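
For the text similarity and vector search use cases listed earlier, embeddings are usually compared with cosine similarity. The sketch below is only an illustration: it uses plain Python lists, the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the 4-dimensional vectors are made-up stand-ins for real model output (which would have 64 dimensions according to the architecture section).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical vectors standing in for embeddings; not real model output.
query = [1.0, 0.0, 1.0, 0.0]
candidates = {
    "doc_a": [1.0, 0.0, 1.0, 0.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 0.0, 0.0],  # partially overlapping
}

ranking = rank_by_similarity(query, candidates)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

With a real embedding tensor of shape `(1, 64)`, a row can be turned into a list with `emb[0].tolist()` before passing it to these helpers. For larger collections, a vectorized library would be a more practical choice than pure Python loops.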