---
license: mit
datasets:
- stanfordnlp/sst2
language:
- en
metrics:
- accuracy
tags:
- pytorch
- nlp
- text-classification
- sst2
---
# CLSE-v1 by Lloid

A custom encoder-only Transformer trained from scratch on SST-2 sentiment classification.
It reaches 77% accuracy on the SST-2 validation set.

## Usage
```python
import torch
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("lloid-labs/CLSE-v1")

# Load model (copy the Model class from the repo)
model = Model(vocab_size=30522, d_model=256, n_heads=8, N_layers=4, T=128, out_features=2)
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.eval()

# Inference
sentence = "This movie was great!"
inputs = tokenizer(sentence, return_tensors="pt", padding="max_length",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(inputs["input_ids"])
pred = torch.argmax(logits, dim=-1).item()

print("Positive" if pred == 1 else "Negative")
```

## Architecture
- 4 encoder layers
- 8 attention heads
- d_model: 256
- Trained from scratch on SST-2
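
The usage snippet above assumes you copy the `Model` class from the repo. As a rough guide, a minimal encoder matching the listed hyperparameters could look like the sketch below; the authoritative definition is the one in the repo, and details such as the pooling strategy and feed-forward width are assumptions here.

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    """Hypothetical sketch of CLSE-v1: an encoder-only Transformer classifier.

    Pooling (mean over tokens) and dim_feedforward are assumptions;
    use the actual class from the repo for real inference.
    """
    def __init__(self, vocab_size=30522, d_model=256, n_heads=8,
                 N_layers=4, T=128, out_features=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos_emb = nn.Embedding(T, d_model)            # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, N_layers)
        self.head = nn.Linear(d_model, out_features)       # 2-way sentiment head

    def forward(self, input_ids):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok_emb(input_ids) + self.pos_emb(positions)
        x = self.encoder(x)                 # (batch, T, d_model)
        return self.head(x.mean(dim=1))     # mean-pool, then classify
```

With inputs of shape `(batch, 128)`, the forward pass returns logits of shape `(batch, 2)`, which is what the usage snippet's `torch.argmax(logits, dim=-1)` expects.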