- Model Overview
- How It Works (Real Use Case)
- Architecture Configuration
- Achieved 82.11% accuracy and 82.10% F1-score on the GLUE SST-2 validation benchmark, indicating balanced precision and recall across classes.
- Quick Start (Google Colab / Local)
- Training Details
- Limitations & Bias
- Acknowledgments & Citation
Model Overview
VibeCheck-22M is a compact, 22-million-parameter language model built from scratch. Rather than reusing an off-the-shelf checkpoint, it implements a custom Llama-style transformer architecture and is fine-tuned for binary sentiment analysis (Positive/Negative) on English text.
- Model Type: Causal Language Model (Transformer Decoder)
- Primary Task: Sentiment Analysis / Text Classification
- Language: English
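How It Works (Real Use Case)
The model takes an English sentence as input, wrapped in a short prompt of the form "Text: ..." followed by "Sentiment: ". It then generates the sentiment label token by token and stops at a newline or an end-of-text marker, as the Quick Start code shows.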
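As a rough illustration of that flow, the sketch below builds the prompt string used by the Quick Start code and cleans up a generated continuation. The helper names `build_prompt` and `extract_label` are hypothetical and not part of the repository, and the raw continuation shown is an assumed example rather than captured model output.

```python
# Hypothetical helpers for illustration; they are not part of the repository.
EOS = "<|endoftext|>"  # end-of-text marker, as checked in the Quick Start code


def build_prompt(text: str) -> str:
    """Wrap raw text in the template that the Quick Start code feeds the model."""
    return f"Text: {text}\nSentiment: "


def extract_label(generated: str) -> str:
    """Keep only what was generated before the first EOS marker or newline."""
    cleaned = generated.split(EOS)[0].strip()
    return cleaned.splitlines()[0].strip() if cleaned else ""


prompt = build_prompt("I loved this film.")
label = extract_label("Positive<|endoftext|>")  # assumed raw continuation
print(repr(prompt), label)
```

Keeping the template in one function makes it harder for inference code to drift away from the format the model saw during fine-tuning.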
Architecture Configuration
The model uses a custom-built transformer decoder that incorporates techniques common in recent LLMs, such as RoPE (rotary position embeddings), RMSNorm, and SwiGLU.
| Hyperparameter | Value | Description |
|---|---|---|
| Vocab Size | 10,240 | Custom-trained tokenizer |
| Dim (d_model) | 384 | Embedding dimension |
| Layers | 6 | Number of Transformer blocks |
| Heads | 6 | Number of attention heads |
| Max Context | 256 | Maximum sequence length |
| Normalization | RMSNorm | Root Mean Square Normalization |
| Activation | SwiGLU | Advanced feed-forward network |
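The hyperparameters above are enough for a back-of-the-envelope parameter count. The sketch below makes several assumptions that the table does not state: a SwiGLU hidden width of 4 × d_model (1,536), bias-free linear layers, and separate (untied) input-embedding and output-projection matrices. Under those assumptions the estimate lands close to 22 million, but the real figure depends on what `model.py` actually defines.

```python
def estimate_params(vocab=10_240, d_model=384, n_layers=6,
                    ffn_hidden=None, tied_embeddings=False):
    """Rough parameter count for a Llama-style decoder without bias terms."""
    # ASSUMPTION: the FFN hidden width is not in the table; 4 * d_model is a guess.
    if ffn_hidden is None:
        ffn_hidden = 4 * d_model
    embed = vocab * d_model              # token embedding matrix
    attn = 4 * d_model * d_model         # query, key, value, and output projections
    ffn = 3 * d_model * ffn_hidden       # SwiGLU: gate, up, and down projections
    norms = 2 * d_model                  # two RMSNorm gain vectors per block
    per_block = attn + ffn + norms
    final_norm = d_model                 # RMSNorm before the output projection
    # ASSUMPTION: the output projection is a separate matrix unless tied.
    head = 0 if tied_embeddings else vocab * d_model
    return embed + n_layers * per_block + final_norm + head


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

RoPE adds no learned parameters, and the head count does not change the total because the heads split the same d_model-wide projections.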
The model achieves 82.11% accuracy and an 82.10% F1-score on the GLUE SST-2 validation benchmark, indicating balanced precision and recall across classes.
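For readers who want to reproduce this kind of metric, here is a minimal sketch of computing accuracy and a macro-averaged F1 from label lists. The model card does not say which F1 variant was reported, so macro averaging is an assumption, and the toy labels are made up for illustration; they are not the model's predictions.

```python
def accuracy_and_macro_f1(y_true, y_pred, labels=(0, 1)):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Toy data only (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
acc, f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

When errors are spread evenly across both classes, as in the toy data, accuracy and macro F1 coincide, which is consistent with the near-identical scores reported above.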
Quick Start (Google Colab / Local)
Because the model uses a custom architecture, you must download the provided Python scripts (config.py, model.py) alongside the weights to run it. The steps below can be pasted into a Google Colab notebook and take about a minute to run.
# 1. Install required libraries (notebook syntax; drop the leading "!" in a shell)
!pip install -q huggingface_hub tokenizers torch

# 2. Download files from Hugging Face
from huggingface_hub import hf_hub_download
import torch

repo_id = "khairul5/VibeCheck-22M"
print("Downloading model files...")
for file in ["config.py", "model.py", "tokenizer.json", "vibecheck_22m.pt"]:
    hf_hub_download(repo_id=repo_id, filename=file, local_dir=".")

# 3. Initialize Model and Tokenizer
from tokenizers import Tokenizer
from config import LMConfig
from model import TransformerLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = Tokenizer.from_file("tokenizer.json")
config = LMConfig()
model = TransformerLM(config).to(device)

# Load the trained weights.
# weights_only=False unpickles arbitrary objects, so only load checkpoints you trust.
checkpoint = torch.load("vibecheck_22m.pt", map_location=device, weights_only=False)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# 4. Inference Function
@torch.no_grad()
def check_vibe(text):
    prompt = f"Text: {text}\nSentiment: "
    input_ids = tokenizer.encode(prompt).ids
    x = torch.tensor([input_ids], dtype=torch.long).to(device)
    generated_tokens = []
    # Greedy decoding: generate at most 5 tokens, stopping at end-of-text or newline
    for _ in range(5):
        logits, _ = model(x)
        next_token = torch.argmax(logits[0, -1, :]).item()
        generated_tokens.append(next_token)
        x = torch.cat((x, torch.tensor([[next_token]], device=device)), dim=1)
        current_text = tokenizer.decode(generated_tokens)
        if "<|endoftext|>" in current_text or "\n" in current_text:
            break
    ans = tokenizer.decode(generated_tokens).replace("<|endoftext|>", "").strip()
    return f"Input: '{text}'\nVibe: {ans}\n"

# Test it!
print(check_vibe("This custom model is absolutely fantastic!"))
# Expected vibe: Positive
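The loop inside `check_vibe` is plain greedy decoding: score the next token, take the argmax, append it, and stop at a terminator or after a fixed budget. The sketch below isolates that logic from PyTorch so it can be read and tested on its own. The names `greedy_decode` and `next_token_scores` are hypothetical, the toy scorer is a stand-in for the real model, and unlike `check_vibe` this version stops before keeping the terminator token.

```python
from typing import Callable, List, Sequence, Set


def greedy_decode(next_token_scores: Callable[[List[int]], Sequence[float]],
                  prompt_ids: List[int],
                  stop_ids: Set[int],
                  max_new_tokens: int = 5) -> List[int]:
    """Repeatedly append the highest-scoring token until a stop id or the budget."""
    ids = list(prompt_ids)
    generated: List[int] = []
    for _ in range(max_new_tokens):
        scores = next_token_scores(ids)
        next_id = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if next_id in stop_ids:
            break
        generated.append(next_id)
        ids.append(next_id)
    return generated


def toy_scores(ids: List[int]) -> List[float]:
    """Stand-in scorer over a 4-token vocabulary: always prefers last id + 1."""
    target = (ids[-1] + 1) % 4
    return [1.0 if i == target else 0.0 for i in range(4)]


out = greedy_decode(toy_scores, prompt_ids=[0], stop_ids={3})
print(out)
```

With the real model, `next_token_scores` would wrap a forward pass and return the logits for the last position.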
Training Details
Training ran in two phases, the first for general language ability and the second for task accuracy:
- Phase 1 (Pre-training): The model was first pre-trained on the TinyStories dataset to learn English grammar, sentence structure, and basic reasoning.
- Phase 2 (Fine-tuning): It was then fine-tuned on the SST-2 (Stanford Sentiment Treebank) dataset to specialize in sentiment classification.
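One plausible way to turn SST-2 rows into fine-tuning text is to reuse the inference prompt and append the label. The preprocessing actually used for this model is not documented here, so the template, the label strings `Negative`/`Positive`, and the `<|endoftext|>` terminator are all assumptions. The label ids follow the GLUE convention (0 = negative, 1 = positive), and the example rows are made up rather than taken from the dataset.

```python
# ASSUMPTION: label strings and terminator mirror the Quick Start prompt format.
LABEL_NAMES = {0: "Negative", 1: "Positive"}  # GLUE SST-2 label ids
EOS = "<|endoftext|>"


def format_example(sentence: str, label: int) -> str:
    """Render one (sentence, label) pair as a single training string."""
    return f"Text: {sentence.strip()}\nSentiment: {LABEL_NAMES[label]}{EOS}"


# Made-up rows in the shape of SST-2 records (fields: sentence, label).
rows = [
    {"sentence": "a warm and funny story ", "label": 1},
    {"sentence": "slow , dull and far too long ", "label": 0},
]
training_texts = [format_example(r["sentence"], r["label"]) for r in rows]
for text in training_texts:
    print(repr(text))
```

Ending each example with a terminator gives the model a clear signal to stop after the label, which is what the inference loop checks for.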
Limitations & Bias
- Context Limit: Optimized for short sentences and paragraphs (up to 256 tokens, including the prompt template). Longer texts may be truncated.
- Task Specificity: Although built as an autoregressive causal LM, it is fine-tuned to output only sentiment labels. It is not designed for general chat or Q&A.
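One way to respect the 256-token limit is to truncate only the user text, keep the template tokens intact, and leave a few positions free for the generated label. The sketch below works on plain lists of token ids. The name `fit_to_context` is hypothetical, and the reserve of 5 tokens simply mirrors the loop bound in the Quick Start code; neither is repository behavior.

```python
def fit_to_context(prefix_ids, text_ids, suffix_ids, max_context=256, reserve=5):
    """Truncate text_ids so prefix + text + suffix + reserve fits in max_context."""
    budget = max_context - reserve - len(prefix_ids) - len(suffix_ids)
    if budget < 0:
        raise ValueError("template alone exceeds the context window")
    return list(prefix_ids) + list(text_ids[:budget]) + list(suffix_ids)


# Dummy token ids standing in for the encoded "Text: " prefix, a long input,
# and the encoded "\nSentiment: " suffix.
prefix = [1, 2]
suffix = [3, 4, 5]
long_text = list(range(100, 500))  # 400 tokens, too long on its own
fitted = fit_to_context(prefix, long_text, suffix)
print(len(fitted))
```

Note that encoding the prefix, text, and suffix separately can tokenize differently from encoding the joined string, so check the result against the tokenizer before relying on this in practice.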
Acknowledgments & Citation
Built with passion by Md. Khairul Islam. The architecture draws heavy inspiration from the Llama series by Meta.
If you use this model in your research or project, please credit the repository:
@misc{vibecheck22m,
  author = {Islam, Md. Khairul},
  title = {VibeCheck-22M: Custom Sentiment Analysis AI},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Repository},
  howpublished = {\url{https://huggingface.co/khairul5/VibeCheck-22M}}
}