Slicky325/token-selector-model

Slicky325 committed on Dec 16, 2025

Commit 4bafda2 · verified · 1 parent: d9e4845

Upload README.md with huggingface_hub

Files changed (1)

README.md +2 -77

@@ -1,85 +1,10 @@
----
-tags:
-- token-importance
-- attention-classifier
-- llama
----
 # Token Importance Classifier
-This model is a single-layer attention-based classifier trained to predict token importance in sequences.
-## Model Details
-- **Architecture**: Single-layer self-attention network with RoPE positional embeddings
-- **Base Model**: meta-llama/Llama-3.1-8B
-- **Hidden Dimension**: 4096
-- **Number of Heads**: 32
-- **Max Sequence Length**: 131072
-## Training Configuration
-```yaml
-data:
-  max_seq_len: 131072
-  path: /root/workspace/data_generation/data/sample_output.jsonl
-  tokenizer_path: meta-llama/Llama-3.1-8B
-  valid_split: 0.1
-final_metrics:
-  accuracy: 0.8365938756296772
-  f1: 0.9094284550391643
-  precision: 0.8365938756296772
-  recall: 1.0
-huggingface:
-  private: false
-  push_to_hub: true
-  repo_id: Slicky325/token-selector-model
-model:
-  base_model_dir: meta-llama/Llama-3.1-8B
-  dropout: 0.1
-  hidden_dim: 4096
-  max_seq_len: 131072
-  num_heads: 32
-  rope_theta: 500000
-  save_embeddings: false
-  save_path: models/selector.pt
-  train_embeddings: false
-  use_positional: true
-system:
-  device: cuda
-  num_workers: 2
-training:
-  batch_size: 4
-  epochs: 1
-  grad_clip: 1.0
-  learning_rate: 0.001
-  seed: 42
-  weight_decay: 0.0
-```
-## Validation Metrics
-- **Accuracy**: 0.8365938756296772
-- **Precision**: 0.8365938756296772
-- **Recall**: 1.0
-- **F1 Score**: 0.9094284550391643
 ## Usage
 ```python
 import torch
-from pathlib import Path
-# Load the checkpoint
 checkpoint = torch.load('selector.pt')
-model_state = checkpoint['model_state_dict']
-config = checkpoint['config']
-# Initialize your model architecture and load the weights
-# model.load_state_dict(model_state)
 ```
-## Citation
-If you use this model in your research, please cite appropriately.

 # Token Importance Classifier
+Trained with F1: 0.9094, Accuracy: 0.8366
 ## Usage
 ```python
 import torch
 checkpoint = torch.load('selector.pt')
+model.load_state_dict(checkpoint['model_state_dict'])
 ```
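
Note on the usage snippet: after this commit the README calls `model.load_state_dict(...)` without constructing `model`, and the step that read `checkpoint['config']` is gone. The following is a minimal sketch of one way the load could look, not the author's implementation. `TokenSelector` is a hypothetical stand-in class (the real architecture, including its RoPE handling, is not shown in this diff), and the nested `config["model"]` layout is an assumption that mirrors the YAML in the removed README. The round trip uses a tiny dummy checkpoint so it runs without the real `selector.pt`.

```python
import os
import tempfile

import torch
from torch import nn


class TokenSelector(nn.Module):
    """Hypothetical stand-in; the real selector architecture is not shown in this diff."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x)
        return self.head(out).squeeze(-1)  # one importance logit per token


def load_selector(path: str) -> TokenSelector:
    # map_location="cpu" lets a CUDA-trained checkpoint load on a CPU-only machine.
    checkpoint = torch.load(path, map_location="cpu")
    # Assumed layout: config["model"] holds hidden_dim and num_heads, as in the old YAML.
    model_cfg = checkpoint["config"]["model"]
    model = TokenSelector(model_cfg["hidden_dim"], model_cfg["num_heads"])
    model.load_state_dict(checkpoint["model_state_dict"])
    model.eval()  # disable dropout for inference
    return model


# Round trip with a tiny dummy checkpoint (sizes are illustrative, not the real 4096/32).
tiny_config = {"model": {"hidden_dim": 16, "num_heads": 4}}
reference = TokenSelector(**tiny_config["model"])
with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "selector.pt")
    torch.save(
        {"model_state_dict": reference.state_dict(), "config": tiny_config},
        ckpt_path,
    )
    restored = load_selector(ckpt_path)

with torch.no_grad():
    logits = restored(torch.randn(2, 5, tiny_config["model"]["hidden_dim"]))
```

Building the model from the stored config before calling `load_state_dict` keeps the layer shapes in step with the saved weights. If the real checkpoint stores its config differently, only the two lookups in `load_selector` would need to change.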