Upload intent classifier v2 - Accuracy: 100.00%
- README.md +144 -0
- checkpoint-220/config.json +54 -0
- checkpoint-220/model.safetensors +3 -0
- checkpoint-220/optimizer.pt +3 -0
- checkpoint-220/rng_state.pth +3 -0
- checkpoint-220/scheduler.pt +3 -0
- checkpoint-220/special_tokens_map.json +7 -0
- checkpoint-220/tokenizer.json +0 -0
- checkpoint-220/tokenizer_config.json +56 -0
- checkpoint-220/trainer_state.json +119 -0
- checkpoint-220/training_args.bin +3 -0
- checkpoint-220/vocab.txt +0 -0
- checkpoint-275/config.json +54 -0
- checkpoint-275/model.safetensors +3 -0
- checkpoint-275/optimizer.pt +3 -0
- checkpoint-275/rng_state.pth +3 -0
- checkpoint-275/scheduler.pt +3 -0
- checkpoint-275/special_tokens_map.json +7 -0
- checkpoint-275/tokenizer.json +0 -0
- checkpoint-275/tokenizer_config.json +56 -0
- checkpoint-275/trainer_state.json +138 -0
- checkpoint-275/training_args.bin +3 -0
- checkpoint-275/vocab.txt +0 -0
- config.json +54 -0
- label_encoder.pkl +3 -0
- model.safetensors +3 -0
- special_tokens_map.json +7 -0
- tokenizer.json +0 -0
- tokenizer_config.json +56 -0
- training_args.bin +3 -0
- training_results.json +37 -0
- vocab.txt +0 -0
README.md
ADDED
---
language: en
license: apache-2.0
tags:
- text-classification
- intent-classification
- conversational-ai
- bert
- distilbert
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: intent-classifier-v2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - type: accuracy
      value: 1.0000
      name: Test Accuracy
    - type: f1
      value: 1.0000
      name: Weighted F1
---

# DAPA Intent Classifier v2

## Model Description

This model classifies user intents for the DAPA AI conversational assistant system. It supports 13 intents: 12 agentic workflows plus 1 general Q&A fallback.

- **Model Type:** DistilBERT for Sequence Classification
- **Training Date:** 2025-10-25
- **Accuracy:** 100.00%
- **F1 Score:** 1.0000

## Supported Intents

The model classifies queries into 13 intents:

### Agentic Intents (12)
1. **generate-offer** - Generate job offers, NDAs, contracts
2. **schedule-interview** - Schedule candidate interviews
3. **update-employee-profile** - Update employee information
4. **access-employee-record** - Access employee records
5. **approve-expense** - Approve expense reports
6. **check-leave-balance** - Check leave balances
7. **confirm-training-completion** - Confirm training completion
8. **provide-candidate-feedback** - Provide candidate feedback
9. **request-leave** - Request time off
10. **request-training** - Request training enrollment
11. **review-contract** - Review contracts
12. **submit-expense** - Submit expense reports

### Q&A Intent (1)
13. **general-query** - Generic queries (email lookups, status checks, policy questions)

## Training Data

- **Total Samples:** 1,240
- **Training Split:** 70% (868 samples)
- **Validation Split:** 15% (186 samples)
- **Test Split:** 15% (186 samples)
- **Data Balance:** 80-200 examples per intent

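As a sanity check, the sample counts above follow directly from the stated fractions; the `split_sizes` helper below is illustrative, not part of the training code.

```python
def split_sizes(total, train_frac=0.70, val_frac=0.15):
    """Return (train, val, test) counts for a 70/15/15 split.

    The test set takes the remainder so the three parts sum to total.
    """
    train = round(total * train_frac)
    val = round(total * val_frac)
    return train, val, total - train - val

print(split_sizes(1240))  # (868, 186, 186), matching the splits above
```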
## Performance

### Overall Metrics
- **Test Accuracy:** 100.00%
- **Weighted Precision:** 1.0000
- **Weighted Recall:** 1.0000
- **Weighted F1:** 1.0000

### Per-Intent Performance

| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| access-employee-record | 1.000 | 1.000 | 1.000 | 12 |
| approve-expense | 1.000 | 1.000 | 1.000 | 12 |
| check-leave-balance | 1.000 | 1.000 | 1.000 | 12 |
| confirm-training-completion | 1.000 | 1.000 | 1.000 | 12 |
| general-query | 1.000 | 1.000 | 1.000 | 30 |
| generate-offer | 1.000 | 1.000 | 1.000 | 18 |
| provide-candidate-feedback | 1.000 | 1.000 | 1.000 | 12 |
| request-leave | 1.000 | 1.000 | 1.000 | 12 |
| request-training | 1.000 | 1.000 | 1.000 | 12 |
| review-contract | 1.000 | 1.000 | 1.000 | 12 |
| schedule-interview | 1.000 | 1.000 | 1.000 | 15 |
| submit-expense | 1.000 | 1.000 | 1.000 | 12 |
| update-employee-profile | 1.000 | 1.000 | 1.000 | 15 |

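The Weighted F1 reported above is the support-weighted average of the per-intent F1 scores. A minimal illustration (the `weighted_f1` helper is hypothetical, not from the evaluation script):

```python
def weighted_f1(f1_by_intent, support_by_intent):
    """Support-weighted average of per-class F1 scores."""
    total = sum(support_by_intent.values())
    return sum(f1 * support_by_intent[name]
               for name, f1 in f1_by_intent.items()) / total

# Supports from the per-intent table (186 test samples in total).
support = {"general-query": 30, "generate-offer": 18,
           "schedule-interview": 15, "update-employee-profile": 15}
support.update({name: 12 for name in [
    "access-employee-record", "approve-expense", "check-leave-balance",
    "confirm-training-completion", "provide-candidate-feedback",
    "request-leave", "request-training", "review-contract",
    "submit-expense"]})
f1 = {name: 1.0 for name in support}  # every per-intent F1 is 1.000
print(weighted_f1(f1, support))
```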
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("SantmanKT/intent-classifier-v2")
model = AutoModelForSequenceClassification.from_pretrained("SantmanKT/intent-classifier-v2")
model.eval()

# Predict intent; queries are formatted as "<query> [context: {...}]"
query = "send offer to John"
context = "{domain: hr}"
input_text = f"{query} [context: {context}]"

inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=-1)
confidence, predicted_idx = torch.max(probs, dim=-1)

intent = model.config.id2label[predicted_idx.item()]
print(f"Intent: {intent}, Confidence: {confidence.item():.2%}")
```

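Note that `config.json` in this commit maps ids to the generic names `LABEL_0` through `LABEL_12`, so `model.config.id2label` returns a placeholder rather than an intent string. The commit also ships `label_encoder.pkl`; assuming it is a fitted scikit-learn `LabelEncoder` (its alphabetical class order matches `class_names` in `training_results.json`), the id-to-intent mapping can be recovered as sketched below. This is an assumption-based sketch, not part of the published snippet.

```python
import pickle

def id_to_intent(predicted_idx, class_names):
    """Map a predicted class index to its intent string."""
    return class_names[predicted_idx]

# The 13 class names from training_results.json, in alphabetical order
# (the ordering a sklearn LabelEncoder would produce).
CLASS_NAMES = [
    "access-employee-record", "approve-expense", "check-leave-balance",
    "confirm-training-completion", "general-query", "generate-offer",
    "provide-candidate-feedback", "request-leave", "request-training",
    "review-contract", "schedule-interview", "submit-expense",
    "update-employee-profile",
]

# If the repo artifact is available, load the mapping from it instead:
# with open("label_encoder.pkl", "rb") as f:
#     CLASS_NAMES = list(pickle.load(f).classes_)

print(id_to_intent(4, CLASS_NAMES))  # general-query
```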
## Routing Logic

- **High Confidence (≥70%) + Agentic Intent** → Route to Domain Service
- **Low Confidence (<70%) OR general-query** → Route to Q&A Service

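The two rules above can be sketched as a single routing function; the function name and service identifiers are illustrative assumptions, not taken from the DAPA codebase.

```python
# The 12 agentic intents; anything else (i.e. general-query) goes to Q&A.
AGENTIC_INTENTS = {
    "generate-offer", "schedule-interview", "update-employee-profile",
    "access-employee-record", "approve-expense", "check-leave-balance",
    "confirm-training-completion", "provide-candidate-feedback",
    "request-leave", "request-training", "review-contract", "submit-expense",
}
CONFIDENCE_THRESHOLD = 0.70

def route(intent: str, confidence: float) -> str:
    """Send only confident agentic predictions to the domain service."""
    if intent in AGENTIC_INTENTS and confidence >= CONFIDENCE_THRESHOLD:
        return "domain-service"
    return "qa-service"

print(route("generate-offer", 0.95))  # domain-service
print(route("general-query", 0.99))   # qa-service
print(route("submit-expense", 0.40))  # qa-service
```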
## Model Details

- **Base Model:** distilbert-base-uncased
- **Max Sequence Length:** 128 tokens
- **Training Epochs:** 5
- **Batch Size:** 16
- **Learning Rate:** 2e-5
- **Framework:** HuggingFace Transformers

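The checkpoint names in this commit follow from the hyperparameters above: 868 training samples at batch size 16 give ceil(868/16) = 55 optimizer steps per epoch, so epoch 4 ends at step 220 (checkpoint-220) and epoch 5 at step 275 (checkpoint-275, matching `max_steps` in trainer_state.json). A quick arithmetic check:

```python
import math

def steps_per_epoch(num_train_samples, batch_size):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_train_samples / batch_size)

spe = steps_per_epoch(868, 16)
print(spe, 4 * spe, 5 * spe)  # 55 220 275
```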
## Limitations

- Optimized for English-language queries only
- Requires context formatting: `[context: {...}]`
- Performance may degrade on queries significantly different from the training data

## Citation

If you use this model, please cite:

```bibtex
@misc{dapa-intent-classifier-v2,
  author    = {SantmanKT},
  title     = {DAPA Intent Classifier v2},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/SantmanKT/intent-classifier-v2}
}
```

## License

Apache 2.0
checkpoint-220/config.json
ADDED
```json
{
  "activation": "gelu",
  "architectures": ["DistilBertForSequenceClassification"],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "dtype": "float32",
  "hidden_dim": 3072,
  "id2label": {
    "0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2", "3": "LABEL_3",
    "4": "LABEL_4", "5": "LABEL_5", "6": "LABEL_6", "7": "LABEL_7",
    "8": "LABEL_8", "9": "LABEL_9", "10": "LABEL_10", "11": "LABEL_11",
    "12": "LABEL_12"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2, "LABEL_3": 3, "LABEL_4": 4,
    "LABEL_5": 5, "LABEL_6": 6, "LABEL_7": 7, "LABEL_8": 8, "LABEL_9": 9,
    "LABEL_10": 10, "LABEL_11": 11, "LABEL_12": 12
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.57.1",
  "vocab_size": 30522
}
```
checkpoint-220/model.safetensors
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:7271f7943cae385cf6983f9b28593d0b455415275ce51c6d3e8d62c888103cf0
size 267866404
```
checkpoint-220/optimizer.pt
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:a881b17de8775bd85d9f400a6d7e46096a50a44b51a59286a041341eb75004dc
size 535796811
```
checkpoint-220/rng_state.pth
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:80b137b2f227cdf353434f493e3c7ba766b954350d79b48a45c8463422ff4eff
size 14645
```
checkpoint-220/scheduler.pt
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:f734eabbfe87eaa3ac495e780d550b3e9bd337494d9a3caea639822a8fea66dc
size 1465
```
checkpoint-220/special_tokens_map.json
ADDED
```json
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
```
checkpoint-220/tokenizer.json
ADDED
The diff for this file is too large to render.
checkpoint-220/tokenizer_config.json
ADDED
```json
{
  "added_tokens_decoder": {
    "0":   {"content": "[PAD]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "100": {"content": "[UNK]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "101": {"content": "[CLS]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "102": {"content": "[SEP]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "103": {"content": "[MASK]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true}
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
```
checkpoint-220/trainer_state.json
ADDED
```json
{
  "best_global_step": 220,
  "best_metric": 0.9837687666378767,
  "best_model_checkpoint": "intent_classifier_v2/checkpoint-220",
  "epoch": 4.0,
  "eval_steps": 500,
  "global_step": 220,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {"epoch": 0.9090909090909091, "grad_norm": 1.5899983644485474, "learning_rate": 1.9600000000000003e-06, "loss": 2.5642, "step": 50},
    {"epoch": 1.0, "eval_accuracy": 0.1881720430107527, "eval_f1": 0.10086428092694365, "eval_loss": 2.5428154468536377, "eval_precision": 0.0936228251427814, "eval_recall": 0.1881720430107527, "eval_runtime": 0.1839, "eval_samples_per_second": 1011.503, "eval_steps_per_second": 65.258, "step": 55},
    {"epoch": 1.8181818181818183, "grad_norm": 3.272284984588623, "learning_rate": 3.96e-06, "loss": 2.5118, "step": 100},
    {"epoch": 2.0, "eval_accuracy": 0.26344086021505375, "eval_f1": 0.17357707395658062, "eval_loss": 2.3705923557281494, "eval_precision": 0.23162059134445975, "eval_recall": 0.26344086021505375, "eval_runtime": 0.202, "eval_samples_per_second": 920.577, "eval_steps_per_second": 59.392, "step": 110},
    {"epoch": 2.7272727272727275, "grad_norm": 4.249340057373047, "learning_rate": 5.9600000000000005e-06, "loss": 2.303, "step": 150},
    {"epoch": 3.0, "eval_accuracy": 0.8064516129032258, "eval_f1": 0.7699332427341867, "eval_loss": 1.9005680084228516, "eval_precision": 0.8112076095947064, "eval_recall": 0.8064516129032258, "eval_runtime": 0.18, "eval_samples_per_second": 1033.189, "eval_steps_per_second": 66.657, "step": 165},
    {"epoch": 3.6363636363636362, "grad_norm": 4.2065110206604, "learning_rate": 7.960000000000002e-06, "loss": 1.8712, "step": 200},
    {"epoch": 4.0, "eval_accuracy": 0.9838709677419355, "eval_f1": 0.9837687666378767, "eval_loss": 1.1924996376037598, "eval_precision": 0.984740928604229, "eval_recall": 0.9838709677419355, "eval_runtime": 0.1796, "eval_samples_per_second": 1035.622, "eval_steps_per_second": 66.814, "step": 220}
  ],
  "logging_steps": 50,
  "max_steps": 275,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 5,
  "save_steps": 500,
  "stateful_callbacks": {
    "EarlyStoppingCallback": {
      "args": {"early_stopping_patience": 2, "early_stopping_threshold": 0.0},
      "attributes": {"early_stopping_patience_counter": 0}
    },
    "TrainerControl": {
      "args": {"should_epoch_stop": false, "should_evaluate": false, "should_log": false, "should_save": true, "should_training_stop": false},
      "attributes": {}
    }
  },
  "total_flos": 27852593715744.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
```
checkpoint-220/training_args.bin
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:7161c9b09c8630acc22aba2b488c51d515d356f1c8d06aae95857631447c08a5
size 5777
```
checkpoint-220/vocab.txt
ADDED
The diff for this file is too large to render.
checkpoint-275/config.json
ADDED
```json
{
  "activation": "gelu",
  "architectures": ["DistilBertForSequenceClassification"],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "dtype": "float32",
  "hidden_dim": 3072,
  "id2label": {
    "0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2", "3": "LABEL_3",
    "4": "LABEL_4", "5": "LABEL_5", "6": "LABEL_6", "7": "LABEL_7",
    "8": "LABEL_8", "9": "LABEL_9", "10": "LABEL_10", "11": "LABEL_11",
    "12": "LABEL_12"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2, "LABEL_3": 3, "LABEL_4": 4,
    "LABEL_5": 5, "LABEL_6": 6, "LABEL_7": 7, "LABEL_8": 8, "LABEL_9": 9,
    "LABEL_10": 10, "LABEL_11": 11, "LABEL_12": 12
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.57.1",
  "vocab_size": 30522
}
```
checkpoint-275/model.safetensors
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:bf565fcda041fac56c09c0e315084bcb47682925c0c57096a43a29342e94bdce
size 267866404
```
checkpoint-275/optimizer.pt
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:0f7b1d1db774ff8d88a88ffbaf3ffca6ad7a3a952031832762c71d415b791e69
size 535796811
```
checkpoint-275/rng_state.pth
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:85a3f5aa2ce3b77be772ceec4ad0d0619c7aef5028d836b01431f1a826218479
size 14645
```
checkpoint-275/scheduler.pt
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:e29603040e49cd81de77c77d5b6453279762d8f6af23c4e143b794082920649e
size 1465
```
checkpoint-275/special_tokens_map.json
ADDED
```json
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
```
checkpoint-275/tokenizer.json
ADDED
The diff for this file is too large to render.
checkpoint-275/tokenizer_config.json
ADDED
```json
{
  "added_tokens_decoder": {
    "0":   {"content": "[PAD]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "100": {"content": "[UNK]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "101": {"content": "[CLS]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "102": {"content": "[SEP]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "103": {"content": "[MASK]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true}
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
```
checkpoint-275/trainer_state.json
ADDED
```json
{
  "best_global_step": 275,
  "best_metric": 1.0,
  "best_model_checkpoint": "intent_classifier_v2/checkpoint-275",
  "epoch": 5.0,
  "eval_steps": 500,
  "global_step": 275,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {"epoch": 0.9090909090909091, "grad_norm": 1.5899983644485474, "learning_rate": 1.9600000000000003e-06, "loss": 2.5642, "step": 50},
    {"epoch": 1.0, "eval_accuracy": 0.1881720430107527, "eval_f1": 0.10086428092694365, "eval_loss": 2.5428154468536377, "eval_precision": 0.0936228251427814, "eval_recall": 0.1881720430107527, "eval_runtime": 0.1839, "eval_samples_per_second": 1011.503, "eval_steps_per_second": 65.258, "step": 55},
    {"epoch": 1.8181818181818183, "grad_norm": 3.272284984588623, "learning_rate": 3.96e-06, "loss": 2.5118, "step": 100},
    {"epoch": 2.0, "eval_accuracy": 0.26344086021505375, "eval_f1": 0.17357707395658062, "eval_loss": 2.3705923557281494, "eval_precision": 0.23162059134445975, "eval_recall": 0.26344086021505375, "eval_runtime": 0.202, "eval_samples_per_second": 920.577, "eval_steps_per_second": 59.392, "step": 110},
    {"epoch": 2.7272727272727275, "grad_norm": 4.249340057373047, "learning_rate": 5.9600000000000005e-06, "loss": 2.303, "step": 150},
    {"epoch": 3.0, "eval_accuracy": 0.8064516129032258, "eval_f1": 0.7699332427341867, "eval_loss": 1.9005680084228516, "eval_precision": 0.8112076095947064, "eval_recall": 0.8064516129032258, "eval_runtime": 0.18, "eval_samples_per_second": 1033.189, "eval_steps_per_second": 66.657, "step": 165},
    {"epoch": 3.6363636363636362, "grad_norm": 4.2065110206604, "learning_rate": 7.960000000000002e-06, "loss": 1.8712, "step": 200},
    {"epoch": 4.0, "eval_accuracy": 0.9838709677419355, "eval_f1": 0.9837687666378767, "eval_loss": 1.1924996376037598, "eval_precision": 0.984740928604229, "eval_recall": 0.9838709677419355, "eval_runtime": 0.1796, "eval_samples_per_second": 1035.622, "eval_steps_per_second": 66.814, "step": 220},
    {"epoch": 4.545454545454545, "grad_norm": 3.407564163208008, "learning_rate": 9.960000000000001e-06, "loss": 1.2426, "step": 250},
    {"epoch": 5.0, "eval_accuracy": 1.0, "eval_f1": 1.0, "eval_loss": 0.5706155300140381, "eval_precision": 1.0, "eval_recall": 1.0, "eval_runtime": 0.1958, "eval_samples_per_second": 950.01, "eval_steps_per_second": 61.291, "step": 275}
  ],
  "logging_steps": 50,
  "max_steps": 275,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 5,
  "save_steps": 500,
  "stateful_callbacks": {
    "EarlyStoppingCallback": {
      "args": {"early_stopping_patience": 2, "early_stopping_threshold": 0.0},
      "attributes": {"early_stopping_patience_counter": 0}
    },
    "TrainerControl": {
      "args": {"should_epoch_stop": false, "should_evaluate": false, "should_log": false, "should_save": true, "should_training_stop": true},
      "attributes": {}
    }
  },
  "total_flos": 34815742144680.0,
  "train_batch_size": 16,
  "trial_name": null,
  "trial_params": null
}
```
checkpoint-275/training_args.bin
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:7161c9b09c8630acc22aba2b488c51d515d356f1c8d06aae95857631447c08a5
size 5777
```
checkpoint-275/vocab.txt
ADDED
The diff for this file is too large to render.
config.json
ADDED
```json
{
  "activation": "gelu",
  "architectures": ["DistilBertForSequenceClassification"],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "dtype": "float32",
  "hidden_dim": 3072,
  "id2label": {
    "0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2", "3": "LABEL_3",
    "4": "LABEL_4", "5": "LABEL_5", "6": "LABEL_6", "7": "LABEL_7",
    "8": "LABEL_8", "9": "LABEL_9", "10": "LABEL_10", "11": "LABEL_11",
    "12": "LABEL_12"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2, "LABEL_3": 3, "LABEL_4": 4,
    "LABEL_5": 5, "LABEL_6": 6, "LABEL_7": 7, "LABEL_8": 8, "LABEL_9": 9,
    "LABEL_10": 10, "LABEL_11": 11, "LABEL_12": 12
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.57.1",
  "vocab_size": 30522
}
```
label_encoder.pkl
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:549d2b82bcefa38e87aa8a183c4936ab261263b64c055ba903b32b819049dc6d
size 517
```
model.safetensors
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:bf565fcda041fac56c09c0e315084bcb47682925c0c57096a43a29342e94bdce
size 267866404
```
special_tokens_map.json
ADDED
```json
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
```
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
```json
{
  "added_tokens_decoder": {
    "0":   {"content": "[PAD]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "100": {"content": "[UNK]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "101": {"content": "[CLS]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "102": {"content": "[SEP]",  "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true},
    "103": {"content": "[MASK]", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true}
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "DistilBertTokenizer",
  "unk_token": "[UNK]"
}
```
training_args.bin
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:7161c9b09c8630acc22aba2b488c51d515d356f1c8d06aae95857631447c08a5
size 5777
```
training_results.json
ADDED
```json
{
  "timestamp": "2025-10-25T19:48:08.305153",
  "model_name": "distilbert-base-uncased",
  "num_classes": 13,
  "class_names": [
    "access-employee-record", "approve-expense", "check-leave-balance",
    "confirm-training-completion", "general-query", "generate-offer",
    "provide-candidate-feedback", "request-leave", "request-training",
    "review-contract", "schedule-interview", "submit-expense",
    "update-employee-profile"
  ],
  "test_accuracy": 1.0,
  "test_weighted_f1": 1.0,
  "per_class_f1": {
    "access-employee-record": 1.0, "approve-expense": 1.0,
    "check-leave-balance": 1.0, "confirm-training-completion": 1.0,
    "general-query": 1.0, "generate-offer": 1.0,
    "provide-candidate-feedback": 1.0, "request-leave": 1.0,
    "request-training": 1.0, "review-contract": 1.0,
    "schedule-interview": 1.0, "submit-expense": 1.0,
    "update-employee-profile": 1.0
  }
}
```
vocab.txt
ADDED
The diff for this file is too large to render.