santman/distilbert

Files changed (8)

README.md +21 -50
config.json +14 -16
model.safetensors +2 -2
special_tokens_map.json +7 -0
tokenizer.json +0 -0
tokenizer_config.json +56 -0
training_args.bin +1 -1
vocab.txt +0 -0

README.md CHANGED

@@ -1,18 +1,16 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: bert-base-uncased
 tags:
 - generated_from_trainer
-model-index:
-- name: results
-  results: []
-datasets:
-- SantmanKT/hr-intent-dataset
 metrics:
 - accuracy
 - precision
 - recall
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -20,53 +18,26 @@ should probably proofread and complete it, then remove this comment. -->
 # results
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an SantmanKT/hr-intent-dataset .
 It achieves the following results on the evaluation set:
-- Loss: 1.0347
 ## Model description
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) for intent classification in HR workflows.
-It takes a merged user query and context string as input and predicts the correct HR intent label (e.g., generate-offer, check-leave-balance, etc.).
 ## Intended uses & limitations
-Intended uses
-- Automating HR assistants, chatbots, and workflow engines to map employee queries to pre-defined HR actions.
-Limitations
-- Trained only on enterprise HR dataset with limited intent classes.
-- English only; not robust to out-of-domain (non-HR) queries.
 ## Training and evaluation data
-**Data:**
-- 133 rows of labeled HR queries covering 12 intent classes.
-- Each sample: text = user query + context, label = HR intent.
-- 80% train, 20% validation split, stratified by label.
 ## Training procedure
-## Evaluation results (on validation set)
-- **Accuracy**: 96.3%
-- **Weighted Precision**: 97.5%
-- **Weighted Recall**: 96.3%
-- **Weighted F1**: 97.0%
-### Per-class metrics
-| Intent                    | Precision | Recall | F1  | Support |
-|---------------------------|-----------|--------|-----|---------|
-| generate-offer            | 1.00      | 1.00   | 1.00| 4       |
-| review-contract           | 1.00      | 0.75   | 0.86| 4       |
-| ...                       | ...       | ...    | ... | ...     |
-### Example predictions
-- **Input:** I need a vacation from June 10 to 12. [context: {domain: HR, topic: leave management, subject: leave request}]
-- **Prediction:** request-leave
 ### Training hyperparameters
@@ -81,18 +52,18 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.5034        | 1.0   | 14   | 2.2755          |
-| 2.2674        | 2.0   | 28   | 1.8394          |
-| 1.6722        | 3.0   | 42   | 1.4040          |
-| 1.3466        | 4.0   | 56   | 1.1097          |
-| 1.1469        | 5.0   | 70   | 1.0347          |
 ### Framework versions
-- Transformers 4.54.1
 - Pytorch 2.6.0+cu124
 - Datasets 4.0.0
-- Tokenizers 0.21.4

 ---
 library_name: transformers
 license: apache-2.0
+base_model: distilbert-base-uncased
 tags:
 - generated_from_trainer
 metrics:
 - accuracy
 - precision
 - recall
+model-index:
+- name: results
+  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # results
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.4586
+- Accuracy: 0.8889
+- Precision: 1.0
+- Recall: 0.8889
 ## Model description
+More information needed
 ## Intended uses & limitations
+More information needed
 ## Training and evaluation data
+More information needed
 ## Training procedure
 ### Training hyperparameters
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|
+| 2.497         | 1.0   | 14   | 2.4813          | 0.1111   | 0.0617    | 0.1111 |
+| 2.3512        | 2.0   | 28   | 2.1629          | 0.4444   | 0.6222    | 0.4444 |
+| 1.7293        | 3.0   | 42   | 1.8070          | 0.8148   | 0.8148    | 0.8148 |
+| 1.4604        | 4.0   | 56   | 1.5398          | 0.8148   | 0.8148    | 0.8148 |
+| 1.1833        | 5.0   | 70   | 1.4586          | 0.8889   | 1.0       | 0.8889 |
 ### Framework versions
+- Transformers 4.54.0
 - Pytorch 2.6.0+cu124
 - Datasets 4.0.0
+- Tokenizers 0.21.2
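
The accuracy column in the new training table can be sanity-checked against the validation split described in the removed README text (133 rows, 20% held out). The sketch below assumes the DistilBERT run used the same split and that the validation set therefore holds 27 examples; `VAL_SIZE` is that assumption, not a value stated in the new model card.

```python
# Assumption: 20% of 133 rows, stratified, yields a 27-example validation set.
VAL_SIZE = 27

# Accuracy per epoch, copied from the added training-results table.
accuracy_by_epoch = {1: 0.1111, 2: 0.4444, 3: 0.8148, 4: 0.8148, 5: 0.8889}


def implied_correct(accuracy: float, n_examples: int) -> int:
    """Return the whole number of correct predictions implied by a rounded accuracy."""
    return round(accuracy * n_examples)


correct_by_epoch = {
    epoch: implied_correct(acc, VAL_SIZE) for epoch, acc in accuracy_by_epoch.items()
}

for epoch, correct in correct_by_epoch.items():
    reported = accuracy_by_epoch[epoch]
    # Each reported value should match correct / VAL_SIZE to four decimals.
    assert abs(correct / VAL_SIZE - reported) < 1e-4
    print(f"epoch {epoch}: {correct}/{VAL_SIZE} correct (reported {reported})")
```

Under that assumption every reported accuracy corresponds to a whole number of correct predictions, which is consistent with (but does not prove) the same 27-example validation set being reused.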

config.json CHANGED

@@ -1,13 +1,12 @@
 {
   "architectures": [
-    "BertForSequenceClassification"
   ],
-  "attention_probs_dropout_prob": 0.1,
-  "classifier_dropout": null,
-  "gradient_checkpointing": false,
-  "hidden_act": "gelu",
-  "hidden_dropout_prob": 0.1,
-  "hidden_size": 768,
   "id2label": {
     "0": "LABEL_0",
     "1": "LABEL_1",
@@ -23,7 +22,6 @@
     "11": "LABEL_11"
   },
   "initializer_range": 0.02,
-  "intermediate_size": 3072,
   "label2id": {
     "LABEL_0": 0,
     "LABEL_1": 1,
@@ -38,17 +36,17 @@
     "LABEL_8": 8,
     "LABEL_9": 9
   },
-  "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
-  "model_type": "bert",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
   "pad_token_id": 0,
-  "position_embedding_type": "absolute",
   "problem_type": "single_label_classification",
   "torch_dtype": "float32",
-  "transformers_version": "4.54.1",
-  "type_vocab_size": 2,
-  "use_cache": true,
   "vocab_size": 30522
 }

 {
+  "activation": "gelu",
   "architectures": [
+    "DistilBertForSequenceClassification"
   ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
   "id2label": {
     "0": "LABEL_0",
     "1": "LABEL_1",
     "11": "LABEL_11"
   },
   "initializer_range": 0.02,
   "label2id": {
     "LABEL_0": 0,
     "LABEL_1": 1,
     "LABEL_8": 8,
     "LABEL_9": 9
   },
   "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
   "pad_token_id": 0,
   "problem_type": "single_label_classification",
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
   "torch_dtype": "float32",
+  "transformers_version": "4.54.0",
   "vocab_size": 30522
 }
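
Most of the config.json changes are key renames between the BERT and DistilBERT config schemas rather than value changes. The sketch below pairs the removed BERT keys with the added DistilBERT keys and reports which values actually differ. The pairing in `BERT_TO_DISTILBERT` is inferred from the diff, and `changed_values` is a hypothetical helper written for this illustration.

```python
# Key pairing inferred from the removed (-) and added (+) lines of config.json.
BERT_TO_DISTILBERT = {
    "hidden_act": "activation",
    "attention_probs_dropout_prob": "attention_dropout",
    "hidden_dropout_prob": "dropout",
    "hidden_size": "dim",
    "intermediate_size": "hidden_dim",
    "num_attention_heads": "n_heads",
    "num_hidden_layers": "n_layers",
}

# Values copied from the removed lines.
old_config = {
    "hidden_act": "gelu",
    "attention_probs_dropout_prob": 0.1,
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "intermediate_size": 3072,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
}

# Values copied from the added lines.
new_config = {
    "activation": "gelu",
    "attention_dropout": 0.1,
    "dropout": 0.1,
    "dim": 768,
    "hidden_dim": 3072,
    "n_heads": 12,
    "n_layers": 6,
}


def changed_values(old: dict, new: dict, key_map: dict) -> dict:
    """Return {old_key: (old_value, new_value)} for paired keys whose values differ."""
    return {
        old_key: (old[old_key], new[new_key])
        for old_key, new_key in key_map.items()
        if old[old_key] != new[new_key]
    }


changed = changed_values(old_config, new_config, BERT_TO_DISTILBERT)
print(changed)  # {'num_hidden_layers': (12, 6)}
```

Among the paired keys, only the layer count changes (12 to 6). The remaining added keys (`qa_dropout`, `seq_classif_dropout`, `sinusoidal_pos_embds`, `tie_weights_`) have no counterpart among the removed BERT keys.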

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e207485364e5c1ef96b1dab87550f4b09fb0300594d6c3279055fadf6ed5c52d
-size 437989408

 version https://git-lfs.github.com/spec/v1
+oid sha256:59d11b4b218eb0ce47c768fd742fe0eac19423b2751ee59b8c9315dbcb0d767f
+size 267863328
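
The two Git LFS pointers give the checkpoint sizes before and after the change. The sketch below parses the pointer text and derives a rough size ratio and parameter estimate. `parse_lfs_pointer` is a hypothetical helper, and dividing by 4 bytes assumes all weights are float32 (as `torch_dtype` in config.json suggests) and ignores the small safetensors header, so the parameter counts are approximate.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict; 'size' is converted to int."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


# Pointer contents copied from the removed and added sides of the diff.
OLD_POINTER = """
version https://git-lfs.github.com/spec/v1
oid sha256:e207485364e5c1ef96b1dab87550f4b09fb0300594d6c3279055fadf6ed5c52d
size 437989408
"""

NEW_POINTER = """
version https://git-lfs.github.com/spec/v1
oid sha256:59d11b4b218eb0ce47c768fd742fe0eac19423b2751ee59b8c9315dbcb0d767f
size 267863328
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

BYTES_PER_FLOAT32 = 4  # assumption: every tensor is stored as float32

size_ratio = new["size"] / old["size"]
old_params_estimate = old["size"] // BYTES_PER_FLOAT32
new_params_estimate = new["size"] // BYTES_PER_FLOAT32

print(f"new/old size ratio: {size_ratio:.3f}")
print(f"approx. parameters: {old_params_estimate:,} -> {new_params_estimate:,}")
```

On these numbers the new checkpoint is roughly 61% of the old one, which fits the drop from 12 to 6 transformer layers shown in config.json.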

special_tokens_map.json ADDED

@@ -0,0 +1,7 @@
+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

@@ -0,0 +1,56 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "DistilBertTokenizer",
+  "unk_token": "[UNK]"
+}
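
The special-token IDs and `model_max_length` in tokenizer_config.json determine how a single input sequence is framed before it reaches the classifier. The sketch below illustrates the BERT-style `[CLS] ... [SEP]` layout with optional padding. `wrap_single_sequence` is a hypothetical helper and the example word-piece IDs are placeholders; in practice the tokenizer itself performs this step.

```python
# IDs copied from "added_tokens_decoder" in tokenizer_config.json.
PAD_ID, UNK_ID, CLS_ID, SEP_ID, MASK_ID = 0, 100, 101, 102, 103
MODEL_MAX_LENGTH = 512  # "model_max_length" in tokenizer_config.json


def wrap_single_sequence(token_ids, pad_to=None):
    """Frame word-piece IDs as [CLS] ids [SEP], truncating and optionally padding.

    Returns (input_ids, attention_mask); padding positions get mask 0.
    """
    # Leave room for the two special tokens when truncating.
    body = list(token_ids)[: MODEL_MAX_LENGTH - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and pad_to > len(input_ids):
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding
    return input_ids, attention_mask


# Placeholder IDs standing in for a tokenized query (hypothetical values).
example_ids = [7592, 2088]
input_ids, attention_mask = wrap_single_sequence(example_ids, pad_to=6)
print(input_ids)       # [101, 7592, 2088, 102, 0, 0]
print(attention_mask)  # [1, 1, 1, 1, 0, 0]
```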

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:15a7e01d4074f6809f1635a188f0cadcdf101b0adc074dbfcb5be2522d7b6202
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:92ebce3e0ff3e35ffad1ef85eee74972e77cc6508e8a5636b61e7344265ba0fe
 size 5304

vocab.txt ADDED

The diff for this file is too large to render. See raw diff