Commit db7370e (verified) · 4nkh · 1 parent: 841c708

Upload README.md with huggingface_hub

  1. README.md +32 -52
README.md CHANGED
@@ -1,63 +1,43 @@
- ---
- library_name: transformers
- license: apache-2.0
- base_model: bert-base-uncased
- tags:
- - generated_from_trainer
- model-index:
- - name: theme_model
- results: []
- datasets:
- - 4nkh/theme_data
- ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # theme_model

- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1822
- - Micro/precision: 1.0
- - Micro/recall: 1.0
- - Micro/f1: 1.0
- - Macro/precision: 1.0
- - Macro/recall: 1.0
- - Macro/f1: 1.0

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 16
- - seed: 42
- - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - num_epochs: 5
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.57.3
- - Pytorch 2.8.0
- - Datasets 4.4.2
- - Tokenizers 0.22.2
 
+ # Theme classification model (multi-label)

+ This repository contains a fine-tuned BERT model for classifying short texts into community-oriented themes. The model was trained locally and pushed to the Hugging Face Hub.
 
+ Model details

+ - Model architecture: bert-base-uncased (fine-tuned)
+ - Problem type: multi-label classification
+ - Labels: `mentorship`, `entrepreneurship`, `startup success`
+ - Training data: `train_theme.jsonl` (included)
+ - Final evaluation (example run):
+   - eval_loss: 0.1822
+   - eval_micro/f1: 1.0
+   - eval_macro/f1: 1.0
 
+ Usage

+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch

+ repo = "4nkh/theme_model"
+ tokenizer = AutoTokenizer.from_pretrained(repo)
+ model = AutoModelForSequenceClassification.from_pretrained(repo)

+ texts = ["Our co-op paired first-time founders with veteran shop owners to troubleshoot setbacks."]
+ inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
+ with torch.no_grad():
+     outputs = model(**inputs)
+ logits = outputs.logits
+ probs = torch.sigmoid(logits)
+ preds = (probs >= 0.5).int()
+ print('probs', probs.numpy(), 'preds', preds.numpy())
+ ```
 
+ Notes

+ - This model uses a threshold of 0.5 for multi-label predictions. Adjust thresholds per-class as needed.
+ - If you want to re-train or fine-tune further, see `train_theme_model.py` in this folder.
 
+ License

+ Specify your license here (e.g., Apache-2.0) or remove this section if you prefer a different license.
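
The Notes entry in the new README suggests adjusting thresholds per class. A minimal sketch of that idea follows; it is not part of this commit. The label order, the threshold values, and the `decode` helper are assumptions made for illustration, so check `model.config.id2label` for the actual index-to-label mapping and tune thresholds on held-out data. The function takes one row of raw logits, for example `logits[0].tolist()` from the usage snippet.

```python
import math

# Assumed label order; verify against model.config.id2label before relying on it.
LABELS = ["mentorship", "entrepreneurship", "startup success"]

# Hypothetical per-class thresholds; tune these on a validation set.
THRESHOLDS = {"mentorship": 0.5, "entrepreneurship": 0.4, "startup success": 0.6}


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits_row, labels=LABELS, thresholds=THRESHOLDS, default=0.5):
    """Return the labels whose probability meets that label's threshold.

    Labels missing from `thresholds` fall back to `default`.
    """
    probs = [sigmoid(z) for z in logits_row]
    return [
        label
        for label, p in zip(labels, probs)
        if p >= thresholds.get(label, default)
    ]


# Made-up logits for a single text, one value per label.
example_logits = [2.0, -0.3, 0.1]
print(decode(example_logits))
```

With the thresholds above, a label can be selected below 0.5 probability (`entrepreneurship`) or rejected above it (`startup success`), which is the point of tuning each class separately.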