b4c0n committed · Commit 776bec5 · verified · 1 Parent(s): 65e4f6e

Update README.md

Files changed (1): README.md (+48 -34)
@@ -2,7 +2,6 @@
 language:
 - ja
 license: apache-2.0
-base_model: cl-tohoku/bert-base-japanese-v3
 library_name: transformers
 pipeline_tag: text-classification
 tags:
@@ -54,7 +53,7 @@ Japanese toxicity detection model specialized for Japanese language
 
 ### モデル概要
 
-日本語テキストを有害/非有害に分類するモデルです。日本語特有の表現やニュアンスに最適化されています。
+日本語テキストを有害/非有害に分類するモデルです。このモデルは**tohoku-nlp/bert-base-japanese-v3**をベースに、日本語の有害表現検出タスクでファインチューニングされています。
 
 ### 学習データ
 
@@ -68,27 +67,38 @@ Japanese toxicity detection model specialized for Japanese language
 
 ### モデル詳細
 
-- **ベースモデル**: `cl-tohoku/bert-base-japanese-v3`
+- **ベースモデル**: tohoku-nlp/bert-base-japanese-v3
 - **タスク**: 二値分類(有害/非有害)
-- **学習手法**: 連続値ラベル学習(0.0〜1.0)+ BCEWithLogitsLoss
-- **特徴**: 改善された学習手法による日本語表現の最適化
+- **学習手法**: 連続値ラベル学習(0.0〜1.0)+ MSE Loss
+- **訓練データ**: 1,899サンプル(訓練: 1,614 / 検証: 285)
+- **エポック数**: 5
+- **学習率**: 2e-5(線形減衰)
+- **特徴**: ハードネガティブサンプリングによる日本語表現の最適化
 
-### 使用例
-
+### 性能
+
+検証データセットでの評価結果:
+
+- **Accuracy**: 86.32%
+- **F1 Score**: 70.68%
+- **Precision**: 72.31%
+- **Recall**: 69.12%
+
+### 使用例
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 import torch
 
-model_name = "b4c0n/KAi-toxicity-filter"
+model_name = "b4c0n/KAi-Toxicity-Filter"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForSequenceClassification.from_pretrained(model_name)
 
 text = "終わってる暴言"
-inputs = tokenizer(text, return_tensors="pt")
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
 outputs = model(**inputs)
 
-toxic_logit = outputs.logits[0][1].item()
-toxic_prob = torch.sigmoid(torch.tensor(toxic_logit)).item()
+probs = torch.softmax(outputs.logits, dim=1)
+toxic_prob = probs[0][1].item()
 
 print(f"有害確率: {toxic_prob:.2%}")
 ```
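
One consistency note on the metrics block introduced in this hunk (an editorial check, not content from the commit): F1 should be the harmonic mean of precision and recall, and the reported values agree.

```python
# F1 is the harmonic mean of precision and recall.
precision, recall = 0.7231, 0.6912
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.2%}")  # ~70.68%, matching the reported F1 Score
```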
@@ -104,9 +114,10 @@ KAi (かい鯖グループAI) における日本語テキストの有害コン
 
 ### 制限事項
 
-- 単文レベルの分類(文脈考慮なし)
-- 誤検出(偽陽性/偽陰性)の可能性
-- 文化的・地域的文脈により判定が変わる可能性
+- 短い口語表現に特化しており、長文や文脈依存の有害性検出には限界があります
+- 誤検出(偽陽性/偽陰性)の可能性があります
+- 文化的・地域的文脈により判定が変わる可能性があります
+- 訓練データに含まれない新しいタイプの有害表現は検出できない場合があります
 - 人間のレビューなしの自動検閲には適していません
 
 ### 倫理的配慮
@@ -118,10 +129,6 @@ KAi (かい鯖グループAI) における日本語テキストの有害コン
 - 定期的な人間によるレビューを推奨します
 - 自動フィルタリング実装時は表現の自由を考慮してください
 
-### パフォーマンス
-
-日本語の有害表現検出タスクにおいて高いパフォーマンスを発揮します。
-
 ### ライセンス
 
 Apache 2.0
@@ -136,7 +143,7 @@ Apache 2.0
 
 ### Model Description
 
-This model classifies Japanese text as toxic or non-toxic, specifically optimized for Japanese language nuances and expressions.
+This model classifies Japanese text as toxic or non-toxic. It is fine-tuned from **tohoku-nlp/bert-base-japanese-v3** for Japanese toxicity detection tasks.
 
 ### Training Data
 
@@ -150,27 +157,38 @@ This model was trained on:
 
 ### Model Details
 
-- **Base Model**: `cl-tohoku/bert-base-japanese-v3`
+- **Base Model**: tohoku-nlp/bert-base-japanese-v3
 - **Task**: Binary Text Classification (toxic/not-toxic)
-- **Training**: Continuous label learning (0.0-1.0) with BCEWithLogitsLoss
-- **Special Feature**: Optimized for Japanese language with improved training techniques
+- **Training Data**: 1,899 samples (train: 1,614 / validation: 285)
+- **Epochs**: 5
+- **Learning Rate**: 2e-5 with linear decay
+- **Training**: Continuous label learning (0.0-1.0) with MSE Loss
+- **Special Feature**: Optimized for Japanese language with hard negative sampling
 
-### Usage
-
+### Performance
+
+Evaluation results on the validation dataset:
+
+- **Accuracy**: 86.32%
+- **F1 Score**: 70.68%
+- **Precision**: 72.31%
+- **Recall**: 69.12%
+
+### Usage
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 import torch
 
-model_name = "your-username/KAi-toxicity-filter"
+model_name = "b4c0n/KAi-Toxicity-Filter"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForSequenceClassification.from_pretrained(model_name)
 
 text = "toxic expression"
-inputs = tokenizer(text, return_tensors="pt")
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
 outputs = model(**inputs)
 
-toxic_logit = outputs.logits[0][1].item()
-toxic_prob = torch.sigmoid(torch.tensor(toxic_logit)).item()
+probs = torch.softmax(outputs.logits, dim=1)
+toxic_prob = probs[0][1].item()
 
 print(f"Toxic probability: {toxic_prob:.2%}")
 ```
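
The snippet above scores one string at a time. For filtering workloads a batched variant is usually preferable; the sketch below is illustrative rather than part of the committed README, and assumes the same two-logit head and checkpoint name shown in the diff.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "b4c0n/KAi-Toxicity-Filter"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

texts = ["こんにちは!", "終わってる暴言"]  # benign and toxic samples
# Pad to a common length so the batch forms a single tensor.
inputs = tokenizer(texts, return_tensors="pt", padding=True,
                   truncation=True, max_length=512)

with torch.no_grad():  # inference only; skip gradient bookkeeping
    logits = model(**inputs).logits

toxic_probs = torch.softmax(logits, dim=1)[:, 1]
for text, p in zip(texts, toxic_probs):
    print(f"{p.item():.2%}  {text}")
```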
@@ -186,9 +204,10 @@ This model was developed for the KAi (KaisabaGroupAI) to detect and filter harmf
 
 ### Limitations
 
-- Single sentence classification (no context consideration)
+- Optimized for short colloquial expressions; limited for long texts or context-dependent toxicity
 - May have false positives/negatives
 - Cultural and regional context may affect predictions
+- May not detect new types of toxic expressions not present in the training data
 - Not designed for automatic censorship without human review
 
 ### Ethical Considerations
@@ -200,23 +219,18 @@ This model was developed for the KAi (KaisabaGroupAI) to detect and filter harmf
 - Regular human review is recommended
 - Consider freedom of expression when implementing automated filtering
 
-### Performance
-
-The model shows strong performance on Japanese toxicity detection tasks.
-
 ### License
 
 Apache 2.0
 
 ### Citation
-
 ```bibtex
 @misc{kai-toxicity-filter,
-  author = {Your Name},
+  author = {b4c0n},
   title = {KAi Toxicity Filter: Japanese Toxicity Detection Model},
   year = {2025},
   publisher = {HuggingFace},
-  howpublished = {\url{https://huggingface.co/your-username/KAi-toxicity-filter}}
+  howpublished = {\url{https://huggingface.co/b4c0n/KAi-Toxicity-Filter}}
 }
 ```
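
The updated Model Details describe "continuous label learning (0.0-1.0) with MSE Loss", while the usage example reads a two-logit head. The commit does not include the training code, so the loss below is only one plausible reconstruction: it regresses the softmax toxic probability onto a soft label. Treat the formulation, names, and toy values as assumptions.

```python
import torch
import torch.nn.functional as F

def soft_label_mse_loss(logits: torch.Tensor, soft_labels: torch.Tensor) -> torch.Tensor:
    """MSE between the model's toxic probability and a continuous label in [0, 1].

    logits: (batch, 2) classification-head output
    soft_labels: (batch,) annotator scores between 0.0 and 1.0
    """
    toxic_prob = torch.softmax(logits, dim=1)[:, 1]
    return F.mse_loss(toxic_prob, soft_labels)

# Toy check: confident predictions near their soft labels give a small loss.
logits = torch.tensor([[-2.0, 2.0], [1.5, -1.5]])
labels = torch.tensor([0.9, 0.1])
print(soft_label_mse_loss(logits, labels))
```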
 
 
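Both Limitations sections warn against fully automatic censorship. In line with that guidance, a deployment would typically route high-scoring text to human review rather than delete it outright; this is a minimal sketch with a made-up threshold, not an API from the model card.

```python
# Illustrative triage policy: flag, don't auto-delete.
REVIEW_THRESHOLD = 0.8  # assumed value; tune on your own validation data

def triage(text: str, toxic_prob: float) -> str:
    """Return an action for a message given its predicted toxic probability."""
    if toxic_prob >= REVIEW_THRESHOLD:
        return f"hold for human review ({toxic_prob:.0%}): {text!r}"
    return "allow"

print(triage("終わってる暴言", 0.93))
print(triage("こんにちは", 0.02))
```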