hiroki-rad/bert-base-classification-ft

@@ -1,6 +1,16 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
@@ -17,13 +27,13 @@ tags: []
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
@@ -39,9 +49,60 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 ### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
 ### Downstream Use [optional]
@@ -77,7 +138,7 @@ Use the code below to get started with the model.
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 [More Information Needed]

 ---
 library_name: transformers
+tags:
+- code
+datasets:
+- elyza/ELYZA-tasks-100
+language:
+- ja
+metrics:
+- accuracy
+base_model:
+- tohoku-nlp/bert-base-japanese-v3
+pipeline_tag: text-classification
 ---
 # Model Card for Model ID
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** [Hiroki Yanagisawa]
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
+- **Model type:** [BERT]
+- **Language(s) (NLP):** [Japanese]
 - **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [tohoku-nlp/bert-base-japanese-v3]
 ### Model Sources [optional]
 ### Direct Use
+from tqdm import tqdm
+from transformers import BatchEncoding, pipeline
+The model was trained with the following label2id mapping. Use this same mapping at inference time.
+label2id = {'Task_Solution': 0,
+             'Creative_Generation': 1,
+             'Knowledge_Explanation': 2,
+             'Analytical_Reasoning': 3,
+             'Information_Extraction': 4,
+             'Step_by_Step_Calculation': 5,
+             'Role_Play_Response': 6,
+             'Opinion_Perspective': 7}
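+def preprocess_text_classification(examples: dict[str, list]) -> BatchEncoding:
+    """Tokenize a batch of examples and convert label names to IDs."""
+    # `tokenizer` must be loaded beforehand, e.g. with AutoTokenizer.from_pretrained
+    encoded_examples = tokenizer(
+        examples["questions"],  # a list of strings, since examples arrive in batches
+        max_length=512,
+        padding=True,
+        truncation=True,
+        return_tensors=None  # use None for batched processing
+    )
+    # Convert the batch of label names to integer IDs
+    encoded_examples["labels"] = [label2id[label] for label in examples["labels"]]
+    return encoded_examples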
+## Dataset to use
+test_data = test_data.to_pandas()
+test_data["labels"] = test_data["labels"].apply(lambda x: label2id[x])
+test_data
+model_name = "hiroki-rad/bert-base-classification-ft"
+classify_pipe = pipeline(model=model_name, device="cuda:0")
+# Invert the fixed label2id mapping above so that IDs map back to label names
+id2label = {label_id: label for label, label_id in label2id.items()}
+results: list[dict[str, int | float | str]] = []
+for i, example in tqdm(enumerate(test_data.itertuples())):
+    # Get the model's prediction
+    model_prediction = classify_pipe(example.questions)[0]
+    # Convert the ground-truth label ID to its label name
+    true_label = id2label[example.labels]
+    results.append(
+        {
+            "example_id": i,
+            "pred_prob": model_prediction["score"],
+            "pred_label": model_prediction["label"],
+            "true_label": true_label,
+        }
+    )
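The card lists accuracy as its metric, and the loop above collects one record per test example. As a minimal sketch, assuming records with the same `pred_label` and `true_label` keys, accuracy could be computed overall and per label as follows. The sample records and the helper names `accuracy` and `per_label_accuracy` are hypothetical stand-ins, not part of the original card; in practice the real `results` list would be used instead.

```python
from collections import Counter

# Hypothetical sample in the same shape as the `results` list built above;
# replace it with the real `results` when evaluating the model.
results = [
    {"example_id": 0, "pred_prob": 0.91, "pred_label": "Task_Solution", "true_label": "Task_Solution"},
    {"example_id": 1, "pred_prob": 0.55, "pred_label": "Creative_Generation", "true_label": "Opinion_Perspective"},
    {"example_id": 2, "pred_prob": 0.87, "pred_label": "Opinion_Perspective", "true_label": "Opinion_Perspective"},
    {"example_id": 3, "pred_prob": 0.78, "pred_label": "Task_Solution", "true_label": "Task_Solution"},
]


def accuracy(rows):
    """Fraction of rows whose predicted label equals the true label."""
    if not rows:
        return 0.0
    return sum(r["pred_label"] == r["true_label"] for r in rows) / len(rows)


def per_label_accuracy(rows):
    """Accuracy computed separately for each true label."""
    total = Counter(r["true_label"] for r in rows)
    correct = Counter(r["true_label"] for r in rows if r["pred_label"] == r["true_label"])
    return {label: correct[label] / count for label, count in total.items()}


overall = accuracy(results)
by_label = per_label_accuracy(results)
print(f"accuracy: {overall:.3f}")
for label, acc in sorted(by_label.items()):
    print(f"{label}: {acc:.3f}")
```

The per-label breakdown is useful here because the eight classes may be unevenly represented in a small test set, so a single overall number can hide weak classes.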
 ### Downstream Use [optional]
 ### Training Data
+[elyza/ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100)
 [More Information Needed]