---
library_name: transformers
license: apache-2.0
base_model: hfl/chinese-roberta-wwm-ext
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: music_ent_classification
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# music_ent_classification

This model is a fine-tuned version of [hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext). The training dataset is not documented in this card.
After the final (fifth) training epoch, it achieves the following results on the evaluation set:
- Loss: 0.1230
- Accuracy: 0.9662

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 128
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 5.0
- mixed_precision_training: Native AMP

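
The total batch sizes listed above follow from the per-device sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (none is listed in this card):

```shell
# Per-device values and device count as listed in the hyperparameters above.
per_device_train_batch_size=32
per_device_eval_batch_size=8
num_devices=4

# In data-parallel training each device processes its own batch per step,
# so the totals are the per-device sizes multiplied by the device count.
# Assumption: gradient accumulation is 1 (it is not listed in this card).
total_train_batch_size=$(( per_device_train_batch_size * num_devices ))
total_eval_batch_size=$(( per_device_eval_batch_size * num_devices ))

echo "total_train_batch_size=${total_train_batch_size}"  # prints total_train_batch_size=128
echo "total_eval_batch_size=${total_eval_batch_size}"    # prints total_eval_batch_size=32
```
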
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.4408        | 1.0   | 54   | 0.1763          | 0.9459   |
| 0.1407        | 2.0   | 108  | 0.1221          | 0.9628   |
| 0.0762        | 3.0   | 162  | 0.1123          | 0.9640   |
| 0.0563        | 4.0   | 216  | 0.1226          | 0.9718   |
| 0.0423        | 5.0   | 270  | 0.1230          | 0.9662   |

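
The Step column can be related to the batch size with a little arithmetic. The sketch below is an estimate, not a documented figure: it assumes a total train batch size of 128 (32 per device on 4 devices), no gradient accumulation, and that the last partial batch of each epoch is kept rather than dropped.

```shell
steps_per_epoch=54          # from the Step column: 54 optimizer steps per epoch
num_epochs=5
total_train_batch_size=128  # 32 per device x 4 devices

total_steps=$(( steps_per_epoch * num_epochs ))

# Assumption: the last batch is kept, so 54 steps per epoch means the training
# set fills more than 53 total batches and at most 54.
min_examples=$(( (steps_per_epoch - 1) * total_train_batch_size + 1 ))
max_examples=$(( steps_per_epoch * total_train_batch_size ))

echo "total_steps=${total_steps}"                                     # prints total_steps=270
echo "training examples: between ${min_examples} and ${max_examples}" # prints 6785 and 6912
```

Under those assumptions the training set would hold roughly 6,800 to 6,900 examples; the card itself does not state the dataset size.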
### Framework versions

- Transformers 4.57.5
- PyTorch 2.6.0+cu124
- Datasets 2.19.0
- Tokenizers 0.22.2

### Training

```
export WANDB_MODE=disabled           # disable the interactive W&B login
export CUDA_VISIBLE_DEVICES=0,1,2,3  # make sure all 4 V100 GPUs are visible

# Variable definitions
model="hfl/chinese-roberta-wwm-ext"
transformers_root="transformers"     # path to a local checkout of the transformers repository

output_dir="./models/music_ent_classification"
mkdir -p "${output_dir}"

# Launch with torchrun, one process per GPU
torchrun --nproc_per_node=4 \
  "${transformers_root}/examples/pytorch/text-classification/run_classification.py" \
  --model_name_or_path "${model}" \
  --train_file "./data/*.train.json" \
  --validation_file "./data/*.test.json" \
  --trust_remote_code True \
  --do_train \
  --do_eval \
  --shuffle_train_dataset \
  --metric_name accuracy \
  --text_column_name sentence1 \
  --label_column_name label \
  --max_seq_length 256 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 5 \
  --logging_steps 50 \
  --save_strategy epoch \
  --eval_strategy epoch \
  --fp16 True \
  --output_dir "${output_dir}" \
  --overwrite_output_dir
```

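
The training command reads JSON files with a `sentence1` text column and a `label` column. The sketch below writes a tiny file in that shape, assuming one JSON object per line (JSON Lines). The texts, the label names `music` and `other`, and the file name `sample.train.json` are hypothetical; the real label set is not documented in this card.

```shell
# Hypothetical example: the texts and label names below are invented for
# illustration and do not come from the actual training data.
data_dir="$(mktemp -d)"
cat > "${data_dir}/sample.train.json" <<'EOF'
{"sentence1": "播放周杰伦的七里香", "label": "music"}
{"sentence1": "明天北京天气怎么样", "label": "other"}
EOF

# Each line is one JSON object carrying both columns used by the command:
# --text_column_name sentence1 and --label_column_name label.
num_rows=$(grep -c '"sentence1"' "${data_dir}/sample.train.json")
echo "wrote ${num_rows} rows to ${data_dir}/sample.train.json"
```
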
### Inference

```
# Expected to be set by the caller:
#   CUDA_DEVICES  comma-separated GPU ids, e.g. 0,1,2,3
#   MODEL_PATH    fine-tuned model directory, e.g. the training output_dir
#   TRAIN_DATA    training file(s); also reused as the validation file below
#   INPUT_FILE    file with the texts to classify
#   OUTPUT_DIR    directory where results are written
: "${MODEL_PATH:?}" "${TRAIN_DATA:?}" "${INPUT_FILE:?}" "${OUTPUT_DIR:?}"

# Launch distributed inference, one process per listed GPU.
# Note: max_seq_length is 128 here, whereas training used 256.
torchrun --nproc_per_node=$(echo "${CUDA_DEVICES}" | tr ',' '\n' | wc -l) \
  transformers/examples/pytorch/text-classification/run_classification.py \
  --model_name_or_path "${MODEL_PATH}" \
  --train_file "${TRAIN_DATA}" \
  --validation_file "${TRAIN_DATA}" \
  --test_file "${INPUT_FILE}" \
  --text_column_name "sentence1" \
  --label_column_name "label" \
  --do_predict \
  --max_seq_length 128 \
  --per_device_eval_batch_size 256 \
  --output_dir "${OUTPUT_DIR}" \
  --fp16 True \
  --trust_remote_code True \
  --overwrite_output_dir

if [ $? -eq 0 ]; then
  echo "✅ [Infer] Inference complete."
else
  echo "❌ [Infer] Inference failed."
  exit 1
fi
```

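
After a prediction run, the results can be summarized from the shell. This sketch assumes the script leaves a tab-separated `predict_results.txt` with an `index<TAB>prediction` header in the output directory. That file name and layout are an assumption here, not something this card states, so check the output directory first. The sample file and label names are hypothetical stand-ins so the snippet runs on its own.

```shell
# Hypothetical stand-in for the prediction output (assumed layout:
# header line, then one "index<TAB>predicted label" row per input).
demo_dir="$(mktemp -d)"
pred_file="${demo_dir}/predict_results.txt"
printf 'index\tprediction\n0\tmusic\n1\tother\n2\tmusic\n' > "${pred_file}"

# Skip the header line, then count how often each label was predicted.
label_counts=$(awk -F'\t' 'NR > 1 { n[$2]++ } END { for (k in n) print k, n[k] }' "${pred_file}" | sort)
echo "${label_counts}"
```

For a real run, point `pred_file` at the prediction file inside `${OUTPUT_DIR}` instead of the demo file.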