Chengfengke/herbert

Chengfengke committed on Dec 4, 2024

Commit 21d2722 · verified · 1 parent: 614183c

Update README.md

Files changed (1)

README.md +34 -3

@@ -1,3 +1,34 @@
----
-license: mit
----

+---
+license: apache-2.0
+base_model:
+- google-bert/bert-base-chinese
+---
+# Herberta: Pretrained Language Model for Herbal Medicine
+**Herberta** is a pretrained language model for herbal medicine research, built on the `chinese-roberta-wwm-ext-large` model. It has been fine-tuned on domain-specific text from 675 ancient books and 32 Traditional Chinese Medicine (TCM) textbooks, and it is designed to support a range of TCM-related NLP tasks.
+---
+## Introduction
+This model is optimized for TCM-related tasks, including but not limited to:
+- Herbal formula encoding
+- Domain-specific word embedding
+- Classification, labeling, and sequence prediction tasks in TCM research
+Herberta combines modern pretraining techniques with TCM domain knowledge and is intended for TCM-related text processing tasks.
+---
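The values listed under Model Config can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the commit: it copies the config values from the README, derives the per-head dimension and embedding table sizes, and compares the shape against the standard BERT base and large layouts. The names `summarize` and `KNOWN_SHAPES` are hypothetical helpers introduced only for illustration.

```python
import json

# Values copied from the README's "Model Config" excerpt.
CONFIG_JSON = """
{
  "hidden_size": 1024,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "torch_dtype": "float32",
  "vocab_size": 21128
}
"""

# Standard BERT layouts: (hidden_size, num_hidden_layers, num_attention_heads).
# Hypothetical lookup table, used here only to classify the config.
KNOWN_SHAPES = {
    "base": (768, 12, 12),
    "large": (1024, 24, 16),
}


def summarize(config: dict) -> dict:
    """Derive a few quantities implied by a BERT-style config."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    shape = (hidden, config["num_hidden_layers"], heads)
    size_class = next(
        (name for name, known in KNOWN_SHAPES.items() if known == shape),
        "unknown",
    )
    return {
        "size_class": size_class,
        "head_dim": hidden // heads,
        "token_embedding_params": config["vocab_size"] * hidden,
        "position_embedding_params": config["max_position_embeddings"] * hidden,
    }


summary = summarize(json.loads(CONFIG_JSON))
print(summary)
```

Under these assumptions the config classifies as the large layout with 64-dimensional attention heads. That agrees with the `chinese-roberta-wwm-ext-large` starting point named in the description, whereas `google-bert/bert-base-chinese`, listed under `base_model`, uses the base layout, so the metadata field may be worth double-checking.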
+## Model Config
+```json
+{
+  "hidden_size": 1024,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "torch_dtype": "float32",
+  "vocab_size": 21128
+}