Indah1 committed
Commit 21b24f4 (verified) · 1 Parent(s): 12f31ef

Update README.md

Files changed (1)
  1. README.md +8 -19
README.md CHANGED
@@ -8,12 +8,12 @@ base_model: BioMistral/BioMistral-7B
</p>

# BioChat Model
- Source Paper: [BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains](https://arxiv.org/abs/2402.10373)
-
- BioChat is a language model fine-tuned using the ChatDoctor dataset from [ChatDoctor-5k](https://huggingface.co/datasets/LinhDuong/chatdoctor-5k). Specifically designed for medical conversations, BioChat enables users to engage in interactive discussions with a virtual doctor. Whether you are seeking advice about symptoms you are experiencing, exploring possible health conditions, or looking for general medical insights, BioChat is built to assist in a reliable and informative manner.
-
- In addition to its ability to act as a virtual medical consultant, we are still in the early stages of exploring the generation capabilities and limitations of this model. It is important to emphasize that its text generation features are intended solely for research purposes and are not yet suitable for production use. By releasing this model, we aim to drive advancements in biomedical NLP applications and promote best practices for the responsible training and deployment of domain-specific language models. Ensuring reliability, accuracy, and explainability remains a top priority for us.
- - **Finetuned from model [optional]:** [BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B).
+ - **Source Paper:** [BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains](https://arxiv.org/abs/2402.10373)
+ - **BioChat** is a language model fine-tuned using the ChatDoctor dataset from [ChatDoctor-5k](https://huggingface.co/datasets/LinhDuong/chatdoctor-5k). Specifically designed for medical conversations, BioChat enables users to engage in interactive discussions with a virtual doctor. Whether you are seeking advice about symptoms you are experiencing, exploring possible health conditions, or looking for general medical insights, BioChat is built to assist in a reliable and informative manner.
+ - **NOTE:** We are still in the early stages of exploring the generation capabilities and limitations of this model. Its text generation features are intended solely for research purposes and are not yet suitable for production use.
+ - **Finetuned from model:** [BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B).

# Using BioChat

@@ -36,40 +36,29 @@ model = PeftModel.from_pretrained(model, "Indah1/BioChat10")

# Fine-Tuning Data

-
- [More Information Needed]
+ The fine-tuning data used for BioChat is derived from the [ChatDoctor-5k](https://huggingface.co/datasets/LinhDuong/chatdoctor-5k) dataset. This dataset contains a collection of medical conversations tailored to simulate doctor-patient interactions, making it an ideal source for training a medical conversational model. The dataset was carefully curated to ensure relevance and diversity in medical topics.
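A minimal sketch of pulling the fine-tuning data named above; the `train` split name is an assumption about the Hub dataset's layout:

```python
# Minimal sketch: load the ChatDoctor-5k conversations used for fine-tuning.
# The split name "train" is an assumption about the Hub dataset layout.
from datasets import load_dataset

ds = load_dataset("LinhDuong/chatdoctor-5k", split="train")
print(len(ds))   # ~5k doctor-patient exchanges
print(ds[0])     # inspect one example's fields
```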
### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
#### Training Hyperparameters

| Hyperparameter | Value |
|:-------------------:|:----------------------------------:|
- | Rank | 16 |
+ | Weight Decay | 0.01 |
| Learning Rate | 2e-05 |
| Training Batch Size | 8 |
| Batch Size | 8 |
| Number of GPUs | 1 |
| Optimizer | AdamW_8Bit |
+ | Warmup Ratio | 0.03 |
| Scheduler | Cosine |
| Number of Epochs | 10 |

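As a reading aid for the table above, a hypothetical mapping onto Hugging Face `TrainingArguments`; the training script itself is not published, so `output_dir` and the exact 8-bit optimizer name are assumptions:

```python
# Hypothetical sketch: the table's hyperparameters expressed as
# TrainingArguments. Not the author's actual training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="biochat-finetune",   # assumed path, not from the README
    learning_rate=2e-5,              # Learning Rate
    per_device_train_batch_size=8,   # Training Batch Size
    num_train_epochs=10,             # Number of Epochs
    weight_decay=0.01,               # Weight Decay
    warmup_ratio=0.03,               # Warmup Ratio
    optim="adamw_bnb_8bit",          # AdamW_8Bit (assumed bitsandbytes variant)
    lr_scheduler_type="cosine",      # Cosine scheduler
)
```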
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-

## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
+ To determine the best model from fine-tuning, we used *perplexity* as the metric for selecting the most optimal checkpoint. We also plan to evaluate the model's behavior and responses with tools such as the *Word Embedding Association Test (WEAT)*. Its text generation features are intended solely for research purposes and are not yet suitable for production use. By releasing this model, we aim to drive advancements in biomedical NLP applications and contribute to best practices for the responsible development of domain-specific language models. Ensuring reliability, fairness, accuracy, and explainability remains a top priority for us.

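A minimal sketch of the perplexity computation described above, assuming a held-out text sample; this is the generic causal-LM recipe, not necessarily the author's exact script:

```python
# Minimal sketch: perplexity of a causal LM on one text sample.
import torch

def perplexity(model, tokenizer, text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels equal to input_ids, the model returns the mean
        # token-level cross-entropy; exp(loss) is the perplexity.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()
```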
### Testing Data, Factors & Metrics
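For the *WEAT* probe mentioned under Evaluation, a hypothetical association-score sketch over the model's input embeddings; the word lists and the choice of static input embeddings (rather than contextual states) are assumptions, not the author's protocol:

```python
# Hypothetical sketch: WEAT association s(w, A, B) computed from the
# model's input-embedding table. Word lists are placeholders.
import torch
import torch.nn.functional as F

def word_vec(model, tokenizer, word: str) -> torch.Tensor:
    # Mean input-embedding vector over the word's subword tokens.
    ids = tokenizer(word, add_special_tokens=False, return_tensors="pt").input_ids[0]
    return model.get_input_embeddings().weight[ids].detach().mean(dim=0)

def association(w: torch.Tensor, A: list, B: list) -> float:
    # WEAT association s(w, A, B): mean cosine similarity of w to the
    # attribute vectors in A minus mean cosine similarity to those in B.
    sim_a = torch.stack([F.cosine_similarity(w, a, dim=0) for a in A]).mean()
    sim_b = torch.stack([F.cosine_similarity(w, b, dim=0) for b in B]).mean()
    return (sim_a - sim_b).item()
```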