Update README.md
README.md
CHANGED
@@ -7,6 +7,69 @@ base_model:
- Qwen/Qwen3-4B-Instruct-2507
---

## Introduction


## Data


## Methodology


## Evaluation

| Model | mmlu_hs_stats | minerva_math | race | bert_prec | bert_recall | bert_f1 |
|--------------------------|---------------|--------------|------|-----------|-------------|---------|
| AP_Stat_Inference_Helper | 0.72 | 0.45 | 0.32 | 0.75 | 0.85 | 0.80 |
| Qwen | 0.72 | 0.45 | 0.32 | 0.75 | 0.85 | 0.80 |
| Llama | 0.30 | 0.29 | 0.38 | x.xx | x.xx | x.xx |
| Mistral | 0.46 | 0.09 | 0.38 | x.xx | x.xx | x.xx |

## Usage and Intended Use


## Prompt Format

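```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Stand-in: the base model from the card metadata; replace with this fine-tuned model's repo ID.
model_id = "Qwen/Qwen3-4B-Instruct-2507"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, dtype=torch.bfloat16, device_map="auto"  # older transformers releases: torch_dtype
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=500,
    do_sample=False,  # greedy decoding, so output is deterministic
)

formatted_prompt = "Q: YOUR QUESTION HERE \n\nA: "
text = pipe(formatted_prompt)
print(text[0]["generated_text"])
```
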
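The `Q: ... \n\nA: ` template can be wrapped in a pair of small helpers so that every question is formatted the same way. This is a minimal sketch, not part of the model's released code: `PROMPT_TEMPLATE`, `build_prompt`, and `extract_answer` are hypothetical names introduced here for illustration. It assumes the pipeline returns the prompt followed by the completion, which is the default behaviour of the `text-generation` pipeline.

```python
# Hypothetical helpers for the "Q: ... \n\nA: " prompt format described above.
PROMPT_TEMPLATE = "Q: {question} \n\nA: "


def build_prompt(question: str) -> str:
    """Insert a question into the Q/A template, trimming stray whitespace."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_answer(generated_text: str, prompt: str) -> str:
    """Return only the model's answer, dropping the echoed prompt if present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


if __name__ == "__main__":
    prompt = build_prompt("What does a p-value measure?")
    # In real use, `generated` would be pipe(prompt)[0]["generated_text"].
    generated = prompt + "1. State the hypotheses: ..."
    print(extract_answer(generated, prompt))
```

Keeping the template in one constant avoids small whitespace differences between prompts, which can matter for a model fine-tuned on a fixed format.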
## Expected Output Format
Q: Past experience is that when individuals are approached with a request to fill out and return a particular questionnaire in a provided stamped and addressed envelope, the response rate is 40%. An investigator believes that if the person distributing the questionnaire were stigmatized in some obvious way, potential respondents would feel sorry for the distributor and thus tend to respond at a rate higher than 40%. To test this theory, a distributor wore an eye patch. Of the 200 questionnaires distributed by this individual, 109 were returned. Does this provide evidence that the response rate in this situation is greater than the previous rate of 40%? State and test the appropriate hypotheses using a significance level of 0.05.

A: 1. State the hypotheses:
H0: p = 0.40 (the response rate is the same as the previous rate)
Ha: p > 0.40 (the response rate is greater than the previous rate)

2. Check the conditions:
- Randomness: The 200 questionnaires were randomly distributed to individuals. (Assuming the investigator randomly selected the individuals to distribute the questionnaire to.)
- Independence: The 200 individuals are independent of each other. (Assuming the individuals are randomly selected and not related to each other.)
- Normality: np0 = 200(0.40) = 80 and n(1 - p0) = 200(1 - 0.40) = 120. Both are greater than 10, so the sampling distribution of the sample proportion is approximately normal.

3. Calculate the test statistic:
- Sample proportion: p-hat = 109 / 200 = 0.545
- Standard error: SE = sqrt(p0(1 - p0) / n) = sqrt(0.40(1 - 0.40) / 200) = sqrt(0.24 / 200) = 0.0346
- Test statistic: z = (p-hat - p0) / SE = (0.545 - 0.40) / 0.0346 = 4.19

4. Determine the p-value:
Using a standard normal distribution table or calculator, find the probability of obtaining a z-score of 4.19 or greater. This is a very small probability (p-value < 0.0001).

5. Make a decision:
Since the p-value is less than the significance level of 0.05, we reject the null hypothesis.

6. Conclusion:
There is sufficient evidence at the 0.05 significance level to conclude that the response rate in this situation is greater than the previous rate of 40%. The investigator's theory that people would respond at a higher rate when the distributor was stigmatized appears to be supported by the data.

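The arithmetic in the worked answer can be checked independently of the model. The sketch below recomputes the one-proportion z-test from the numbers in the question (109 returns out of 200, null proportion 0.40) using only the Python standard library. The function name `one_proportion_z_test` is an illustrative choice, not something the model or its training data defines.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float):
    """Upper-tailed one-proportion z-test; returns (p_hat, se, z, p_value)."""
    p_hat = successes / n
    # The standard error uses the null proportion p0, as in the worked answer.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, se, z, p_value


if __name__ == "__main__":
    p_hat, se, z, p_value = one_proportion_z_test(109, 200, 0.40)
    print(f"p-hat = {p_hat:.3f}, SE = {se:.4f}, z = {z:.2f}, p-value = {p_value:.2e}")
```

A check like this is one way to validate the numeric steps in generated answers, since the model's formatting can be correct even when a calculation is not.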
## Limitations
The training dataset focuses solely on the inference procedures covered in the AP Statistics course. Fine-tuning did not improve the evaluation metrics; it did, however, improve the format of the model's answers.

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->