diff --git "a/math-ds-complete/index.html" "b/math-ds-complete/index.html"
--- "a/math-ds-complete/index.html"
+++ "b/math-ds-complete/index.html"
@@ -810,6 +810,90 @@
For skewed data (like income, house prices), always report the median along with the mean. If they're very different, your data has outliers or is skewed!
+Find the mean, median, and mode of: [12, 15, 12, 18, 20, 15, 12, 22]
+Calculate the Mean (Average)
+Sum = 12 + 15 + 12 + 18 + 20 + 15 + 12 + 22 = 126
+Count (n) = 8 values
+Mean = Sum ÷ n = 126 ÷ 8 = 15.75
+ Add all values together, then divide by how many values there are
+Find the Median (Middle Value)
+Sorted data: [12, 12, 12, 15, 15, 18, 20, 22]
+Even number of values (8), so average the middle two
+Middle positions: 4th and 5th values = 15 and 15
+Median = (15 + 15) ÷ 2 = 15
+ For even-sized datasets, average the two middle values
+Find the Mode (Most Frequent Value)
+Frequency count:
+ • 12 appears 3 times ← Most frequent!
+ • 15 appears 2 times
+ • 18, 20, 22 each appear 1 time
+Mode = 12
+ The mode is the value that appears most often
+The mean (15.75) is slightly higher than the median (15) because the larger values (20, 22) pull it up. The mode (12) is the lowest of the three measures because the most frequent value happens to sit at the lower end of the data.
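As a quick check, the whole calculation can be reproduced with Python's standard-library `statistics` module:

```python
import statistics

data = [12, 15, 12, 18, 20, 15, 12, 22]

mean = sum(data) / len(data)      # 126 / 8 = 15.75
median = statistics.median(data)  # sorts, then averages the two middle values
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```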
+Calculate the variance and standard deviation for the dataset: [4, 8, 6, 5, 3]
+Calculate the Mean
+Sum = 4 + 8 + 6 + 5 + 3 = 26
+Mean (x̄) = 26 ÷ 5 = 5.2
+ First, we need the mean to calculate deviations
+Find Deviations from Mean
+(4 - 5.2) = -1.2
+(8 - 5.2) = 2.8
+(6 - 5.2) = 0.8
+(5 - 5.2) = -0.2
+(3 - 5.2) = -2.2
+ Subtract the mean from each value
+Square Each Deviation
+(-1.2)² = 1.44
+(2.8)² = 7.84
+(0.8)² = 0.64
+(-0.2)² = 0.04
+(-2.2)² = 4.84
+ Squaring eliminates negative signs and emphasizes larger deviations
+Calculate Variance (sample)
+Sum of squared deviations = 1.44 + 7.84 + 0.64 + 0.04 + 4.84 = 14.8
+Divide by (n - 1) = 5 - 1 = 4
+s² = 14.8 ÷ 4 = 3.7
+ We use (n - 1) for sample variance (Bessel's correction)
+Calculate Standard Deviation
+s = √s² = √3.7 ≈ 1.92
+ Standard deviation is the square root of variance
+A standard deviation of 1.92 means most values fall within about 1.92 units of the mean (5.2). This indicates moderate spread in the data.
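The same sample variance and standard deviation come straight from the `statistics` module, which also divides by n - 1:

```python
import statistics

data = [4, 8, 6, 5, 3]

var = statistics.variance(data)  # sample variance: divides by n-1 (Bessel's correction)
sd = statistics.stdev(data)      # square root of the sample variance
```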
+Calculate and interpret skewness for dataset: [2, 3, 4, 5, 15]
+Calculate the Mean
+Sum = 2 + 3 + 4 + 5 + 15 = 29
+n = 5
+Mean (x̄) = 29/5 = 5.8
+ First, find the average of all values
+Calculate Standard Deviation
+Deviations from mean: (2-5.8), (3-5.8), (4-5.8), (5-5.8), (15-5.8) = -3.8, -2.8, -1.8, -0.8, 9.2
+Squared: 14.44, 7.84, 3.24, 0.64, 84.64
+Variance (sample) = (14.44 + 7.84 + 3.24 + 0.64 + 84.64)/4 = 110.8/4 = 27.7
+SD = √27.7 ≈ 5.26
+ We need standard deviation for the skewness formula
+Calculate Skewness
+Cubed deviations: (-3.8)³, (-2.8)³, (-1.8)³, (-0.8)³, (9.2)³ = -54.87, -21.95, -5.83, -0.51, 778.69
+Sum = 695.53
+Skewness = (695.53/5) / SD³ = 139.11 / 145.79 ≈ 0.95
+ Skewness formula uses cubed deviations divided by cubed standard deviation
+Interpret the Result
+Skewness = +0.95 (positive)
+Distribution is right-skewed
+The value 15 pulls the tail to the right
+Most data clustered on left, long tail on right
+ Positive skewness means tail extends to the right
+The positive skewness confirms that the outlier (15) creates a long right tail, pulling the mean (5.8) above the median (4).
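The skewness formula used above (mean cubed deviation divided by the cubed sample standard deviation) can be sketched directly; note that libraries offer slightly different skewness conventions, so this follows the worked example's version:

```python
import math

data = [2, 3, 4, 5, 15]
n = len(data)
mean = sum(data) / n  # 5.8

# sample standard deviation (divide by n-1)
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# skewness as in the worked example: (sum of cubed deviations / n) / SD^3
skew = (sum((x - mean) ** 3 for x in data) / n) / sd ** 3
```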
+Find covariance between X = [2, 4, 6, 8] and Y = [1, 3, 5, 7]
+Calculate the Means
+x̄ = (2 + 4 + 6 + 8) / 4 = 20 / 4 = 5
+ȳ = (1 + 3 + 5 + 7) / 4 = 16 / 4 = 4
+ Find the average of each variable
+Create Deviation Table
+| x | y | (x-x̄) | (y-ȳ) | (x-x̄)(y-ȳ) |
+|---|---|-------|-------|-------------|
+| 2 | 1 | -3 | -3 | 9 |
+| 4 | 3 | -1 | -1 | 1 |
+| 6 | 5 | 1 | 1 | 1 |
+| 8 | 7 | 3 | 3 | 9 |
+| Sum | | | | 20 |
+ Calculate deviations from means and their products
+Calculate Sample Covariance
+Cov(X,Y) = Σ(x-x̄)(y-ȳ) / (n-1)
+Cov(X,Y) = 20 / (4-1) = 20 / 3 ≈ 6.67
+ Use n-1 for sample covariance (Bessel's correction)
+Interpret the Result
+Cov(X,Y) = 6.67 > 0
+Positive covariance indicates:
+• X and Y tend to increase together
+• When X is above its mean, Y tends to be above its mean
+• When X is below its mean, Y tends to be below its mean
+ Positive covariance shows positive relationship
+The positive covariance confirms that X and Y have a positive linear relationship. As X increases by 2, Y also increases by 2, showing consistent movement together.
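The deviation-table arithmetic collapses into a few lines of Python:

```python
X = [2, 4, 6, 8]
Y = [1, 3, 5, 7]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n  # 5 and 4

# sample covariance: sum of deviation products divided by n-1
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)
```

On Python 3.10+, `statistics.covariance(X, Y)` returns the same sample covariance.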
+Study hours vs exam scores typically show r = 0.7 (strong positive). More study hours correlate with higher scores.
Calculate correlation coefficient for X = [2, 4, 6, 8] and Y = [1, 3, 5, 7]
+Use Covariance from Topic 11
+From previous calculation: Cov(X,Y) = 6.67
+x̄ = 5, ȳ = 4
+ We already calculated this in Topic 11
+Calculate Standard Deviation of X
+Deviations from mean: -3, -1, 1, 3
+Squared deviations: 9, 1, 1, 9
+Sum of squared deviations = 20
+Variance_x = 20 / (4-1) = 20/3 ≈ 6.67
+SD_x = √6.67 ≈ 2.58
+ Standard deviation measures spread of X values
+Calculate Standard Deviation of Y
+Deviations from mean: -3, -1, 1, 3
+Squared deviations: 9, 1, 1, 9
+Sum of squared deviations = 20
+Variance_y = 20 / (4-1) = 20/3 ≈ 6.67
+SD_y = √6.67 ≈ 2.58
+ Standard deviation measures spread of Y values
+Calculate Correlation Coefficient
+r = Cov(X,Y) / (SD_x × SD_y)
+r = 6.67 / (2.58 × 2.58)
+r = 6.67 / 6.67
+r = 1.00
+ Correlation standardizes covariance by dividing by both standard deviations
+Interpret the Result
+r = 1.00 (perfect positive correlation)
+This means:
+• X and Y have a perfect linear relationship
+• As X increases by 2, Y increases by 2 (exactly)
+• All points lie exactly on a straight line
+• The relationship is Y = X - 1
+ r = 1 indicates perfect positive linear correlation
+Check: if we plot these points, they form a perfect line. When X=2, Y=1; X=4, Y=3; X=6, Y=5; X=8, Y=7. Every point satisfies Y = X - 1, which is indeed perfectly linear! ✓
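Dividing the covariance by both standard deviations, as above, gives Pearson's r; the whole pipeline in Python:

```python
import math

X = [2, 4, 6, 8]
Y = [1, 3, 5, 7]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / (n - 1))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / (n - 1))

r = cov / (sd_x * sd_y)  # Pearson correlation coefficient
```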
+Does ice cream cause drowning? NO! The third variable is summer weather—more people swim in summer (more drownings) and eat ice cream in summer.
Study finds r = -0.75 between hours of TV watched and exam scores. Interpret this result and discuss causation.
+Analyze the Sign
+Negative correlation (r < 0)
+As one variable increases, the other decreases
+More TV → Lower scores (or vice versa)
+ The negative sign tells us the direction of the relationship
+Analyze the Strength
+|r| = |-0.75| = 0.75
+Interpretation scale:
+ • 0.0-0.3 = Weak
+ • 0.3-0.7 = Moderate
+ • 0.7-1.0 = Strong
+0.75 falls in the "Strong" category
+ The absolute value determines relationship strength
+State the Relationship
+Strong negative correlation
+Students who watch more TV tend to have lower exam scores
+Relationship is fairly consistent but not perfect
+ Combine sign and strength for complete interpretation
+Address Causation
+Correlation ≠ Causation!
+Possible explanations:
+ a) TV causes lower scores (less study time)
+ b) Lower-performing students watch more TV (compensating)
+ c) Third variable: stress causes both TV watching and poor performance
+Cannot determine causation from correlation alone
+ Correlation never proves causation - always consider alternatives
+Predict Using Correlation
+If we know TV hours, we can predict exam score
+But prediction ≠ causation
+r² = (-0.75)² ≈ 0.56, so 56% of the variance is explained
+ r² shows percentage of variance in one variable explained by the other
+While the correlation is strong, we must resist concluding causation. The relationship could be coincidental, reverse-causal, or due to confounding variables.
+In a class of 40 students: 25 like Math, 20 like Science, 10 like both. Find: a) P(Math OR Science), b) P(only Math), c) P(neither)
+Set Up the Information
+Total students: n = 40
+P(Math) = 25/40 = 0.625
+P(Science) = 20/40 = 0.5
+P(Math ∩ Science) = 10/40 = 0.25
+ Convert all counts to probabilities
+Find P(Math ∪ Science) using Addition Rule
+Formula: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
+P(Math ∪ Science) = 0.625 + 0.5 - 0.25 = 1.125 - 0.25 = 0.875
+ We subtract the intersection to avoid double-counting
+Find P(only Math)
+Only Math = Math AND NOT Science
+Students in only Math = 25 - 10 = 15
+P(only Math) = 15/40 = 0.375
+ Subtract those who like both from total Math students
+Find P(neither)
+Neither = NOT (Math OR Science)
+P(neither) = 1 - P(Math ∪ Science) = 1 - 0.875 = 0.125
+Or: 40 - 35 = 5 students, so 5/40 = 0.125 ✓
+ Use complement rule or count directly
+Check: 0.375 (only Math) + 0.25 (both) + 0.25 (only Science) + 0.125 (neither) = 1.0 ✓
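The addition rule and complement rule above translate directly into code:

```python
total = 40
n_math, n_science, n_both = 25, 20, 10

# addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = (n_math + n_science - n_both) / total
p_only_math = (n_math - n_both) / total
p_neither = 1 - p_union  # complement rule
```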
+Two dice are rolled. Let A = "first die shows 6" and B = "sum is 7". Are A and B independent?
+Find P(A)
+First die shows 6: one outcome out of 6
+P(A) = 1/6 ≈ 0.167
+ Probability the first die is 6
+Find P(B)
+Sum equals 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
+6 favorable outcomes out of 36 total
+P(B) = 6/36 = 1/6 ≈ 0.167
+ Count all ways to get sum of 7
+Find P(A ∩ B)
+First die is 6 AND sum is 7
+Only possibility: (6,1)
+P(A ∩ B) = 1/36 ≈ 0.028
+ Find where both events occur simultaneously
+Test Independence
+If independent: P(A ∩ B) = P(A) × P(B)
+P(A) × P(B) = (1/6) × (1/6) = 1/36
+P(A ∩ B) = 1/36
+1/36 = 1/36 ✓ EQUAL!
+ Compare the two probabilities to test independence
+Conclusion
+Events A and B ARE independent
+Knowing the first die is 6 doesn't change the probability of the sum being 7
+ When the product rule holds, events are independent
+We can also verify: P(B|A) = P(A∩B)/P(A) = (1/36)/(1/6) = 1/6 = P(B). Since P(B|A) = P(B), the events are independent.
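The product-rule test can be verified exactly by enumerating all 36 rolls with `fractions.Fraction` (no floating-point rounding):

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely outcomes

p_a = Fraction(sum(d1 == 6 for d1, d2 in rolls), len(rolls))
p_b = Fraction(sum(d1 + d2 == 7 for d1, d2 in rolls), len(rolls))
p_ab = Fraction(sum(d1 == 6 and d1 + d2 == 7 for d1, d2 in rolls), len(rolls))

independent = p_ab == p_a * p_b  # product-rule test for independence
```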
+A disease affects 1% of the population. A test is 99% accurate (detects 99% of sick people and correctly identifies 99% of healthy people). You test positive. What's the probability you actually have the disease?
+Define the Events and Given Information
+Let A = has disease
+Let B = tests positive
+P(A) = 0.01 (1% of population has disease)
+P(B|A) = 0.99 (99% true positive rate)
+P(B|A') = 0.01 (1% false positive rate)
+ Set up all known probabilities before applying Bayes' Theorem
+Calculate P(B) using Total Probability
+P(B) = P(B|A) × P(A) + P(B|A') × P(A')
+P(B) = (0.99 × 0.01) + (0.01 × 0.99)
+P(B) = 0.0099 + 0.0099 = 0.0198
+ Find the overall probability of testing positive
+Apply Bayes' Theorem
+P(A|B) = [P(B|A) × P(A)] / P(B)
+P(A|B) = (0.99 × 0.01) / 0.0198 = 0.0099 / 0.0198 = 0.5 = 50%
+ This is the posterior probability - what we want to find!
+This counter-intuitive result occurs because the disease is so rare (1%). Even with a 99% accurate test, there are many more false positives from the healthy 99% than true positives from the sick 1%. Base rates matter!
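The total-probability and Bayes steps above, written out in Python:

```python
p_disease = 0.01       # prior: 1% prevalence
p_pos_sick = 0.99      # true positive rate (sensitivity)
p_pos_healthy = 0.01   # false positive rate

# total probability of testing positive
p_pos = p_pos_sick * p_disease + p_pos_healthy * (1 - p_disease)

# Bayes' theorem: P(disease | positive test)
posterior = p_pos_sick * p_disease / p_pos
```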
+Continuous random variable X has uniform distribution on interval [0, 10]. a) Find the PDF f(x), b) Calculate P(3 ≤ X ≤ 7)
+Understand Uniform Distribution
+X is equally likely anywhere between 0 and 10
+For uniform on [a, b], the PDF is constant
+Total area under the curve must equal 1
+ Uniform means constant probability density across the interval
+Find PDF Height
+Interval length = b - a = 10 - 0 = 10
+For area = 1: height × width = 1
+height × 10 = 1, so height = 1/10 = 0.1
+Therefore: f(x) = 0.1 for 0 ≤ x ≤ 10, and 0 otherwise
+ The constant height must give total area of 1
+Calculate P(3 ≤ X ≤ 7)
+For continuous uniform: P(a ≤ X ≤ b) = (b - a) × height
+P(3 ≤ X ≤ 7) = (7 - 3) × 0.1 = 4 × 0.1 = 0.4
+ Probability is the area of the rectangle
+Visualize (Area Under Curve)
+Rectangle: width = 4, height = 0.1
+Area = 4 × 0.1 = 0.4
+This area represents the probability
+ The geometric area equals the probability
+P(0 ≤ X ≤ 10) = 10 × 0.1 = 1.0 ✓ (total probability = 1)
+For the uniform distribution from Topic 20 (X ~ Uniform[0,10]), find: a) F(5) = P(X ≤ 5), b) F(12), c) P(2 < X ≤ 8)
+Recall PDF
+f(x) = 0.1 for 0 ≤ x ≤ 10
+The CDF is cumulative (area from the left up to x)
+ CDF accumulates probability from the left
+Find F(5)
+F(5) = P(X ≤ 5)
+Area from 0 to 5: width = 5, height = 0.1
+F(5) = 5 × 0.1 = 0.5
+ Half of the distribution is below x = 5
+Find F(12)
+F(12) = P(X ≤ 12)
+But X can't exceed 10
+All probability is accounted for by x = 10
+F(12) = 1.0 (certainty)
+ CDF plateaus at 1 beyond the support of the distribution
+Find P(2 < X ≤ 8)
+Using CDF: P(a < X ≤ b) = F(b) - F(a)
+F(8) = 8 × 0.1 = 0.8
+F(2) = 2 × 0.1 = 0.2
+P(2 < X ≤ 8) = 0.8 - 0.2 = 0.6
+ Subtract lower CDF from upper CDF
+General CDF Formula
+For uniform [0, 10]:
+ • F(x) = 0 if x < 0
+ • F(x) = x/10 if 0 ≤ x ≤ 10
+ • F(x) = 1 if x > 10
+ The complete CDF function has three pieces
+F(0) = 0 (no probability below 0), F(10) = 1 (all probability by 10), F is non-decreasing ✓
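The three-piece CDF formula above can be sketched as a small function and used to answer all three parts:

```python
def uniform_cdf(x, a=0.0, b=10.0):
    """Three-piece CDF of Uniform[a, b]: 0 below a, linear on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

f5 = uniform_cdf(5)                       # 0.5
f12 = uniform_cdf(12)                     # 1.0 (beyond the support)
p_2_to_8 = uniform_cdf(8) - uniform_cdf(2)  # F(8) - F(2)
```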
+Flip a fair coin once. Let X = 1 if Heads, X = 0 if Tails. a) Find P(X=1) and P(X=0), b) Calculate E(X) and Var(X)
+Identify Bernoulli Trial
+Single trial with two outcomes (Success/Failure)
+Success = Heads, p = 0.5
+Failure = Tails, 1 - p = 0.5
+ This is a classic Bernoulli trial
+Find Probabilities
+P(X = 1) = p = 0.5 (probability of heads)
+P(X = 0) = 1 - p = 0.5 (probability of tails)
+Check: 0.5 + 0.5 = 1.0 ✓
+ Probabilities must sum to 1
+Calculate Expected Value
+Formula: E(X) = p
+E(X) = 0.5
+Or: E(X) = 0 × P(X=0) + 1 × P(X=1) = 0 × 0.5 + 1 × 0.5 = 0.5 ✓
+ Expected value is the probability of success
+Calculate Variance
+Formula: Var(X) = p(1 - p)
+Var(X) = 0.5 × 0.5 = 0.25
+Standard deviation: σ = √0.25 = 0.5
+ Variance measures spread of outcomes
+Interpret
+On average, we get 0.5 heads per flip
+Variance measures the spread of the 0 and 1 outcomes
+ Expected value represents long-run average
+For a fair coin, p = 0.5 makes sense. Over many flips, we expect half heads (E(X) = 0.5).
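The Bernoulli formulas are small enough to verify in a couple of lines, for any p:

```python
p = 0.5  # probability of heads

e_x = p               # E(X) = p for a Bernoulli variable
var_x = p * (1 - p)   # Var(X) = p(1 - p)
sd_x = var_x ** 0.5
```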
+99.7% have IQ between 55-145
IQ scores follow Normal distribution with μ = 100, σ = 15. Find: a) P(IQ ≤ 115), b) P(85 ≤ IQ ≤ 115), c) IQ score at 95th percentile
+Understand Normal Distribution
+Bell-shaped, symmetric around the mean
+μ = 100 (center)
+σ = 15 (spread)
+ Parameters define the shape and location of the curve
+Find P(IQ ≤ 115) using z-score
+z = (x - μ)/σ = (115 - 100)/15 = 15/15 = 1
+P(Z ≤ 1) = 0.8413 (from z-table)
+About 84.13% have IQ ≤ 115
+ Standardize to z-score, then use standard normal table
+Find P(85 ≤ IQ ≤ 115)
+Lower bound: z₁ = (85 - 100)/15 = -1
+Upper bound: z₂ = (115 - 100)/15 = 1
+This is μ ± 1σ (68-95-99.7 rule)
+P(-1 ≤ Z ≤ 1) ≈ 0.68 (approximately 68%)
+Exact: P(Z ≤ 1) - P(Z ≤ -1) = 0.8413 - 0.1587 = 0.6826
+ One standard deviation on each side covers 68% of data
+Find 95th Percentile
+P(IQ ≤ x) = 0.95
+From z-table: z = 1.645 for 95th percentile
+x = μ + zσ = 100 + 1.645 × 15 = 100 + 24.675 = 124.675
+IQ ≈ 125
+ Convert z-score back to original scale using inverse formula
+Using 68-95-99.7 rule: μ±1σ contains 68% ✓, μ±2σ contains 95%, μ±3σ contains 99.7%. Our answer matches the empirical rule!
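Python's `statistics.NormalDist` can stand in for the z-table lookups; its values agree with the table to about four decimals:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

p_below_115 = iq.cdf(115)               # ≈ 0.8413
p_85_to_115 = iq.cdf(115) - iq.cdf(85)  # ≈ 0.6827
pct_95 = iq.inv_cdf(0.95)               # ≈ 124.7 (95th percentile IQ)
```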
+Explain the difference between α = 0.05 and α = 0.01. Which is more strict? Find critical values for both in a two-tailed test.
+Understand α = 0.05
+α = 0.05 means 5% significance
+ 95% confidence level (1 - 0.05)
+ P(Type I error) = 5%
+ Willing to be wrong 5% of the time
+ Understand α = 0.01
+α = 0.01 means 1% significance
+ 99% confidence level (1 - 0.01)
+ P(Type I error) = 1%
+ Only willing to be wrong 1% of the time
+ Find Critical Values for α = 0.05
+Two-tailed: split α into both tails
+ Each tail = 0.05/2 = 0.025
+ Z₀.₉₇₅ = ±1.96
+ Reject if |z| > 1.96
+ Find Critical Values for α = 0.01
+Two-tailed: each tail = 0.01/2 = 0.005
+ Z₀.₉₉₅ = ±2.576
+ Reject if |z| > 2.576
+ Harder to reject (more strict!)
+ Compare
+α = 0.01 is MORE STRICT
+ Requires stronger evidence to reject H₀
+ Reduces Type I errors but increases Type II
+ Population has σ = 20. Calculate standard error for sample sizes: n = 4, n = 16, n = 64, n = 100. What pattern do you notice?
+Recall Standard Error Formula
+SE = σ / √n
+ Where:
+ - σ = population standard deviation
+ - n = sample size
+ SE measures variability of sample means
+ Calculate SE for n = 4
+SE = 20 / √4
+ SE = 20 / 2
+ SE = 10
+ Calculate SE for n = 16
+SE = 20 / √16
+ SE = 20 / 4
+ SE = 5
+ Calculate SE for n = 64
+SE = 20 / √64
+ SE = 20 / 8
+ SE = 2.5
+ Calculate SE for n = 100
+SE = 20 / √100
+ SE = 20 / 10
+ SE = 2
+ Analyze Pattern
+n = 4: SE = 10
+ n = 16: SE = 5 (4× sample → ½ SE)
+ n = 64: SE = 2.5 (16× sample → ¼ SE)
+ n = 100: SE = 2 (25× sample → ⅕ SE)
+
+ Pattern: Quadruple sample size → Half the SE
+ Larger samples give more precise estimates!
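The pattern above (quadruple n, halve SE) follows directly from SE = σ/√n, which a short sketch confirms:

```python
def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / n ** 0.5

# SE for each sample size with sigma = 20
ses = {n: standard_error(20, n) for n in (4, 16, 64, 100)}
```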
+ A factory claims μ = 100. Sample: n = 36, x̄ = 105, σ = 12. Test at α = 0.05 (two-tailed).
+State Hypotheses
+H₀: μ = 100 (claim is true)
+ H₁: μ ≠ 100 (claim is false)
+ α = 0.05, two-tailed test
+ Calculate Standard Error
+SE = σ / √n
+ SE = 12 / √36
+ SE = 12 / 6
+ SE = 2
+ Calculate Z-Statistic
+z = (x̄ - μ₀) / SE
+ z = (105 - 100) / 2
+ z = 5 / 2
+ z = 2.5
+ Find Critical Values
+α = 0.05, two-tailed
+ Critical values: z = ±1.96
+ Rejection regions: z < -1.96 or z > 1.96
+ Make Decision
+Test statistic: z = 2.5
+ Critical value: z = 1.96
+ 2.5 > 1.96 → In rejection region
+
+ REJECT H₀
+ Interpret
+There IS significant evidence that μ ≠ 100
+ The sample mean of 105 is statistically different
+ Factory's claim is likely false
+ P-value = 2 × P(Z > 2.5) = 2 × 0.0062 = 0.0124 < 0.05 ✓ Confirms rejection
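The z-test above, including the confirming p-value, can be run end to end with `statistics.NormalDist`:

```python
from statistics import NormalDist

n, xbar, mu0, sigma, alpha = 36, 105, 100, 12, 0.05

se = sigma / n ** 0.5                    # 12 / 6 = 2
z = (xbar - mu0) / se                    # 2.5
p_value = 2 * (1 - NormalDist().cdf(z))  # two-tailed p-value
reject = p_value < alpha
```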
+Converts any normal distribution to standard normal (μ=0, σ=1)
Find critical z-values for: a) α = 0.05 one-tailed (right), b) α = 0.05 two-tailed, c) α = 0.01 two-tailed. Draw rejection regions.
+One-Tailed Right (α = 0.05)
+All α in right tail
+ Find z where P(Z > z) = 0.05
+ P(Z ≤ z) = 1 - 0.05 = 0.95
+ From z-table: z₀.₉₅ = 1.645
+
+ Critical value: z = 1.645
+ Reject H₀ if z > 1.645
+ Two-Tailed (α = 0.05)
+Split α between both tails
+ Each tail = 0.05/2 = 0.025
+ Left tail: P(Z < z) = 0.025 → z = -1.96
+ Right tail: P(Z > z) = 0.025 → z = +1.96
+
+ Critical values: z = ±1.96
+ Reject H₀ if |z| > 1.96
+ Two-Tailed (α = 0.01)
+More strict test
+ Each tail = 0.01/2 = 0.005
+ P(Z < z) = 0.005 → z = -2.576
+ P(Z > z) = 0.005 → z = +2.576
+
+ Critical values: z = ±2.576
+ Reject H₀ if |z| > 2.576
+ Visualize Rejection Regions
+One-tailed (α=0.05): [______|████] z > 1.645
+ Two-tailed (α=0.05): [██|________|██] |z| > 1.96
+ Two-tailed (α=0.01): [█|__________|█] |z| > 2.576
+
+ Smaller α → Larger critical values → Harder to reject
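All three critical values come from the inverse CDF of the standard normal, as sketched here:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mu = 0, sigma = 1

crit_one_tailed_05 = z.inv_cdf(0.95)   # ≈ 1.645
crit_two_tailed_05 = z.inv_cdf(0.975)  # ≈ 1.960
crit_two_tailed_01 = z.inv_cdf(0.995)  # ≈ 2.576
```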
+ P-value is NOT the probability that H₀ is true! It's the probability of observing your data IF H₀ were true.
Sample of 36 students has mean score x̄ = 78. Population mean claimed to be μ₀ = 75 with σ = 12. Test at α = 0.05 using p-value method.
+State Hypotheses
+H₀: μ = 75 (null hypothesis - no difference)
+H₁: μ ≠ 75 (alternative - there is a difference)
+Two-tailed test
+ Set up null and alternative hypotheses
+Calculate Test Statistic
+z = (x̄ - μ₀) / (σ/√n)
+z = (78 - 75) / (12/√36)
+z = 3 / 2 = 1.5
+ Calculate the z-score
+Find P-Value
+For two-tailed: p-value = 2 × P(Z > |1.5|)
+P(Z > 1.5) = 1 - 0.9332 = 0.0668
+p-value = 2 × 0.0668 = 0.1336
+ Multiply by 2 for two-tailed test
+Compare with α
+p-value = 0.1336, α = 0.05
+0.1336 > 0.05
+ Since p-value exceeds α, we fail to reject H₀
+Make Decision
+Since p-value > α, FAIL TO REJECT H₀
+Not enough evidence to conclude the mean differs from 75
+A p-value of 13.36% means we'd see results this extreme 13.36% of the time if H₀ were true
+ Interpret in context
+The result is not statistically significant at the α = 0.05 level. We need stronger evidence to claim the mean differs from 75.
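The p-value method above in code form:

```python
from statistics import NormalDist

n, xbar, mu0, sigma, alpha = 36, 78, 75, 12, 0.05

z = (xbar - mu0) / (sigma / n ** 0.5)    # 3 / 2 = 1.5
p_value = 2 * (1 - NormalDist().cdf(z))  # two-tailed
reject = p_value < alpha                 # False: fail to reject H0
```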
+Researcher claims new drug LOWERS blood pressure (μ < 120). Sample of 49: x̄ = 115, σ = 21. Test at α = 0.05. Should this be one-tailed or two-tailed?
+Analyze the Claim
+Claim: drug LOWERS pressure (directional)
+Looking for a decrease specifically
+This requires a ONE-TAILED test (left tail)
+ Directional claim = one-tailed test
+Set Up Hypotheses
+H₀: μ ≥ 120 (blood pressure not lower)
+H₁: μ < 120 (blood pressure IS lower)
+Left-tailed test
+ Alternative hypothesis shows the direction
+Calculate Z-Score
+z = (x̄ - μ₀) / (σ/√n)
+z = (115 - 120) / (21/√49)
+z = -5 / 3 ≈ -1.67
+ Negative z-score indicates below mean
+Find Critical Value (One-Tailed)
+For α = 0.05, one-tailed (left)
+Critical value: z = -1.645
+ One-tailed critical value differs from two-tailed
+Make Decision
+Test statistic: z = -1.67
+Critical value: z = -1.645
+-1.67 < -1.645 (in rejection region)
+REJECT H₀
+ Falls in rejection region, so reject null
+Contrast with Two-Tailed
+If two-tailed: critical values ±1.96
+Our |z| = 1.67 < 1.96
+Would NOT reject H₀ with a two-tailed test!
+This shows the importance of choosing the correct test
+ Test choice matters!
+Evidence supports the claim that the drug lowers blood pressure. A one-tailed test was appropriate for this directional claim.
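The contrast between the one-tailed and two-tailed decisions can be checked directly:

```python
from statistics import NormalDist

n, xbar, mu0, sigma = 49, 115, 120, 21

z = (xbar - mu0) / (sigma / n ** 0.5)   # -5 / 3 ≈ -1.67

crit_left = NormalDist().inv_cdf(0.05)  # ≈ -1.645 (one-tailed, left)
crit_two = NormalDist().inv_cdf(0.975)  # ≈ 1.96 (two-tailed)

reject_one_tailed = z < crit_left       # True: in the left rejection region
reject_two_tailed = abs(z) > crit_two   # False: 1.67 < 1.96
```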
+Small sample: n = 16, x̄ = 52, s = 8. Test if μ = 50 at α = 0.05. Population σ unknown.
+Choose Correct Test
+n = 16 < 30 (small sample)
+σ unknown (use sample s)
+Use T-TEST instead of z-test
+ Small sample + unknown σ = t-test
+Calculate T-Statistic
+t = (x̄ - μ₀) / (s/√n)
+t = (52 - 50) / (8/√16)
+t = 2 / 2 = 1.0
+ Use sample standard deviation s
+Find Degrees of Freedom
+df = n - 1 = 16 - 1 = 15
+ Lose 1 df for estimating the mean
+Find Critical Value
+Two-tailed test, α = 0.05, df = 15
+From t-table: t₀.₀₂₅,₁₅ = ±2.131
+ Look up in t-distribution table
+Compare and Decide
+Test statistic: t = 1.0
+Critical values: ±2.131
+|1.0| < 2.131
+FAIL TO REJECT H₀
+ Test statistic not in rejection region
+Interpret
+Not enough evidence that μ ≠ 50
+Sample mean of 52 is not significantly different from 50
+ Interpret in context of problem
+The difference between 52 and 50 is not statistically significant at the α = 0.05 level with this small sample.
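A sketch of the t-test computation; the critical value 2.131 is hard-coded from the t-table above (without SciPy, the standard library has no t-distribution inverse CDF):

```python
n, xbar, mu0, s = 16, 52, 50, 8

t = (xbar - mu0) / (s / n ** 0.5)  # 2 / 2 = 1.0
df = n - 1                          # 15

t_crit = 2.131  # two-tailed, alpha = 0.05, df = 15 (from the t-table)
reject = abs(t) > t_crit            # False: fail to reject H0
```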
+Calculate degrees of freedom for: a) Single sample t-test: n = 20, b) Two-sample t-test: n₁ = 15, n₂ = 18, c) Chi-squared test: 3×4 contingency table
+Single Sample T-Test
+Formula: df = n - 1
+n = 20, so df = 20 - 1 = 19
+We "lose" 1 df because we estimate the mean from the sample
+ Each parameter estimated reduces df by 1
+Two-Sample T-Test (Equal Variances)
+Formula: df = n₁ + n₂ - 2
+n₁ = 15, n₂ = 18
+df = 15 + 18 - 2 = 31
+Lose 1 df per sample for estimating each mean
+ Two samples = two means estimated
+Chi-Squared Contingency Table
+Formula: df = (rows - 1) × (columns - 1)
+3 rows, 4 columns
+df = (3 - 1) × (4 - 1) = 2 × 3 = 6
+ Degrees of freedom for independence test
+Explain Concept
+Degrees of freedom = number of values free to vary
+Each parameter estimated reduces df by 1
+Higher df → distribution closer to normal
+ Conceptual understanding
+These df values would be used to find the appropriate critical values from the respective distribution tables.
+Type II Error: Telling sick person they're healthy (missed diagnosis)
Drug trial tests H₀: "Drug is safe" vs H₁: "Drug is dangerous". Describe Type I and Type II errors with consequences.
+Define Type I Error (False Positive)
+Type I: Reject H₀ when H₀ is TRUE
+In this case: Conclude the drug is dangerous when it's actually safe
+Probability = α (significance level)
+Consequence: Safe drug rejected, patients miss beneficial treatment
+ False alarm - reject truth
+Define Type II Error (False Negative)
+Type II: Fail to reject H₀ when H₁ is TRUE
+In this case: Conclude the drug is safe when it's actually dangerous
+Probability = β
+Consequence: Dangerous drug approved, patients harmed!
+ Miss detecting danger
+Create Decision Matrix
+Reality vs Decision:
+If H₀ true (safe) + Reject H₀ (call dangerous) = TYPE I
+If H₁ true (dangerous) + Fail to reject = TYPE II
+Correct decisions: accept truth or reject falsehood
+ Four possible outcomes
+Calculate Example
+If α = 0.05: 5% chance of Type I error
+If β = 0.20: 20% chance of Type II error
+Power = 1 - β = 0.80 (80% chance of detecting a dangerous drug)
+ Probabilities of each error
+Compare Consequences
+Type I: Waste safe drug (economic cost)
+Type II: Approve dangerous drug (LIFE RISK!)
+Type II often more serious here → increase power (e.g., larger samples) to reduce β
+ Context determines which error is worse
+In medical contexts, Type II errors (missing danger) are often considered worse than Type I errors (false alarms).
+f'(x) = 3·4x³ - 2·2x + 0
f'(x) = 12x³ - 4x
Find the derivative of f(x) = 3x⁴ - 2x³ + 5x - 7
+Identify the Power Rule
+Power Rule: d/dx(xⁿ) = n·xⁿ⁻¹
+Apply to each term separately
+ The power rule works term by term
+Derivative of First Term: 3x⁴
+Coefficient stays: 3
+Exponent: 4 → multiply by 4 and decrease the exponent
+Result: 3 × 4 × x³ = 12x³
+ Bring down the 4, multiply by coefficient 3
+Derivative of Second Term: -2x³
+Result: -2 × 3 × x² = -6x²
+ Same process, keep the negative sign
+Derivative of Third Term: 5x
+5x = 5x¹
+Result: 5 × 1 × x⁰ = 5
+ x⁰ = 1, so we just get the coefficient
+Derivative of Fourth Term: -7
+Constant → derivative is 0
+ Constants disappear when we take the derivative
+Combine All Terms
+f'(x) = 12x³ - 6x² + 5 + 0
+f'(x) = 12x³ - 6x² + 5
+ Sum the derivatives of each term
+At x=1: f'(1) = 12(1) - 6(1) + 5 = 11. This is the slope of the tangent line at x=1.
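The term-by-term power rule can be sketched as a tiny symbolic differentiator over `{exponent: coefficient}` dictionaries (a hypothetical representation chosen for illustration):

```python
def differentiate(poly):
    """Power rule term by term: d/dx(c*x^n) = c*n*x^(n-1); constants vanish."""
    return {n - 1: c * n for n, c in poly.items() if n != 0}

def evaluate(poly, x):
    return sum(c * x ** n for n, c in poly.items())

f = {4: 3, 3: -2, 1: 5, 0: -7}  # 3x^4 - 2x^3 + 5x - 7
fp = differentiate(f)           # expect 12x^3 - 6x^2 + 5
```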
+Calculate the definite integral: ∫₀³ (2x + 1) dx
+Find the Antiderivative
+∫(2x + 1) dx
+∫2x dx = 2 × (x²/2) = x²
+∫1 dx = x
+F(x) = x² + x + C
+ Use the power rule in reverse
+Apply the Fundamental Theorem of Calculus (Part 2)
+∫ₐᵇ f(x)dx = F(b) - F(a)
+No need for +C (it cancels out)
+ Evaluate antiderivative at bounds and subtract
+Evaluate at Upper Bound (x = 3)
+F(3) = 3² + 3 = 9 + 3 = 12
+ Substitute x = 3 into F(x)
+Evaluate at Lower Bound (x = 0)
+F(0) = 0² + 0 = 0
+ Substitute x = 0 into F(x)
+Subtract: F(upper) - F(lower)
+∫₀³ (2x + 1) dx = F(3) - F(0) = 12 - 0 = 12
+ Final calculation
+The area under the curve y = 2x + 1 from x = 0 to x = 3 is 12 square units. This represents the accumulated value of the function over that interval.
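Both the fundamental-theorem answer and a numerical cross-check (a midpoint Riemann sum) are easy to sketch:

```python
def F(x):
    return x ** 2 + x  # antiderivative of f(x) = 2x + 1 (the +C cancels)

exact = F(3) - F(0)  # fundamental theorem of calculus: 12

# numerical cross-check with a midpoint Riemann sum
n = 100_000
dx = 3 / n
approx = sum((2 * ((i + 0.5) * dx) + 1) * dx for i in range(n))
```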
+• Always visualize your data first (Anscombe's quartet!)
Find the best-fit line for the data points: (1,2), (2,4), (3,5), (4,7), (5,8)
+Calculate the Means
+x̄ = (1+2+3+4+5)/5 = 15/5 = 3
+ȳ = (2+4+5+7+8)/5 = 26/5 = 5.2
+ Find the average of x values and y values
+Set Up Slope Formula
+b = Σ(x-x̄)(y-ȳ) / Σ(x-x̄)²
+Need to calculate deviations and their products
+ This is the least squares formula for slope
+Create Calculation Table
+| x | y | (x-x̄) | (y-ȳ) | (x-x̄)(y-ȳ) | (x-x̄)² |
+|---|---|-------|-------|-------------|--------|
+| 1 | 2 | -2 | -3.2 | 6.4 | 4 |
+| 2 | 4 | -1 | -1.2 | 1.2 | 1 |
+| 3 | 5 | 0 | -0.2 | 0 | 0 |
+| 4 | 7 | 1 | 1.8 | 1.8 | 1 |
+| 5 | 8 | 2 | 2.8 | 5.6 | 4 |
+| Sum | | | | 15 | 10 |
+ Organize all calculations in a table
+Calculate Slope (b)
+b = 15 / 10 = 1.5
+ For every 1 unit increase in x, y increases by 1.5
+Calculate Intercept (a)
+a = ȳ - b·x̄
+a = 5.2 - (1.5 × 3) = 5.2 - 4.5 = 0.7
+ Where the line crosses the y-axis
+Write the Equation
+y = a + bx
+y = 0.7 + 1.5x
+ This is our best-fit line!
+When x=3: y = 0.7 + 1.5(3) = 5.2 ✓ (matches our mean!)
+When x=1: y = 0.7 + 1.5(1) = 2.2 (close to actual 2)
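The calculation table reduces to the least-squares formulas in a few lines:

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 7, 8]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n  # 3 and 5.2

# least-squares slope: sum of deviation products over sum of squared x-deviations
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x  # intercept: line passes through (x̄, ȳ)

def predict(x):
    return a + b * x
```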
+• Typical learning rates: 0.001 to 0.1
Minimize f(x) = x² using gradient descent. Start at x₀ = 5, learning rate α = 0.1, run 3 iterations.
+Find the Gradient (Derivative)
+f(x) = x²
+f'(x) = 2x
+This is the gradient we'll use
+ The gradient tells us which direction is uphill
+Iteration 1
+Current position: x₀ = 5
+Gradient: f'(5) = 2(5) = 10
+Update: x₁ = x₀ - α·f'(x₀) = 5 - 0.1(10) = 5 - 1 = 4
+ Move against the gradient (downhill)
+Iteration 2
+Current position: x₁ = 4
+Gradient: f'(4) = 2(4) = 8
+Update: x₂ = 4 - 0.1(8) = 4 - 0.8 = 3.2
+ Gradient is smaller, we're getting closer to the minimum
+Iteration 3
+Current position: x₂ = 3.2
+Gradient: f'(3.2) = 2(3.2) = 6.4
+Update: x₃ = 3.2 - 0.1(6.4) = 3.2 - 0.64 = 2.56
+ Still moving toward x = 0 (the true minimum)
+Summary of Convergence
+| Iter | x | f(x) | Gradient |
+|------|------|-------|----------|
+| 0 | 5.00 | 25.00 | 10.0 |
+| 1 | 4.00 | 16.00 | 8.0 |
+| 2 | 3.20 | 10.24 | 6.4 |
+| 3 | 2.56 | 6.55 | 5.12 |
+| ... | ... | ... | ... |
+| ∞ | 0.00 | 0.00 | 0.0 |
+ If we continue, x converges to 0
+The function f(x) = x² has its minimum at x = 0. Gradient descent successfully moves us from x = 5 toward x = 0. With more iterations (or a larger learning rate), we'd get even closer!
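The three iterations above are a three-step loop in code:

```python
def grad(x):
    return 2 * x  # f(x) = x**2, so f'(x) = 2x

x, lr = 5.0, 0.1  # starting point and learning rate
history = [x]
for _ in range(3):
    x -= lr * grad(x)  # step against the gradient (downhill)
    history.append(x)
```

Running more iterations just keeps multiplying x by 0.8, so the sequence converges to the minimum at 0.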
+