TERM,CONTEXT
aes() mapping,Maps variables to visual properties like x y color size in ggplot
alpha (significance level),Probability threshold for Type I error commonly set at 0.05
alternative hypothesis,Research hypothesis claiming an effect or difference exists
animal welfare protocols,Ethical guidelines ensuring humane treatment in research with vertebrates
ANOVA (one-way),Tests for mean differences across three or more groups
assumptions of linear regression,Linearity normality of residuals and homoscedasticity requirements
augment(),broom function that adds residuals fitted values and diagnostics to model data
bar plot,Shows counts or means for categorical variables
bcPower(),Function in car package for Box-Cox power transformations
binary data,Categorical variable with two levels like yes/no or presence/absence
bioinformatics and computational biology methods,Sequence alignment phylogenetics protein folding machine learning for genomic data
biological replicate,Independent experimental units providing true replication
blinding,Concealing treatment assignment to reduce bias
blocking,Grouping by known variable like age or location to control its effects
Bonferroni correction,Adjusts alpha by dividing by number of tests to control Type I error
bootstrapping,Resampling with replacement to estimate confidence intervals and standard errors
Box-Cox transformation,Power transformation to normalize data and stabilize variance using optimal lambda
boxplot,Displays median IQR whiskers and outliers for group comparisons
broom package,Tidies model output into data frames for easier manipulation
case-control study,Compares groups with and without outcome to identify risk factors
categorical data,Qualitative groups like species or treatment levels
Central Limit Theorem,Sample means approach normal distribution as n increases regardless of population shape
central tendency,Measures of data center including mean median and mode
chi-squared goodness of fit,Tests if observed frequencies match expected frequencies for one categorical variable
chi-squared test of independence,Tests if two categorical variables are associated or independent
CO2 dataset,Built-in R dataset with plant uptake measurements used for regression examples
coefficient of determination (R²),Proportion of variance in response explained by predictors
Cohen's d,Standardized effect size measure for mean differences
confidence interval (95%),Range likely to contain true parameter value with 95% confidence
confounding variable,Factor that correlates with both treatment and outcome
conservation biology methods,Population viability analysis habitat modeling biodiversity assessment species monitoring
continuous data,Quantitative measurements like weight length or concentration
control group,Baseline comparison receiving no treatment or standard treatment in experiments
Cook's distance,Measures influence of each observation on regression model identifies outliers
cor.test(),R function for testing correlation significance between two variables
correlation coefficient (r),Standardized measure of linear association from -1 to 1
cross-over design,Each participant receives all treatments in different periods with washout between
cross-sectional study,Data collected at single time point across different subjects
data transformation,Mathematical modifications like log or square root to meet assumptions
discrete data,Count data taking only integer values
double-blind,Neither participants nor researchers know treatment assignment
dplyr,R package for data manipulation with verbs like select filter mutate
ecology and evolution methods,Mark-recapture species distribution modeling community ecology population genetics
effect size,Magnitude of difference between groups independent of sample size
ethics in research,Principles ensuring participant welfare and scientific integrity
experimental unit,Smallest independent unit receiving treatment assignment
exploratory data analysis (EDA),Initial data examination to understand patterns before formal testing
facet_grid,Creates grid of plots by two categorical variables in ggplot2
facet_wrap,Creates small multiples by single variable for quick comparisons
factorial design,Tests multiple factors and their interactions simultaneously
false discovery rate,Expected proportion of false positives among rejected hypotheses
field study,Research in natural environment with ecological validity
filter(),dplyr function to subset rows based on conditions
fitted values,Model predictions for each observation in regression
Fligner-Killeen test,Non-parametric test for equal variances across groups
generalized linear model (GLM),Extension of linear models for non-normal response distributions
genomics and molecular methods,CRISPR gene editing RNA-seq ChIP-seq proteomics single-cell analysis
geom_bar,Bar chart layer for categorical data in ggplot2
geom_boxplot,Boxplot layer for group comparisons in ggplot2
geom_histogram,Histogram layer for distribution visualization
geom_point,Scatterplot layer for continuous relationships
geom_smooth,Adds regression line or smoothed curve to plots
ggplot2,R package for creating layered graphics using grammar of graphics
group_by(),dplyr function to perform operations by groups
heteroscedasticity,Unequal variance violating assumptions of parametric tests
histogram,Shows distribution of continuous variable using bins
homoscedasticity,Equal variance assumption for groups or across predictor range
hypothesis testing framework,Structured approach to testing claims using null and alternative hypotheses
IACUC,Institutional Animal Care and Use Committee overseeing vertebrate research ethics
in vitro,Experiments in controlled environment outside living organism
in vivo,Experiments conducted in living organisms
informed consent,Ethical requirement for human subjects to voluntarily agree to participate
Institutional Review Board (IRB),Committee ensuring ethical standards in human subjects research
intercept,Predicted y value when x equals zero in regression equation
interquartile range (IQR),Range between 25th and 75th percentiles robust to outliers
iris dataset,Classic R dataset with 150 flower measurements for classification examples
Kolmogorov-Smirnov test,Tests if sample comes from specified distribution like normal
kurtosis,Measure of distribution tail heaviness relative to normal
lambda (λ),Transformation parameter in Box-Cox determining optimal power
leverage,Measure of how extreme predictor values are potential for influence
linear regression,Models relationship between predictor and continuous response variable
lm(),R function for fitting linear models returns coefficients and diagnostics
log transformation,Common transformation for right-skewed data or multiplicative relationships
longitudinal study,Data collected from same subjects over multiple time points
marine and environmental science methods,Ocean sampling environmental DNA water quality assessment climate modeling
MASS package,R package containing functions for modern applied statistics
mean,Average value sum divided by n central tendency measure used in t-tests ANOVA
median,Middle value when ordered robust central tendency measure for boxplots IQR
microbiology and immunology methods,Flow cytometry ELISA viral quantification microbiome analysis antibiotic resistance testing
mode,Most frequent value in dataset third measure of central tendency
model diagnostics,Checking assumptions through residual plots QQ plots and formal tests
multiple comparisons problem,Increased Type I error risk when conducting multiple tests
multiple regression,Linear model with two or more predictor variables
mutate(),dplyr function to create or modify columns
negative control,Treatment known to have no effect checks for artifacts
neuroscience methods,Electrophysiology fMRI optogenetics behavior tracking connectomics analysis
normality,Bell-shaped Gaussian distribution assumption for parametric tests checked Week 3
null hypothesis,Statement of no effect or no difference to be tested
observational study,No treatment manipulation only observation of existing variation
observer bias,Researcher expectations influence data collection or interpretation
one-sample t-test,Tests if sample mean differs from hypothesized population value
open science,Transparency practices including data sharing preprints reproducible code
ordinary least squares (OLS),Method minimizing sum of squared residuals to fit regression line
outlier,Data point substantially different from other observations
p-value,Probability of obtaining results as extreme as observed if null hypothesis true
paired t-test,Compares matched observations like before-after measurements
Palmer Penguins dataset,Modern alternative to iris with 344 penguin measurements
parametric tests,Statistical tests assuming specific probability distributions
pilot study,Small preliminary study testing feasibility and methods
pipe operator (|> or %>%),Chains functions together for readable workflows in R
plant biology methods,Photosynthesis measurement growth assays metabolomics gene expression tissue culture
plot(),Base R function for creating diagnostic plots from lm objects
positive control,Treatment known to produce effect validates experiment
post-hoc tests,Pairwise comparisons following significant omnibus test like ANOVA
power (1-β),Probability of correctly rejecting false null hypothesis
power analysis,Calculates needed sample size given expected effect alpha and power
powerTransform(),car package function to find optimal Box-Cox lambda value
pre-registration,Publishing study design and analysis plan before data collection
predictor variable,Independent variable used to predict outcome in regression
protected health information (PHI),Confidential patient data requiring special ethical handling
pseudoreplication,Incorrectly treating non-independent observations as replicates
QQ plot,Graphical method comparing data distribution to theoretical normal
quasi-experimental design,Lacks random assignment but seeks causal inference
R programming language,Statistical computing environment widely used in biological research
R squared,Proportion of variance explained by regression model
random sampling,Selection where each member has equal probability of inclusion
randomization,Random assignment to treatments prevents systematic bias
randomized controlled trial (RCT),Gold standard experimental design with random treatment assignment
range,Maximum minus minimum quick variability check sensitive to outliers
regression assumptions,Requirements including linearity normality and constant variance
regression diagnostics,Tools for checking model assumptions using residuals and influence measures
repeated measures design,Same subjects measured under multiple conditions reduces variance
replication,Multiple independent observations per treatment group essential Week 9 concept
research misconduct,Fabrication falsification plagiarism violations of scientific integrity
residual standard error,Estimate of standard deviation of residuals around regression line
residuals,Differences between observed and predicted values in regression
response variable,Dependent variable being predicted in regression analysis
sample size (n),Number of independent observations affects power and uncertainty
sampling distribution,Distribution of sample statistics across repeated sampling
scatterplot,Plots two continuous variables to show relationships
select(),dplyr function to choose specific columns from data frame
Shapiro-Wilk test,Statistical test for normality effective for small to moderate samples
simple linear regression,Model with single predictor and continuous response
skewness,Asymmetry in distribution with longer tail on one side
slope,Rate of change in y per unit change in x regression coefficient
sqrt transformation,Square root transformation for count data or moderate skew
standard deviation,Average spread of data points around the mean
standard error,Standard deviation of sampling distribution measures precision
statistical methods in biomedicine,Clinical trials survival analysis epidemiology biomarkers meta-analysis
statistical significance,Result unlikely due to chance alone typically p < 0.05
stratification,Dividing population into subgroups before sampling
sum of squares,Total squared deviations used in ANOVA and regression calculations
summarize(),dplyr function to calculate summary statistics
summary(),R function displaying model coefficients tests and fit statistics
systems biology methods,Network analysis metabolic modeling multi-omics integration pathway analysis
t-statistic,Test statistic for t-tests ratio of effect to standard error
technical replicate,Multiple measurements of same unit not true replication
three Rs principle,Replacement reduction refinement in animal research ethics
tidy(),broom function converting model output to tidy data frame
tidyverse,Collection of R packages for data science including ggplot2 and dplyr
transformation parameter,Value like lambda determining type and strength of transformation
Tukey HSD,Post-hoc test for pairwise comparisons after significant ANOVA
two-sample t-test (unpaired),Compares means of two independent groups
Type I error,False positive rejecting true null hypothesis
Type II error,False negative failing to reject false null hypothesis
variance,Square of standard deviation measuring data dispersion
violin plot,Combines boxplot with kernel density to show distribution shape
Welch's t-test,Modified t-test for unequal variances between groups
Winsorization,Replacing extreme values with less extreme ones to reduce outlier impact