TERM,CONTEXT
aes() mapping,Maps variables to visual properties like x y color size in ggplot
alpha (significance level),Probability threshold for Type I error commonly set at 0.05
alternative hypothesis,Research hypothesis claiming an effect or difference exists
animal welfare protocols,Ethical guidelines ensuring humane treatment in research with vertebrates
ANOVA (one-way),Tests for mean differences across three or more groups
assumptions of linear regression,Linearity independence normality of residuals and homoscedasticity requirements
augment(),broom function that adds residuals fitted values and diagnostics to model data
bar plot,Shows counts or means for categorical variables
bcPower(),Function in car package for Box-Cox power transformations
binary data,Categorical variable with two levels like yes/no or presence/absence
bioinformatics and computational biology methods,Sequence alignment phylogenetics protein folding machine learning for genomic data
biological replicate,Independent experimental units providing true replication
blinding,Concealing treatment assignment to reduce bias
blocking,Grouping by known variable like age or location to control its effects
Bonferroni correction,Adjusts alpha by dividing by number of tests to control Type I error
bootstrapping,Resampling with replacement to estimate confidence intervals and standard errors
Box-Cox transformation,Power transformation to normalize data and stabilize variance using optimal lambda
boxplot,Displays median IQR whiskers and outliers for group comparisons
broom package,Tidies model output into data frames for easier manipulation
case-control study,Compares groups with and without outcome to identify risk factors
categorical data,Qualitative groups like species or treatment levels
Central Limit Theorem,Distribution of sample means approaches normal as n increases regardless of population shape
central tendency,Measures of data center including mean median and mode
chi-squared goodness of fit,Tests if observed frequencies match expected frequencies for one categorical variable
chi-squared test of independence,Tests if two categorical variables are associated or independent
CO2 dataset,Built-in R dataset of carbon dioxide uptake in grass plants used for regression examples
coefficient of determination (R²),Proportion of variance in response explained by predictors
Cohen's d,Standardized effect size measure for mean differences
confidence interval (95%),Range from a procedure that captures the true parameter value in 95% of repeated samples
confounding variable,Factor that correlates with both treatment and outcome
conservation biology methods,Population viability analysis habitat modeling biodiversity assessment species monitoring
continuous data,Quantitative measurements like weight length or concentration
control group,Baseline comparison receiving no treatment or standard treatment in experiments
Cook's distance,Measures influence of each observation on fitted regression model and flags influential points
cor.test(),R function for testing correlation significance between two variables
correlation coefficient (r),Standardized measure of linear association from -1 to 1
cross-over design,Each participant receives all treatments in different periods with washout between
cross-sectional study,Data collected at single time point across different subjects
data transformation,Mathematical modifications like log or square root to meet assumptions
discrete data,Count data taking only integer values
double-blind,Neither participants nor researchers know treatment assignment
dplyr,R package for data manipulation with verbs like select filter mutate
ecology and evolution methods,Mark-recapture species distribution modeling community ecology population genetics
effect size,Magnitude of difference between groups independent of sample size
ethics in research,Principles ensuring participant welfare and scientific integrity
experimental unit,Smallest independent unit receiving treatment assignment
exploratory data analysis (EDA),Initial data examination to understand patterns before formal testing
facet_grid,Creates grid of plots by two categorical variables in ggplot2
facet_wrap,Creates small multiples by single variable for quick comparisons
factorial design,Tests multiple factors and their interactions simultaneously
false discovery rate,Expected proportion of false positives among rejected hypotheses
field study,Research in natural environment with ecological validity
filter(),dplyr function to subset rows based on conditions
fitted values,Model predictions for each observation in regression
Fligner-Killeen test,Non-parametric test for equal variances across groups
generalized linear model (GLM),Extension of linear models for non-normal response distributions
genomics and molecular methods,CRISPR gene editing RNA-seq ChIP-seq proteomics single-cell analysis
geom_bar,Bar chart layer for categorical data in ggplot2
geom_boxplot,Boxplot layer for group comparisons in ggplot2
geom_histogram,Histogram layer for distribution visualization
geom_point,Scatterplot layer for continuous relationships
geom_smooth,Adds regression line or smoothed curve to plots
ggplot2,R package for creating layered graphics using grammar of graphics
group_by(),dplyr function to perform operations by groups
heteroscedasticity,Unequal variance violating assumptions of parametric tests
histogram,Shows distribution of continuous variable using bins
homoscedasticity,Equal variance assumption for groups or across predictor range
hypothesis testing framework,Structured approach to testing claims using null and alternative hypotheses
IACUC,Institutional Animal Care and Use Committee overseeing vertebrate research ethics
in vitro,Experiments in controlled environment outside living organism
in vivo,Experiments conducted in living organisms
informed consent,Ethical requirement for human subjects to voluntarily agree to participate
Institutional Review Board (IRB),Committee ensuring ethical standards in human subjects research
intercept,Predicted y value when x equals zero in regression equation
interquartile range (IQR),Range between 25th and 75th percentiles robust to outliers
iris dataset,Classic R dataset with measurements of 150 iris flowers for classification examples
Kolmogorov-Smirnov test,Tests if sample comes from specified distribution like normal
kurtosis,Measure of distribution tail heaviness relative to normal
lambda (λ),Transformation parameter in Box-Cox determining optimal power
leverage,Measure of how extreme an observation's predictor values are indicating its potential for influence
linear regression,Models relationship between predictor and continuous response variable
lm(),R function for fitting linear models returns coefficients and diagnostics
log transformation,Common transformation for right-skewed data or multiplicative relationships
longitudinal study,Data collected from same subjects over multiple time points
marine and environmental science methods,Ocean sampling environmental DNA water quality assessment climate modeling
MASS package,R package containing functions for modern applied statistics
mean,Average value sum divided by n central tendency measure used in t-tests ANOVA
median,Middle value when ordered robust central tendency measure for boxplots IQR
microbiology and immunology methods,Flow cytometry ELISA viral quantification microbiome analysis antibiotic resistance testing
mode,Most frequent value in dataset third measure of central tendency
model diagnostics,Checking assumptions through residual plots QQ plots and formal tests
multiple comparisons problem,Increased Type I error risk when conducting multiple tests
multiple regression,Linear model with two or more predictor variables
mutate(),dplyr function to create or modify columns
negative control,Treatment known to have no effect checks for artifacts
neuroscience methods,Electrophysiology fMRI optogenetics behavior tracking connectomics analysis
normality,Bell-shaped Gaussian distribution assumption for parametric tests checked Week 3
null hypothesis,Statement of no effect or no difference to be tested
observational study,No treatment manipulation only observation of existing variation
observer bias,Researcher expectations influence data collection or interpretation
one-sample t-test,Tests if sample mean differs from hypothesized population value
open science,Transparency practices including data sharing preprints reproducible code
ordinary least squares (OLS),Method minimizing sum of squared residuals to fit regression line
outlier,Data point substantially different from other observations
p-value,Probability of obtaining results at least as extreme as observed if null hypothesis is true
paired t-test,Compares matched observations like before-after measurements
Palmer Penguins dataset,Modern alternative to iris with measurements of 344 penguins
parametric tests,Statistical tests assuming specific probability distributions
pilot study,Small preliminary study testing feasibility and methods
pipe operator (|> or %>%),Chains functions together for readable workflows in R
plant biology methods,Photosynthesis measurement growth assays metabolomics gene expression tissue culture
plot(),Base R function for creating diagnostic plots from lm objects
positive control,Treatment known to produce effect validates experiment
post-hoc tests,Pairwise comparisons following significant omnibus test like ANOVA
power (1-β),Probability of correctly rejecting false null hypothesis
power analysis,Calculates needed sample size given expected effect alpha and power
powerTransform(),car package function to find optimal Box-Cox lambda value
pre-registration,Publishing study design and analysis plan before data collection
predictor variable,Independent variable used to predict outcome in regression
protected health information (PHI),Confidential patient data requiring special ethical handling
pseudoreplication,Incorrectly treating non-independent observations as replicates
QQ plot,Graphical method comparing data distribution to theoretical normal
quasi-experimental design,Lacks random assignment but seeks causal inference
R programming language,Statistical computing environment widely used in biological research
R squared,Proportion of variance explained by regression model
random sampling,Selection where each member has equal probability of inclusion
randomization,Random assignment to treatments prevents systematic bias
randomized controlled trial (RCT),Gold standard experimental design with random treatment assignment
range,Maximum minus minimum quick variability check sensitive to outliers
regression assumptions,Requirements including linearity normality and constant variance
regression diagnostics,Tools for checking model assumptions using residuals and influence measures
repeated measures design,Same subjects measured under multiple conditions reduces variance
replication,Multiple independent observations per treatment group essential Week 9 concept
research misconduct,Fabrication falsification plagiarism violations of scientific integrity
residual standard error,Estimate of standard deviation of residuals around regression line
residuals,Differences between observed and predicted values in regression
response variable,Dependent variable being predicted in regression analysis
sample size (n),Number of independent observations affects power and uncertainty
sampling distribution,Distribution of sample statistics across repeated sampling
scatterplot,Plots two continuous variables to show relationships
select(),dplyr function to choose specific columns from data frame
Shapiro-Wilk test,Statistical test for normality effective for small to moderate samples
simple linear regression,Model with single predictor and continuous response
skewness,Asymmetry in distribution with longer tail on one side
slope,Rate of change in y per unit change in x regression coefficient
sqrt transformation,Square root transformation for count data or moderate skew
standard deviation,Typical spread of data points around the mean equal to square root of variance
standard error,Standard deviation of sampling distribution measures precision
statistical methods in biomedicine,Clinical trials survival analysis epidemiology biomarkers meta-analysis
statistical significance,Result unlikely if null hypothesis is true typically judged by p < 0.05
stratification,Dividing population into subgroups before sampling
sum of squares,Total squared deviations used in ANOVA and regression calculations
summarize(),dplyr function to calculate summary statistics
summary(),R function displaying model coefficients tests and fit statistics
systems biology methods,Network analysis metabolic modeling multi-omics integration pathway analysis
t-statistic,Test statistic for t-tests ratio of effect to standard error
technical replicate,Multiple measurements of same unit not true replication
three Rs principle,Replacement reduction refinement in animal research ethics
tidy(),broom function converting model output to tidy data frame
tidyverse,Collection of R packages for data science including ggplot2 and dplyr
transformation parameter,Value like lambda determining type and strength of transformation
Tukey HSD,Post-hoc test for pairwise comparisons after significant ANOVA
two-sample t-test (unpaired),Compares means of two independent groups
Type I error,False positive rejecting true null hypothesis
Type II error,False negative failing to reject false null hypothesis
variance,Square of standard deviation measuring data dispersion
violin plot,Combines boxplot with kernel density to show distribution shape
Welch's t-test,Modified t-test for unequal variances between groups
Winsorization,Replacing extreme values with less extreme ones to reduce outlier impact