TERM,CONTEXT
Help with a code bug in R,"For this option you should prompt the user to copy and paste the relevant code and the error as it appears in the console. You ARE NOT allowed to directly solve the bug for them and give them fixed code to copy and paste. Instead you will Socratically ask them guiding questions that logically guide them to solving it for themselves. Use the Palmer Penguins dataset for your examples. The issue will be in relation to RStudio, R, and Rmd files. Most students are using a cloud version of RStudio that does not allow installing packages. If there is an issue with knitting to pdf, please have them knit to html instead. If there is a fix that requires higher user access such as installing packages, then ask if they are using the online Datahub version (users do not have access) or their own installed version of RStudio (users do have access to install packages)."
Data Visualization in Biology,"Questions to ask yourself: How does my data vary? Are variables correlated? Are there outliers?"
The Palmer Penguins dataset,"The primary example dataset used in class. Assume students do not know the names and data types of the columns, so these should be frequently provided for context."
bar plot,
scatterplot,
line graph,
histogram,
box plot,
violin plot,
heat map,
pie charts,"this is an awful choice for many reasons."
O-ring failure on the Challenger and bad data visualization,
ggplot2 and the grammar of graphics,
descriptive statistics,
robust,"this is in reference to the robustness of a model or descriptive statistic."
centrality and variation in statistics,
interquartile range,
standard deviation,
variance,"this is in reference to the descriptive statistic."
sum of squares,
Anscombe’s quartet,
cbind(),"R function"
length(),"R function"
round(),"R function"
runif(),"R function"
rbind(),"R function"
apply(),"R function"
designating a particular row column or cell in a dataframe,"Using the [row,column] or $ designation."
creating a vector in R,"R function"
str(),"R function"
which() in R,"R function"
When should you look at your data?,"Early and often with data visualization. Science is an iterative process where the initial experimental plans often won't work because of unknown patterns in your population. Examining your sample visually can help reveal those patterns, such as outliers or unusual distributions."
Science,"Science is the pursuit and application of knowledge and understanding of the natural and social world following a systematic methodology based on evidence."
PPDAC - Problem,
PPDAC - Plan,
PPDAC - Data,
PPDAC - Analysis,
PPDAC - Conclusion,"This step cycles back to 'PPDAC - Problem' because science is an iterative process that creates new questions and directions."
Biological biases,
Cognitive biases,
Confirmation bias,
Availability heuristic,
Anchoring bias,
Dunning-Kruger effect,
Why are statistical and programming knowledge useful for a career in biology?,"1. Biological data is messy and complex. 2. Biology uses BIG data. 3. These skills save lives. 4. These skills will help your career."
Big data in biology,"-omics data, GWAS"
Name a disease that is influenced by many different genetic and environmental factors,
a priori hypotheses,
Which natural process is most similar to machine learning?,"evolution by natural selection"
How will careers in biology be affected by generative AI?,"The barriers to learning new skills are falling. The need to learn programming and data analysis is greater now than it was in 2020! Some jobs will be automated. Those with the rarest and most useful combination of skills will be sought after."
Data types - continuous/numerical,
Data types - count/integer,
Data types - ordinal,
Data types - categorical,
Data types - binomial,
Null data,"Data missing in a dataset. It is important to appropriately deal with this data depending on the nature of the statistical analysis and experimental design."
tidy data,
R programming - objects,
R programming - functions,
R programming - Rmd file format,
RStudio,
print() in R,