Welcome
Syllabus
Instructors
Class meetings
Grading
Class Resources
201a Schedule
Week 0: Introduction
Week 1: Data
Week 2: Visualization
Week 3: Theoretical foundations
Week 4: Linear model: Regression
Week 5: Linear model, midterm
Week 6: Linear model: Categorical predictors
Week 7: Linear model: ANCOVA, diagnostics
Week 8: Linear model: Linearizing transforms
Week 9: Covarying errors (repeated measures / random effects)
Week 10: Review and preview
Projects
Examples of this sort of thing
201a
201a Timeline
Groups
201a Project plan
201a Preliminary data summaries
201a Write-ups
201a Presentation
201a Group-evaluation
201b
Data sources
R homework
Grading
Collaboration
Submitting R Assignments
Writing R scripts
Submitting your assignment
Additional resources.
I Notes
Getting started with R
Installing R
Packages
Introduction to R
Getting started
Better data analysis code.
Using R-markdown
Visualizations
General rules for scientific data visualization.
Picking a plot (what’s convention)
Categorical ~ 0
Histogram
Pie chart
Stacked area
numerical ~ 0
Histogram & density
numerical ~ categorical
Bar plot with error bars
Jittered data points.
Viola/Violin plot
Box and whiskers plot
Overlaid densities
Empirical cumulative distribution
Comparisons
Recommendations
numerical ~ numerical (2 x numerical ~ 0)
Scatter and heatmap
Conditional means
numerical ~ numerical + categorical
categorical ~ numerical
2 x categorical and categorical ~ categorical
Heatmap
categorical ~ categorical
Extra plot notes.
numerical ~ 2 x categorical
bin width and bandwidths
Probability
Probability terms
Absolute probability statements
Probability comparisons
Proportional magnitudes and confusion
Foundations of probability
Set notation for combinations of outcomes.
Basic probability definition and axioms
Events and the rules of probability.
Conditional probability and Bayes
Chain rule
Partitions and total probability
Bayes’ rule
Simulation, Sampling and Monte Carlo.
Sampling, long-run frequency, and the law of large numbers.
Sampling to estimate event probabilities.
Probability of a conjunction of events
Sampling to get probability of disjunction
Sampling to calculate conditional probability
Distribution functions: PDF, CDF, Quantile
Probability distribution (mass and density) functions (p.d.f.)
Cumulative distribution functions (c.d.f.)
Quantile functions (inverse CDF).
Expectation and moments
There are a few useful properties about how the Mean, Variance, Skewness, and Excess Kurtosis behave under various operations:
Central limit theorem and the normal distribution
Requirements for CLT to hold
Normal distribution
Foundations of Statistics
Frequentist statistics via simulation
Critical values, alpha, power
Setting up the “Alternate hypothesis”
Figuring out “power”
Figuring out “alpha”
Showing alpha, power
Figuring out the critical value.
Sampling distributions
TL; DR.
Logic
Expectation about the sampling distribution of the sample mean.
Standard error (of the sample mean)
Statistics via the Normal distribution
(Normal) Null hypothesis significance testing (NHST)
Normal tests with sample means
Z-scores and Z-tests
Z-tests
(Normal) Confidence intervals on the sample mean
What are these percents and probabilities?
Null hypothesis significance testing
Type 1 error rate: alpha (\(\alpha\))
The “alternate model”
Effect size
Calculating power from effect size
Visualizing alpha and power
How power changes.
Calculating n for a desired level of power.
Sign and magnitude errors.
Binomial: Probability to statistics
Data description / summary
Estimation
Null Hypothesis Significance testing (NHST)
Model selection
t-distribution
TL; DR.
Sampling distribution of sample variance, and t-statistic
Sample variance
Sampling distribution of the sample variance
Sampling distribution of t-statistic
T-distribution
Degrees of freedom.
Summary.
(Student’s) t-tests
1-sample t-test
Paired / repeated-measures t-test
2-sample, presumed equal variance, t-test
2-sample, unequal variance, t-test
Power calculations.
Summary of tests for the mean and effect sizes
Math.
Math behind 2-sample equal variance t-test
Math behind unequal variance t-test
Binomial test.
Estimating proportions.
Sign test, test for percentiles
Pearson’s Chi-squared test
“Goodness of fit” test
Implementation in R
Chi-squared test calculations
Test for independence
Implementation in R
Independence test calculations
More than 2-way contingency table
Mathematical rationale
Limitations
Fisher’s “exact” test
Bivariate linear relationships
Linear relationships
Covariance and correlation: Measuring the linear dependence.
Covariance
Correlation coefficient
(OLS) Regression: Predicting the mean of y for a given x
Difference between y~x, x~y, and the principal component line
Partitioning variance
Significance of a linear relationship.
Prediction from regression.
Anscombe’s quartet
Covariance
Estimating covariance.
Correlation
Correlation as the slope of z-scores
Coefficient of determination
Ordinary Least-Squares (OLS) Regression
Regression terminology
Estimating the regression line.
Standard errors of regression coefficients
Confidence intervals and tests for regression coefficients
y~x vs x~y vs principal component line
Partitioning variance and the coefficient of determination.
Calculating sums of squares.
Significance of linear relationship.
Significance of slope.
Significance of pairwise correlation
Significance of variance partition.
Isomorphism with one response and one predictor
Regression prediction.
Predicting mean y given x.
Predicting new y given x.
Visualizing the difference
Regression Diagnostics
Assumption: relationship between y and x is well described by a line.
Assumption: our estimates are not driven by a few huge outliers
Assumption: errors are independent, identically distributed, normal.
Testing assumptions
UCSD Psyc 201ab / CSS 205 / Psyc 193