Instructor: Prof. Robert Fovell
Office: 7162 Math Sciences
Office hours: Informal and by appointment
Phone: (310) 206-9956
E-mail: fovell@atmos.ucla.edu
Lectures: Monday and Wednesday, 2-3:20
Texts:
- Required: Gunst and Mason, Regression Analysis and Its Application
- Not required: Jackson, A User's Guide to Principal Components (expensive)
- Not required: Huff, How to Lie with Statistics (just for fun!)
Grading:
- "Midterm" exam: 30% (may be quite delayed, though)
- Homework, including class project: 35%
- Final exam: 35%
Overview:
This course was created in response to student requests for a concise
introduction to basic multivariate analysis, with emphasis placed on
practice over theory. It attempts to cover three basic analytical tools:
linear regression, principal components analysis, and cluster analysis.
There is no way to do all three in a single quarter without going fast.
Still, the specific tools learned here are less important than the
fundamental concepts underlying them. This is very much a hands-on class.
Topic and subtopic outline:
- Linear regression analysis:
  - Dependent and independent variables
  - Model building vs. prediction
  - Uses and abuses of regression analysis
  - Ordinary least squares (OLS) estimates
  - Analysis of variance (ANOVA) for regression models
  - Matrix formulation of the OLS problem
  - Collinearity
  - Significance tests for models and parameters
  - Errors vs. residuals
  - Testing residuals
  - Model misspecification
  - Deleted and studentized residuals, leverage values, Cook's D
  - Polynomial regression, interaction terms, indicator variables
  - Variable selection strategies
  - Partial correlation and the Extra Sum of Squares Principle
  - Odds and ends
- Principal components analysis (PCA):
  - Least normal squares
  - Dispersion matrices, eigenvalues and eigenvectors
  - Scaling issues
  - PC "scores"
  - Principal component regression
  - PCA for variable reduction; truncation tests and rules
  - Rotation of components
- Cluster analysis:
  - Motivation
  - Concept of "distance"; various distance measures
  - Redundant and irrelevant information
  - Clustering strategies and algorithms
  - Biases of clustering methods
  - Stopping rules; number of clusters problem