Sharpen your data science skills with this hands-on workshop on advanced regression techniques in R.
In this workshop we will cover a variety of regression models: giving their definitions, discussing goodness-of-fit criteria, presenting fitted models, interpreting estimated regression coefficients, and using the fitted models for prediction. The models will be limited to linear regression, Box-Cox transformation, gamma regression, ordinary logistic regression, Poisson regression, beta regression, longitudinal (repeated measures) regression, and hierarchical models.
The workshop is designed to be hands-on. Participants are required to bring laptops and be ready to write R code, analyze data, and interpret results. For each model, we will present an example with complete R code, followed by an exercise to work on. Workshop participants should be familiar with the algebraic expressions of different probability distributions and have a fundamental knowledge of simple linear regression: normally distributed random error, and continuous and categorical independent variables (which require creating dummy variables).
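To give a feel for the fit, interpret, and predict workflow described above, here is a minimal sketch in base R. It uses the built-in `mtcars` data set as a stand-in (an assumption; the workshop uses its own data), and the variable choices are purely illustrative.

```r
# Built-in mtcars data stands in for the workshop data sets (an assumption).
# factor(am) makes R create the dummy variable for transmission type.
fit <- lm(mpg ~ wt + factor(am), data = mtcars)

# Estimated regression coefficients and their standard errors.
print(summary(fit)$coefficients)

# Goodness-of-fit criteria; smaller values indicate a better fit.
fit_aic <- AIC(fit)
fit_bic <- BIC(fit)

# Prediction for a hypothetical car: weight 3 (in 1000 lb), manual transmission.
new_car <- data.frame(wt = 3, am = 1)
pred <- predict(fit, newdata = new_car)
print(pred)
```

The same pattern carries over to generalized models by swapping `lm()` for `glm()` with a suitable `family` argument.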
The material covered by the workshop will be taken from my recently published book “Advanced Regression Models with SAS and R Applications”, CRC Press, 2018.
We will have a limited number of books for sale. You can purchase the book and get it signed by Dr. Olga.
Dr. Olga Korosteleva is a professor of Statistics in the Department of Mathematics and Statistics at California State University, Long Beach (CSULB). She received her Bachelor’s degree in Mathematics in 1996 from Wayne State University in Detroit, and a Ph.D. in Statistics from Purdue University in West Lafayette, Indiana, in 2002. Since then she has been teaching mostly Statistics courses in the Master’s program in Applied Statistics at CSULB, and loving it!
Dr. Olga is an undergraduate advisor for students majoring in Mathematics with an option in Statistics. She is also the faculty supervisor for the Statistics Student Association and the immediate past-president of the Southern California Chapter of the American Statistical Association (SCASA). Dr. Olga is the editor-in-chief of SCASA’s monthly eNewsletter and the author or co-author of four statistical books.
When: October 5, 2019
- Saturday: 8:30 AM - 04:30 PM
Where:
University of California, Irvine -- Paul Merage School of Business
4293 Pereira Drive
Irvine, CA 92617
- Google Maps
- Directions & Parking Information
- Rooms
- SB1 2100 - Main event room
- SB1 3rd floor patio - meals
Registration
- Cost: $25
- Register through Eventbrite
- All participants must register for the event and have a valid ticket to attend. If there is space you can register at the door.
- All participants must abide by the OCRUG Code of Conduct, including the R Consortium and the R Community Code of Conduct.
- Connect to SSID: UCInet Mobile
- Go to https://oit.uci.edu/reg
- Register your device as a guest
If you have problems, please call the OIT support line at (949) 824-2222, option 3.
OCRUG GitHub Repo: https://github.com/ocrug/
Please install git and clone the following repo before the event, then pull the latest changes just before the event starts.
command:
git clone https://github.com/ocrug/advanced_regression.git
Event Repo: https://github.com/ocrug/advanced_regression
If you would like to make things easier during the course, you can install a package that has all the code and data already loaded. It also has all the data used in the textbook, for both examples and exercises.
Repo: https://github.com/ocrug/AdvancedRegression
You can install the package with:
install.packages('devtools', dependencies = TRUE)
devtools::install_github("ocrug/AdvancedRegression")
library(AdvancedRegression)
There is some documentation. Check it out by looking at the docs for a function such as:
? AICC
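To confirm that the package installed correctly and to browse what it ships with, something like the following sketch may help. It relies only on base R helpers (`requireNamespace()`, `data()`, `help()`); that the data sets are exposed through `data()` is an assumption about how the package is built.

```r
# Check whether the package can be loaded, without attaching it.
pkg_available <- requireNamespace("AdvancedRegression", quietly = TRUE)

if (pkg_available) {
  # List the data sets bundled with the package (name and title).
  bundled <- data(package = "AdvancedRegression")$results
  print(bundled[, c("Item", "Title"), drop = FALSE])
  # Show the package help index with all documented functions.
  print(help(package = "AdvancedRegression"))
} else {
  message("AdvancedRegression is not installed yet; see the install commands above.")
}
```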
A Slack channel has been set up for the workshop. It will be used for general announcements, and it is also a great place to ask other participants questions.
If you have not created an account on our Slack group, create one using the following link:
Slack Group Sign-up: https://ocrug-slack.herokuapp.com
Once you have an account, sign in (you can do it on a web browser or download an app on your phone or desktop).
Slack channel: https://ocrug.slack.com
The channel for the course is regression-2019
Since this event depends on you having a functional R setup with the correct packages and version of R, we highly recommend that you run check_setup.r before the event. If you have issues, please reach out to us in the Slack channel (see above) to get help.
Please follow us on Twitter (@oc_rug), and tweet about the event with the hashtag #OCRUG.
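For orientation, a setup check of this kind might look roughly like the sketch below. This is not the contents of check_setup.r; the minimum R version and the package list are hypothetical placeholders, so run the actual script from the event repo for the real requirements.

```r
# Hypothetical requirements; the real ones are defined in check_setup.r.
min_r_version <- "3.5.0"
needed <- c("devtools", "AdvancedRegression")

# Compare the running R version against the minimum.
r_ok <- getRversion() >= min_r_version

# requireNamespace() returns FALSE (no error) when a package is missing.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (!r_ok) {
  message("R ", as.character(getRversion()), " is older than ", min_r_version)
}
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
if (r_ok && length(missing_pkgs) == 0) {
  message("Setup looks fine.")
}
```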
-
- 1-page note sheets covering data science fundamentals and useful R packages.
-
- Comprehensive book on the complete data science workflow, including data importing/cleaning, visualization, and data analysis
- Focus on tidyverse packages
- Accessible for beginners who have a basic grasp of R
-
- This is the hub website for the core tidyverse packages
- Check out the Packages section and associated links for helpful information on using the packages.
-
- This book digs into the details of R.
- A great resource for more advanced users wanting to learn more about R under the hood.
- There is also a 1st Edition of the book.
Food, drinks and snacks will be provided throughout the event. We will have vegetarian options available. Please feel free to bring any additional food for yourself if you would like to supplement the meals or if you have other specific dietary constraints.
-
Saturday
- Lunch: Mexican (tacos, rice & beans, chips & salsa)
-
Snacks and Drinks
- Coffee
- Soft drinks
- Water
- Various snacks, TBD (e.g. fruit, chips, nuts, granola bars)
Start | End | Activity | Slides | Location |
---|---|---|---|---|
08:30 | 09:00 | Sign-in | | SB1 Lobby |
09:00 | 09:30 | Introduction and computer setup | | SB1 2100 |
09:30 | 10:30 | Linear regression: definition, fitted model, interpretation of estimated regression coefficients, prediction, R application | 2-15 | SB1 2100 |
10:30 | 11:00 | Gamma regression | 16-29 | SB1 2100 |
11:00 | 11:15 | Break | | |
11:15 | 11:40 | Logistic regression | 30-40 | SB1 2100 |
11:40 | 12:10 | Poisson regression | 41-49 | SB1 2100 |
12:10 | 12:30 | Zero-inflated Poisson regression | 50-63 | SB1 2100 |
12:30 | 01:30 | Lunch | | Patio |
01:30 | 02:00 | Beta regression | 64-74 | SB1 2100 |
02:00 | 02:30 | Longitudinal normal regression | 75-83 | SB1 2100 |
02:30 | 02:45 | Break | | |
02:45 | 03:15 | Longitudinal normal regression: Exercise | 84-89 | SB1 2100 |
03:15 | 03:45 | Longitudinal logistic regression | 90-100 | SB1 2100 |
03:45 | 04:15 | Hierarchical normal models | 101-107 | SB1 2100 |
04:15 | 04:30 | Wrap-up | | SB1 2100 |