diff --git a/_book/index.html b/_book/index.html
index 264bf8a0..9fc21875 100644
--- a/_book/index.html
+++ b/_book/index.html
@@ -499,7 +499,7 @@

1 Welcome :)

diff --git a/_book/search_index.json b/_book/search_index.json
index eed064cb..f40eb968 100644
--- a/_book/search_index.json
+++ b/_book/search_index.json
@@ -1 +1 @@
-[["index.html", "Big Book of R 1 Welcome :) 1.1 Your last-ever bookmark 1.2 Searching 1.3 Contributing 1.4 Contributors 1.5 Licence 1.6 Live stats 1.7 About me", " Big Book of R Oscar Baruffa 01 January, 2022 1 Welcome :) 1.1 Your last-ever bookmark Thanks for stopping by. If you're like me, you can't help but bookmark every R-related programming book you find in the hopes that one day you, or someone you know, might find it useful. Hopefully this is the only bookmark you'll need in future ;). When I initially released this collection in late August 2020, it contained about 100 books that I'd been collecting over the previous two years. Since then I've found a few more and there have been contributions from many people. The collection now stands at about 250 books. Most of these are free. Some are paid but usually quite affordable. 1.2 Searching If there's something specific you're looking for, use the menu or search using the magnifying glass icon at the top of the screen. 1.3 Contributing Please feel free to contribute paid and free books - see GitHub. 1.4 Contributors If you've contributed, add your name and Twitter / blog link below! Oscar Baruffa, Mohit Sharma, Vebash Naidoo, Julia Silge, Erik Gahner Larsen, Nicole Radziwill, Nistara Randhawa, Antoine Fabri, Jon Calder, Mike Smith, Ben Bolker, Maëlle Salmon, Laura Ellis, Bryan Shalloway, Antonio Uzal, Louis Aslett, Lluís Revilla Sancho, Brendan Cullen, Rami Krispin, Michael Dorman, Ezekiel Adebayo Ogundepo, Shamsuddeen Hassan Muhammad, Eric Leung, Isabella Velásquez, Matt Roumaya, Legana Fingerhut, Robert D. Brown III 1.5 Licence This website/book is free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. 1.6 Live stats Who says you can't have privacy AND transparency?? 
I'm guessing that if you're interested in R then you also like data ;). I initially used Google Analytics for this site but as I'm keen to enhance user privacy I switched to Plausible Analytics from 30 December 2020 onward. You can view the old Google Analytics summary report PDF here. TLDR 22k unique visitors and 33k sessions between August 2020 and December 2020 :D!! Note that unique visits will be higher in Plausible than you'd find with Google Analytics. Because Plausible is GDPR compliant and privacy focused, each user is identified for only 1 day. If someone visits the site 2 days in a row, that's counted as 2 uniques whereas in Google Analytics it would only be counted as 1 unique visitor because of the presence of persistent cookies and similar mechanisms that allow tracking of users. From now on, you can view the LIVE site stats right here. 1.7 About me I'm Oscar. Fairly new to R and loving it. If you like this book, feel free to say Hi! on Twitter. If you want to stay in the loop on other data-related products I create, or major updates to this book, sign up to my newsletter. "],["new-to-r-start-here.html", "2 New to R? Start here", " 2 New to R? Start here If you're new to R and want to learn how to use it, this library might be a little daunting. There's so much choice! If you aren't sure where to start, then try one of these two options: 2.0.1 Book: R for Data Science This book is an excellent introduction to R programming and gets you started with visualizing data so you see some exciting stuff, and the power of R, right away. 
The book is free to read at https://r4ds.had.co.nz/ There's an accompanying exercise solution book at https://jrnold.github.io/r4ds-exercise-solutions/ For a different take on the solutions, check out Yet another R for Data Science study guide which can also be found at https://brshallo.github.io/r4ds_solutions/ If you'd like more of a roadmap which incorporates this book, have a look at my blogpost: https://oscarbaruffa.com/a-roadmap-for-getting-started-with-r/ 2.0.2 Video Course: Getting started with R If you prefer video instruction with progress tracking, check out this course from R for the Rest of Us called Getting Started with R. https://rfortherestofus.com/courses/getting-started/ "],["book-clubs.html", "3 Book Clubs 3.1 NHS-R community 3.2 R4DS Slack Community 3.3 R-ladies Netherlands - Advanced R by Hadley Wickham", " 3 Book Clubs Just like the book clubs you know and love, except that people actually talk about the book they're busy reading! R book clubs are usually a group of people who follow along together in working through the same book, with some sort of periodic check-in (often weekly, often via video) discussing the text, exercises and solutions. Below is a list of book clubs. These usually have a specific start and end date, so it may happen that a book club has already ended even though it's listed here. If you are running a book club, feel free to add it. 3.1 NHS-R community If you're one of the estimated 10 000 data analysts working in the NHS or someone who works closely with the NHS or health data, here's a blog post introducing the NHS-R Community book club. The book club is coordinated through the NHS-R Slack Group and the specific channel is #book-club. Certain email addresses can just join the Slack group (like @nhs.net) but if you have an email address that needs approval please contact NHS-R Community through their contact details on the website. 
The book club has covered statistics books like The Art of Statistics by David Spiegelhalter and The Book of Why by Judea Pearl and presentations given at the meetings can be found on the GitHub repository. The Community will be coordinating another book club for the R4DS book and the channel for that is #r4ds-book-club. 3.2 R4DS Slack Community The R4DS Slack Community has a number of running book clubs. Once you've joined the Slack group, you can search for channels. They also have a channel specifically for book club facilitators! They've recorded the sessions of cohorts so you can pick your way through one, or catch up on the current one! 3.3 R-ladies Netherlands - Advanced R by Hadley Wickham A collaboration of multiple Netherlands-based R-ladies groups ran a club on Hadley Wickham's Advanced R book. The GitHub repo contains all the slides from the sessions. "],["career-community.html", "4 Career & Community 4.1 Ace The Data Science Interview 4.2 Build Your Career in Data Science 4.3 Conversations On Data Science 4.4 Essays on Data Analysis 4.5 Executive Data Science 4.6 Getting Started in Data Science 4.7 Hiring Data Scientists and Machine Learning Engineers 4.8 Introduction to Machine Learning Interviews Book 4.9 Project Management Fundamentals for Data Analysts 4.10 Telling Stories With Data 4.11 The Programmer's Brain : What every programmer needs to know about cognition 4.12 Twitter for R Programmers 4.13 Twitter for Scientists", " 4 Career & Community These books aren't all strictly R focussed, but they do have a lot of relevance for many R programmers. 4.1 Ace The Data Science Interview by Kevin Huo, Nick Singh Authored by two ex-Facebook employees, Ace the Data Science Interview is the best way to prepare for Data Science, Data Analyst, and Machine Learning interviews, so that you can land your dream job at FAANG, tech startups, or Wall Street. 
Paid: $30 Link: https://www.acethedatascienceinterview.com/ 4.2 Build Your Career in Data Science by Emily Robinson, Jacqueline Nolis You are going to need more than technical knowledge to succeed as a data scientist. Build a Career in Data Science teaches you what school leaves out, from how to land your first job to the lifecycle of a data science project, and even how to become a manager. Paid: Lots of free preview available $20 Link: https://www.manning.com/books/build-a-career-in-data-science 4.3 Conversations On Data Science by Roger Peng, Hilary Parker This book collects many of their discussions from the podcast Not So Standard Deviations and distills them into a readable format. Paid: Pay what you want for the ebook, minimum $0 Link: https://leanpub.com/conversationsondatascience 4.4 Essays on Data Analysis by Roger Peng This book draws a complete picture of the data analysis process, filling out many details that are missing from previous presentations. It presents a new perspective on what makes for a successful data analysis and how the quality of data analyses can be judged. Paid: Pay what you want for the ebook, minimum $0 Link: https://leanpub.com/dataanalysisessays 4.5 Executive Data Science by Brian Caffo, Roger D. Peng, Jeffrey Leek A Guide to Training and Managing the Best Data Scientists. Learn what you need to know to begin assembling and leading a data science enterprise. Paid: Pay what you want for the PDF, minimum $0 Link: https://leanpub.com/eds 4.6 Getting Started in Data Science by Ayodele Odubela This book is for anyone interested in Data Science but unsure where to start. Cut through the noise and learn my best tips for understanding Machine Learning with insight from my 4 years of industry experience. Learn the math as it applies to real-life data projects and get an understanding of fairness, ethics, and accountability in AI. 
Paid: $20 Link: https://gumroad.com/l/getting-started-in-data-science 4.7 Hiring Data Scientists and Machine Learning Engineers by Roy Keyes It's quite possible that the only thing more confusing than defining data science is actually hiring data scientists. Hiring Data Scientists and Machine Learning Engineers is a concise, practical guide to cut through the confusion. Whether you're the founder of a brand new startup, the senior vice president in charge of digital transformation at a global industrial company, the leader of a new analytics effort at a non-profit, or a junior manager of a machine learning team at a tech giant, this book will help walk you through the important questions you need to answer to determine what role and which skills you should hire for, how to source applicants, how to assess those applicants' skills, and how to set your new hires up for success. Special emphasis is placed on in-office vs remote hiring situations. Paid: varies $25 Link: https://dshiring.com 4.8 Introduction to Machine Learning Interviews Book by Chip Huyen This book is the result of the collective wisdom of many people who have sat on both sides of the table and who have spent a lot of time thinking about the hiring process. It was written with candidates in mind, but hiring managers who saw the early drafts told me that they found it helpful to learn how other companies are hiring, and to rethink their own process. The book consists of two parts. The first part provides an overview of the machine learning interview process, what types of machine learning roles are available, what skills each role requires, what kinds of questions are often asked, and how to prepare for them. This part also explains the interviewer's mindset and what kind of signals they look for. 
The second part consists of over 200 knowledge questions, each noted with its level of difficulty (interviews for more senior roles should expect harder questions), that cover important concepts and common misconceptions in machine learning. Link: https://huyenchip.com/ml-interviews-book/ 4.9 Project Management Fundamentals for Data Analysts by Oscar Baruffa In Project Management Fundamentals for Data Analysts, I've boiled the concepts down to the bare essentials which can be read in under 15 minutes. You can certainly fit that into your crazy schedule (and it will help your future schedule not be so chaotic!). These concepts can be used to great effect on their own if you wish to never read another word on the topic. It'll also provide a solid foundation if you want to dive deeper into more formal courses or sophisticated theory. Paid: $12 Link: https://oscarbaruffa.com/pm/ 4.10 Telling Stories With Data by Rohan Alexander The aim of this book is to help you learn how to tell stories with data. It establishes a foundation on which you can build and share knowledge, based on data, about an aspect of the world of interest to you. In this book we explore, prod, push, manipulate, knead, and ultimately, try to understand the implications of, data. The motto of the university from which I took my PhD is 'Naturam primum cognoscere rerum' or roughly 'first to learn the nature of things', and we will indeed attempt to do that. But the original quote continues 'temporis aeterni quoniam', or roughly 'for eternal time', and it is tools, approaches, and workflows that enable you to establish lasting knowledge that I focus on in this book. Link: https://www.tellingstorieswithdata.com/ 4.11 The Programmer's Brain : What every programmer needs to know about cognition by Felienne Hermans Explores the way your brain works when it's thinking about code. In it, you'll master practical ways to apply these cognitive principles to your daily programming life. 
You'll improve your code comprehension by turning confusion into a learning tool, and pick up awesome techniques for reading code and quickly memorizing syntax. This practical guide includes tips for creating your own flashcards and study resources that can be applied to any new language you want to master. By the time you're done, you'll not only be better at teaching yourself, you'll be an expert at bringing new colleagues and junior programmers up to speed. Paid: Free preview $30 Link: https://www.manning.com/books/the-programmers-brain 4.12 Twitter for R Programmers by Oscar Baruffa, Veerle van Son The R community is very active on Twitter. You can learn a lot about the language, about new approaches to problems, make friends and even land a job or next contract. It's a real-time pulse of the R community. What can you gain from becoming active on Twitter? This book will talk about the benefits and it will show you how to use Twitter. Link: https://www.t4rstats.com 4.13 Twitter for Scientists by Daniel S. Quintana Paid: I believe that Twitter can provide extraordinary opportunities for scientists, regardless of their seniority, mentors, or institution. By actively contributing to Twitter, I've kept up-to-date with emerging methods, several doors have opened for research collaborations, and I've been introduced to a supportive community of like-minded scientists. Most important, I've received valuable feedback on my work and been able to share my research with people that would have not otherwise seen it. In fact, if it wasn't for Twitter I don't think I'd still be in academia. 
Link: https://t4scientists.com/ "],["archeology.html", "5 Archeology 5.1 How To Do Archaeological Science Using R 5.2 Quantitative Methods in Archaeology Using R", " 5 Archeology 5.1 How To Do Archaeological Science Using R by Ben Marwick (editor) Archaeological science is becoming increasingly complex, and progress in this area is slowed by critical limitation of journal articles lacking the space to communicate new methods in enough detail to allow others to reproduce and reuse new research. One solution to this is to use a programming language such as R to analyse archaeological data, with authors sharing their R code with their publications to communicate our methods. This practice is becoming widespread in many other disciplines, but few archaeologists currently know how to use R or have an opportunity to learn during their training. In this forum we tackle this problem by discussing ubiquitous research methods of immediate relevance to most archaeologists, by using interactive, live-coded demonstrations of R code by archaeologists who program with R. Topics include getting data into R, working with C14 dates, spatial analysis and map-making, conducting simulations, and exploratory data visualizations. Link: https://benmarwick.github.io/How-To-Do-Archaeological-Science-Using-R/ 5.2 Quantitative Methods in Archaeology Using R The first hands-on guide to using the R statistical computing system written specifically for archaeologists. It shows how to use the system to analyze many types of archaeological data. Part I includes tutorials on R, with applications to real archaeological data showing how to compute descriptive statistics, create tables, and produce a wide variety of charts and graphs. 
Part II addresses the major multivariate approaches used by archaeologists, including multiple regression (and the generalized linear model); multiple analysis of variance and discriminant analysis; principal components analysis; correspondence analysis; distances and scaling; and cluster analysis. Part III covers specialized topics in archaeology, including intra-site spatial analysis, seriation, and assemblage diversity. Paid: Loan or buy $100 Link: https://www.cambridge.org/core/books/quantitative-methods-in-archaeology-using-r/DEAE593FA2418EA3B8ECD538C34ED2D5?fbclid=IwAR0guclfEtttfDkVKNUJWfhQ1wgUlXSKAIA3f_6D3hS_9EkUKivSY9AyFD8 "],["art.html", "6 Art 6.1 Thinking Outside The Grid - A bare bones intro to Rtistry concepts in R using ggplot.", " 6 Art There are no books available covering art, but there are some blog posts available. This first one is a good intro. 6.1 Thinking Outside The Grid - A bare bones intro to Rtistry concepts in R using ggplot. by Megan Harris Recently I've discovered the courage to dive into creative coding and generative aRt in R. Something that the R community calls Rtistry. My Rtistry journey so far has been an amazing and tranquil expedition into a world that seemed intimidating and scary on the outside but is honestly just a bottomless pit of fun and creativity on the inside. I'm going to talk about some very basic concepts and perspectives you can think about while starting your own Rtistry journey in ggplot. This includes basics on geoms, aesthetics, layering, etc. But then I'm also going to walk you through two of my Rtistry examples and code to get you started. This article is intended for those who have some experience with ggplot building in R but may not have realized how to transition from making regular visuals to Rtistry. This article goes over basic concepts that more seasoned users may already know. 
Link: https://www.thetidytrekker.com/post/thinking-outside-the-grid "],["big-data.html", "7 Big Data 7.1 Exploring, Visualizing, and Modeling Big Data with R 7.2 Mastering Spark with R", " 7 Big Data 7.1 Exploring, Visualizing, and Modeling Big Data with R by Okan Bulut, Christopher Desjardins Working with BIG DATA requires a particular suite of data analytics tools and advanced techniques, such as machine learning (ML). Many of these tools are readily and freely available in R. This full-day session will provide participants with a hands-on training on how to use data analytics tools and machine learning methods available in R to explore, visualize, and model big data. Link: https://okanbulut.github.io/bigdata/ 7.2 Mastering Spark with R by Javier Luraschi, Kevin Kuo, Edgar Ruiz In this book you will learn how to use Apache Spark with R. The book intends to take someone unfamiliar with Spark or R and help you become proficient by teaching you a set of tools, skills and practices applicable to large-scale data science. PS the first chapter has a Jon Snow quote ;) Link: https://therinspark.com/ "],["blogdown.html", "8 Blogdown 8.1 blogdown: Creating Websites with R Markdown 8.2 Create, Publish, and Analyze Personal Websites Using R and RStudio 8.3 https://r4sites-book.netlify.app/", " 8 Blogdown 8.1 blogdown: Creating Websites with R Markdown by Yihui Xie, Amber Thomas, Alison Presmanes Hill We introduce an R package, blogdown, in this short book, to teach you how to create websites using R Markdown and Hugo. Link: https://bookdown.org/yihui/blogdown/ 8.2 Create, Publish, and Analyze Personal Websites Using R and RStudio by Danny Morris A free, digital handbook with step-by-step instructions for launching your own personal website using R, RStudio, and other freely available technologies including GitHub, Hugo, Netlify, and Google Analytics. 
Link: https://r4sites-book.netlify.app/ 8.3 https://r4sites-book.netlify.app/ by Yihui Xie This short book introduces an R package, bookdown, to change your workflow of writing books. It should be technically easy to write a book, visually pleasant to view the book, fun to interact with the book, convenient to navigate through the book, straightforward for readers to contribute or leave feedback to the book author(s), and more importantly, authors should not always be distracted by typesetting details. Link: https://bookdown.org/yihui/bookdown/ "],["bookdown.html", "9 Bookdown 9.1 A Minimal Book Example", " 9 Bookdown 9.1 A Minimal Book Example This is a sample book written in Markdown. Link: https://benmarwick.github.io/bookdown-ort/ "],["data-science.html", "10 Data Science 10.1 A Business Analyst's Introduction to Business Analytics 10.2 An Introduction to Data Analysis 10.3 APS 135: Introduction to Exploratory Data Analysis with R 10.4 Beginning Data Science in R 10.5 Business Case Analysis with R - Simulation Tutorials to Support Complex Business Decisions 10.6 Business Intelligence with R 10.7 Data Science at the Command Line, 2e 10.8 Data Science: A First Introduction 10.9 DevOps for Data Science 10.10 edav.info/ 10.11 Everyday Data Science 10.12 Exploratory Data Analysis with R 10.13 Introduction to Data Science 10.14 Model-Based Clustering and Classification for Data Science 10.15 Modern Data Science with R 10.16 Modern Statistics with R 10.17 Practical Data Science with R, Second Edition 10.18 R Data Science Quick Reference 10.19 R for Data Science 10.20 R for Data Science Solutions 10.21 R Programming for Data Science 10.22 The Art of Data Science 10.23 The Elements of Data Analytic Style 10.24 Yet another R for Data Science study guide", " 10 Data Science 10.1 A Business Analyst's Introduction to Business Analytics by Adam Fleischhacker This textbook goes farther than just teaching you to make computational models using software or mathematical models 
using statistics. It teaches you how to align computational and mathematical models with real-world scenarios; empowering you to communicate with and leverage the expertise of business stakeholders while using modern software stacks and statistical workflows. In this book, you do not learn business analytics to make models; you learn business analytics to add tangible value in the real-world. Link: https://www.causact.com/ 10.2 An Introduction to Data Analysis by Michael Franke This book provides basic reading material for an introduction to data analysis. It uses R to handle, plot and analyze data. After covering the use of R for data wrangling and plotting, the book introduces key concepts of data analysis from a Bayesian and a frequentist tradition. This text is intended for use as a first introduction to statistics for an audience with some affinity towards programming, but no prior exposure to R. Link: https://michael-franke.github.io/intro-data-analysis/index.html 10.3 APS 135: Introduction to Exploratory Data Analysis with R by Dylan Z. Childs This is the online course book for the Introduction to Exploratory Data Analysis with R component of APS 135, a module taught by the Department of Animal and Plant Sciences at the University of Sheffield. You will be introduced to the R ecosystem. You will learn how to use R to carry out data manipulation and visualisation. This book provides a foundation for learning statistics later on. Link: https://dzchilds.github.io/eda-for-bio/ 10.4 Beginning Data Science in R by Thomas Mailund Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. 
It is aimed at those with some data science or analytics background, but not necessarily experience with the R programming language. Paid: $40 Link: https://amzn.to/2Ns1HHi 10.5 Business Case Analysis with R - Simulation Tutorials to Support Complex Business Decisions by Robert D. Brown III Business case analysis, often conducted in spreadsheets, exposes decision makers to additional risks that arise just from the use of the spreadsheet environment. This book discusses how to use the statistical programming language R to develop a business case simulation and analysis. It presents a methodology that minimizes decision delay by focusing stakeholders on what matters most and suggests pathways for minimizing the risk in strategic and capital allocation decisions. Paid: Apress/Springer-Nature eBook $24.99, Softcover $34.99 $25 Link: https://www.apress.com/us/book/9781484234945# 10.6 Business Intelligence with R by Dwight Barry A desktop reference for busy professionals, giving you fingertip access to a variety of BI analytic methods done in R as simply as possible. All proceeds will support mitochondrial disorder research at Seattle Children's Hospital. Paid: Free or up to $20 for a good cause! $20 Link: https://leanpub.com/businessintelligencewithr 10.7 Data Science at the Command Line, 2e by Jeroen Janssens This book is about doing data science at the command line. Our aim is to make you a more efficient and productive data scientist by teaching you how to leverage the power of the command line. Link: https://www.datascienceatthecommandline.com/2e/ 10.8 Data Science: A First Introduction by Tiffany-Anne Timbers, Trevor Campbell, Melissa Lee This is an open source textbook aimed at introducing undergraduate students to data science. It was originally written for the University of British Columbia's DSCI 100 - Introduction to Data Science course. 
In this book, we define data science as the study and development of reproducible, auditable processes to obtain value (i.e., insight) from data. Link: https://ubc-dsci.github.io/introduction-to-datascience/ 10.9 DevOps for Data Science by Alex Gold At some point, most data scientists reach the point where they want to show their work to others. But the skills and tools to deploy data science are completely different from the skills and tools needed to do data science. If you're a data scientist who wants to get your work in front of the right people, this book aims to equip you with all the technical things you need to know that aren't data science. Hopefully, once you've read this book, you'll understand how to deploy your data science, whether you're building a DIY deployment system or trying to work with your organization's IT/DevOps/SysAdmin/SRE group to make that happen. Link: https://akgold.github.io/do4ds/index.html 10.10 edav.info/ by Zach Bogart, Joyce Robbins With this resource, we try to give you a curated collection of tools and references that will make it easier to learn how to work with data in R. In addition, we include sections on basic chart types/tools so you can learn by doing. There are also several walkthroughs where we work with data and discuss problems as well as some tips/tricks that will help you. Link: https://edav.info/ 10.11 Everyday Data Science by Andrew Carr Everyday data science is a collection of tools and techniques you can use to master data science in your day-to-day life. There are case studies, tutorials, code snippets, pictures, math, and jokes. All designed as a fun introduction to the world of data science. Some example chapters include, A/B testing to make perfect lemonade, word vectors to improve your resume, differential equations for weight loss, and how a man used statistics to qualify for the Olympics. Life is full of decisions. We, as people, have the remarkable ability to make decisions in the face of uncertainty. 
We, as humans, have only recently developed the ability to use computers to process vast amounts of data to improve our decision making. This innovation has led to the development of the field of Data Science. This book is written to give tools and inspiration to aspiring decision makers. You make decisions daily and the methodology of data science can help. Paid: $8 Link: https://gumroad.com/l/everydaydata 10.12 Exploratory Data Analysis with R by Roger Peng This book teaches you to use R to effectively visualize and explore complex datasets. Exploratory data analysis is a key part of the data science process because it allows you to sharpen your question and refine your modeling strategies. This book is based on the industry-leading Johns Hopkins Data Science Specialization Paid: Free or Pay what you want $15 Link: https://leanpub.com/exdata 10.13 Introduction to Data Science by Rafael A Irizarry The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, algorithm building with caret, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation with knitr and R markdown. Bookdown version https://rafalab.github.io/dsbook/ Paid: Free or pay what you want $50 Link: https://leanpub.com/datasciencebook 10.14 Model-Based Clustering and Classification for Data Science by Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery Among the broad field of statistical and machine learning, model-based techniques for clustering and classification have a central position for anyone interested in exploiting those data. 
This textbook focuses on the recent developments in model-based clustering and classification while providing a comprehensive introduction to the field. It is aimed at advanced undergraduates, graduates or first year PhD students in data science, as well as researchers and practitioners. Link: https://math.unice.fr/~cbouveyr/MBCbook/ 10.15 Modern Data Science with R by Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton This book is intended for readers who want to develop the appropriate skills to tackle complex data science projects and 'think with data' (as coined by Diane Lambert of Google). The desire to solve problems using data is at the heart of our approach. We acknowledge that it is impossible to cover all these topics in any level of detail within a single book: Many of the chapters could productively form the basis for a course or series of courses. Instead, our goal is to lay a foundation for analysis of real-world data and to ensure that analysts see the power of statistics and data analysis. After reading this book, readers will have greatly expanded their skill set for working with these data, and should have a newfound confidence about their ability to learn new technologies on-the-fly. This book was originally conceived to support a one-semester, 13-week undergraduate course in data science. We have found that the book will be useful for more advanced students in related disciplines, or analysts who want to bolster their data science skills. At the same time, Part I of the book is accessible to a general audience with no programming or statistics experience. Link: https://mdsr-book.github.io/mdsr2e/ 10.16 Modern Statistics with R by Måns Thulin This book covers the fundamentals of data science and statistics. The first half deals with the basics of R and R coding, data wrangling, exploratory data analysis and more advanced programming. 
The second half deals with modern statistics (favouring permutation tests, the bootstrap and Bayesian methods over traditional asymptotic methods), regression models and predictive modelling. It also contains information about debugging and explanations of 25 commonly encountered error messages in R. In addition, there are 170 or so exercises with fully worked solutions. Link: http://www.modernstatisticswithr.com/ 10.17 Practical Data Science with R, Second Edition by Nina Zumel, John Mount Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You'll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support. Paid: Free preview $25 Link: https://www.manning.com/books/practical-data-science-with-r-second-edition#toc 10.18 R Data Science Quick Reference by Thomas Mailund In this book, you'll learn about the following APIs and packages that deal specifically with data science applications: readr, tibble, forcats, lubridate, stringr, tidyr, magrittr, dplyr, purrr, ggplot2, modelr, and more. Paid: $30 Link: https://amzn.to/2WN1mQy 10.19 R for Data Science by Hadley Wickham, Garrett Grolemund This book will teach you how to do data science with R: You'll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you'll learn how to clean data and draw plots, and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You'll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. 
You'll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data. Link: https://r4ds.had.co.nz/ 10.20 R for Data Science Solutions by Jeffrey B. Arnold Solutions for the Wickham and Grolemund R4DS book. Link: https://jrnold.github.io/r4ds-exercise-solutions/ 10.21 R Programming for Data Science by Roger Peng This book is about the fundamentals of R programming. You will get started with the basics of the language, learn how to manipulate datasets, how to write functions, and how to debug and optimize code. With the fundamentals provided in this book, you will have a solid foundation on which to build your data science toolbox. Link: https://bookdown.org/rdpeng/rprogdatascience/ 10.22 The Art of Data Science by Roger D. Peng, Elizabeth Matsui A Guide for Anyone Who Works with Data This book describes the process of analyzing data. The authors have extensive experience both managing data analysts and conducting their own data analyses, and this book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science. Paid: Free (excl lecture videos) or pay what you want $15 Link: https://leanpub.com/artofdatascience 10.23 The Elements of Data Analytic Style by Jeffrey Leek Data analysis is at least as much art as it is science. This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the author's blog posts, lecture materials, and tutorials. Paid: Free or pay what you want $10 Link: https://leanpub.com/datastyle 10.24 Yet another R for Data Science study guide by Bryan Shalloway This book contains my solutions and notes to Garrett Grolemund and Hadley Wickham's excellent book, R for Data Science (Grolemund and Wickham 2017). R for Data Science (R4DS) is my go-to recommendation for people getting started in R programming, data science, or the tidyverse. 
Link: https://brshallo.github.io/r4ds_solutions/ "],["data-visualization.html", "11 Data Visualization 11.1 A ggplot2 Tutorial for Beautiful Plotting in R 11.2 BBC Visual and Data Journalism cookbook for R graphics 11.3 Data Processing & Visualization 11.4 Data visualisation using R, for researchers who don't use R 11.5 Data Visualization - A practical introduction 11.6 Data Visualization in R 11.7 Data Visualization with R 11.8 Fundamentals of Data Visualization 11.9 ggplot2 in 2 11.10 ggplot2: Elegant Graphics for Data Analysis 11.11 Graphical Data Analysis with R 11.12 Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code 11.13 JavaScript for R 11.14 plotly Interactive web-based data visualization with R, plotly, and shiny 11.15 R Graphics Cookbook, 2nd edition 11.16 Solutions to ggplot2: Elegant Graphics for Data Analysis", " 11 Data Visualization 11.1 A ggplot2 Tutorial for Beautiful Plotting in R by Cédric Scherer (Oscar: Not a book per se, but it should be, so I'm adding it!) A mega tutorial on creating great ggplot2 visuals. Link: https://cedricscherer.netlify.app/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ 11.2 BBC Visual and Data Journalism cookbook for R graphics At the BBC data team, we have developed an R package and an R cookbook to make the process of creating publication-ready graphics in our in-house style using R's ggplot2 library a more reproducible process, as well as making it easier for people new to R to create graphics. Link: https://bbc.github.io/rcookbook/ 11.3 Data Processing & Visualization by Michael Clark This document provides some tools, demonstrations, and more to make data processing, programming, modeling, visualization, and presentation easier. While the programming language focus is on R, where applicable (which is most of the time), Python notebooks are also available. 
Link: https://m-clark.github.io/data-processing-and-visualization/ 11.4 Data visualisation using R, for researchers who don't use R by Emily Nordmann, Phil McAleer, Wilhelmiina Toivo, Helena Paterson, Lisa DeBruine In this tutorial, we aim to provide a practical introduction to data visualisation using R, specifically aimed at researchers who have little to no prior experience of using R. First we detail the rationale for using R for data visualisation and introduce the grammar of graphics that underlies data visualisation using the ggplot package. The tutorial then walks the reader through how to replicate plots that are commonly available in point-and-click software such as histograms and boxplots, as well as showing how the code for these basic plots can be easily extended to less commonly available options such as violin-boxplots. Link: https://psyteachr.github.io/introdataviz/ 11.5 Data Visualization - A practical introduction by Kieran Healy This book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot. Link: https://socviz.co/ 11.6 Data Visualization in R by Brooke Anderson Workshop for the 2019 Navy and Marine Corps Public Health Conference. I have based this workshop on examples for you to try yourself, because you won't be able to learn how to program unless you try it out. I've picked example data that I hope will be interesting to Navy and Marine Corps public health researchers and practitioners. Link: https://geanders.github.io/navy_public_health/index.html#prerequisites 11.7 Data Visualization with R by Rob Kabacoff This book helps you create the most popular visualizations - from quick and dirty plots to publication-ready graphs. The text relies heavily on the ggplot2 package for graphics, but other approaches are covered as well. 
Link: https://rkabacoff.github.io/datavis/ 11.8 Fundamentals of Data Visualization by Claus Wilke The book is meant as a guide to making visualizations that accurately reflect the data, tell a story, and look professional. Link: https://clauswilke.com/dataviz/ 11.9 ggplot2 in 2 by Lucy D'Agostino McGowan Really good overview of ggplot2. The premise is that you'll cover the fundamentals in 2 hours. Oscar Baruffa made a sped-up screencast while working through it. It did take 2 hours :). Paid: Pay what you want, minimum $4.99 $5 Link: https://leanpub.com/ggplot2in2 11.10 ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham ggplot2 is an R package for producing statistical, or data, graphics. Unlike most other graphics packages, ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. This makes ggplot2 powerful. Rather than being limited to sets of pre-defined graphics, you can create novel graphics that are tailored to your specific problem. Link: https://ggplot2-book.org/ 11.11 Graphical Data Analysis with R by Antony Unwin The main aim of the book is to show, using real datasets, what information graphical displays can reveal in data. The target readership includes anyone carrying out data analyses who wants to understand their data using graphics. The book is published by CRC Press and available to purchase, but all the examples and code are freely available on a comprehensive website accompanying the text at http://www.gradaanwr.net/ Link: http://www.gradaanwr.net/ 11.12 Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code by Jack Dougherty, Ilya Ilyankou (Oscar: looks like an amazing resource and includes code templates!) 
In this book, you'll learn how to create true and meaningful data visualizations through chapters that blend design principles and step-by-step tutorials, in order to make your information-based analysis and arguments more insightful and compelling. Just as sentences become more persuasive with supporting evidence and source notes, your data-driven writing becomes more powerful when paired with appropriate tables, charts, or maps. Words tell us stories, but visualizations show us data stories by transforming quantitative, relational, or spatial patterns into images. When visualizations are well-designed, they draw our attention to what is most important in the data in ways that would be difficult to communicate through text alone. Link: https://handsondataviz.org/ 11.13 JavaScript for R by John Coene Learn how to build your own data visualisation packages, improve shiny with JavaScript, and use JavaScript for computations. Link: https://javascript-for-r.com 11.14 plotly Interactive web-based data visualization with R, plotly, and shiny by Carson Sievert In this book, you'll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R. It makes heavy use of plotly for rendering graphics, but you'll also learn about other R packages that augment a data science workflow, such as the tidyverse and shiny. Along the way, you'll gain insight into best practices for visualization of high-dimensional data, statistical graphics, and graphical perception. Link: https://plotly-r.com/ 11.15 R Graphics Cookbook, 2nd edition by Winston Chang The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data. 
Link: https://r-graphics.org/ 11.16 Solutions to ggplot2: Elegant Graphics for Data Analysis by Howard Baek This is the website for Solutions to ggplot2: Elegant Graphics for Data Analysis, a solution manual to the exercises in the 3rd edition of ggplot2: Elegant Graphics for Data Analysis, written by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen. While there are bookdown solution manuals to Hadley Wickham's Advanced R and Mastering Shiny, there is no such thing for the ggplot2 book. This website is an attempt to fill this void. Link: https://ggplot2-book-solutions-3ed.netlify.app/index.html "],["field-specific.html", "12 Field specific 12.1 An introduction to quantitative analysis of political data in R 12.2 Analyzing Financial and Economic Data with R 12.3 Computer-age Calculus with R 12.4 Crime by the Numbers: A Criminologist's Guide to R 12.5 Cryptocurrency Research: Open Source R Tutorial 12.6 Data Science in Education Using R 12.7 Data Skills for Reproducible Science 12.8 Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data 12.9 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python 12.10 Handbook of Regression Modeling in People Analytics 12.11 How to be a modern scientist 12.12 Introduction to Econometrics with R 12.13 Learning Microeconometrics with R 12.14 Machine Learning for Factor Investing 12.15 Public Policy Analytics: Code & Context for Data Science in Government 12.16 R for Excel users 12.17 R for SEO 12.18 R for Water Resources Data Science 12.19 R Programming with Minecraft 12.20 Technical Foundations of Informatics", " 12 Field specific 12.1 An introduction to quantitative analysis of political data in R by Erik Gahner Larsen, Zoltán Fazekas In this book, we aim to provide an easily accessible introduction to R for the collection, study and presentation of different types of political data. 
Specifically, the book will teach you how to get different types of political data into R and manipulate, analyze and visualize the output. In doing this, we will not only teach you how to get existing data into R, but also how to collect your own data. Link: http://qpolr.com/ 12.2 Analyzing Financial and Economic Data with R by Marcelo S. Perlin Not surprisingly, in fields with abundant access to data and practical applications, such as economics and finance, a graduate student or data analyst is expected to have learned at least one programming language that allows them to do their work efficiently. Learning how to program is becoming a requisite for the job market. Link: https://www.msperlin.com/afedR/ 12.3 Computer-age Calculus with R by Daniel Kaplan R is closely associated with statistics, but not with calculus. It turns out that R is an excellent language for doing calculus. This book shows how to do common calculus calculations using R. Link: https://dtkaplan.github.io/RforCalculus/ 12.4 Crime by the Numbers: A Criminologist's Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com 12.5 Cryptocurrency Research: Open Source R Tutorial by Riccardo (Ricky) Esclapon, John Chandler Johnson, Kai R. Larsen The tutorial is in R. For those without experience programming in R we have a high-level version to help you learn before attempting the full version. Scroll down for a breakdown of the individual sections for an overview of what you will learn throughout. 
You will get more familiar with tools from the tidyverse, including dplyr, ggplot2, tibble and purrr. These tools provide an excellent complete ecosystem to do data science in R. You will learn to create machine learning models and how to fairly assess their performance. Cryptocurrency Data: You will learn these tools analyzing the latest cryptocurrency data. The tutorial automatically refreshes every 12 hours and the data is publicly available and refreshed hourly. Link: https://cryptocurrencyresearch.org/ 12.6 Data Science in Education Using R by Ryan A. Estrellado, Emily A. Bovee, Jesse Mostipak, Isabella C. Velásquez Dear Data Scientists, Educators, and Data Scientists who are Educators: This book is a warm welcome and an invitation. If you're a data scientist in education or an educator in data science, your role isn't exactly straightforward. This book is our contribution to a growing movement to merge the paths of data analysis and education. We wrote this book to make your first step on that path a little clearer and a little less scary. Link: https://datascienceineducation.com/ 12.7 Data Skills for Reproducible Science by PsyTeachR team, University of Glasgow This course provides an overview of skills needed for reproducible research and open science using the statistical programming language R. Students will learn about data visualisation, data tidying and wrangling, archiving, iteration and functions, probability and data simulations, general linear models, and reproducible workflows. Learning is reinforced through weekly assignments that involve working with different types of data. Link: https://psyteachr.github.io/msc-data-skills/ 12.8 Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data by Michael Friendly, David Meyer Presents an applied treatment of modern methods for the analysis of categorical data, both discrete response data and frequency data. 
It explains how to use graphical methods for exploring data, spotting unusual features, visualizing fitted models, and presenting results. Paid: $80 Link: http://ddar.datavis.ca/ 12.9 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python by Keith McNulty The technology of graphs is all around us, and enables so many of the ways in which we live our lives today. That same technology is also available to us at no cost as an analytic tool to allow us to better understand network structures and dynamics in the fields of science, technology, economics, sociology and psychology to name just a few. It is available to academics and practitioners alike, and can be used on problems ranging from a very small network analysis which takes a few minutes on a laptop, to massive scale network mining requiring days or weeks of processing time. But here's the problem: few people really know how to do network analysis. It is still considered by many as a deep specialism or even a dark art. It shouldn't be. This book aims to make the field of graph and network analysis more approachable to students and professionals by explaining the most important elements of theory and sharing common methodologies using open source programming languages like R and Python. It does so by explaining theory in as much detail as is necessary to support analytical curiosity and interpretation, and by using a wide array of example data sets and code snippets to demonstrate the specific implementation and interpretation of methodologies. Link: https://ona-book.org/ 12.10 Handbook of Regression Modeling in People Analytics by Keith McNulty It is the author's firm belief that all people analytics professionals should have a strong understanding of regression models and how to implement and interpret them in practice, and the aim with this book is to provide those who need it with help in getting there. 
For accompanying solutions to some of the questions: https://keithmcnulty.github.io/peopleanalytics-regression-book/solutions/ Link: http://peopleanalytics-regression-book.org/index.html 12.11 How to be a modern scientist by Jeffrey Leek A book about how to be a scientist the modern, open-source way. The face of academia is changing. It is no longer sufficient to just publish or perish. We are now in an era where Twitter, Github, Figshare, and Alt Metrics are regular parts of the scientific workflow. Here I give high level advice about which tools to use, how to use them, and what to look out for. This book is appropriate for scientists at all levels who want to stay on top of the current technological developments affecting modern scientific careers. Paid: Free or pay what you want $10 Link: https://leanpub.com/modernscientist 12.12 Introduction to Econometrics with R by Christoph Hanck, Martin Arnold, Alexander Gerber, Martin Schmelzer Instead of confronting students with pure coding exercises and complementary classic literature like the book by Venables & Smith (2010), we figured it would be better to provide interactive learning material that blends R code with the contents of the well-received textbook Introduction to Econometrics by Stock & Watson (2015) which serves as a basis for the lecture. Link: https://www.econometrics-with-r.org/ 12.13 Learning Microeconometrics with R by Christopher P. Adams This book provides an introduction to the field of microeconometrics through the use of R. The focus is on applying current learning from the field to real world problems. It uses R to both teach the concepts of the field and show the reader how the techniques can be used. It is aimed at the general reader with the equivalent of a bachelor's degree in economics, statistics or some more technical field. It covers the standard tools of microeconometrics, OLS, instrumental variables, Heckman selection and difference in difference. 
In addition, it introduces bounds, factor models, mixture models and empirical Bayesian analysis. Paid: $100 Link: https://www.routledge.com/Learning-Microeconometrics-with-R/Adams/p/book/9780367255381 12.14 Machine Learning for Factor Investing by Guillaume Coqueret, Tony Guida This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics. Link: http://www.mlfactor.com/ 12.15 Public Policy Analytics: Code & Context for Data Science in Government by Ken Steif, Ph.D The goal of this book is to make data science accessible to social scientists and City Planners, in particular. I hope to convince readers that one with strong domain expertise plus intermediate data skills can have a greater impact in government than the sharpest computer scientist who has never studied economics, sociology, public health, political science, criminology etc. Link: https://urbanspatial.github.io/PublicPolicyAnalytics/ 12.16 R for Excel users by Julie Lowndes, Allison Horst This course is for Excel users who want to add or integrate R and RStudio into their existing data analysis toolkit. It is a friendly intro to becoming a modern R user, full of tidyverse, RMarkdown, GitHub, collaboration & reproducibility. Link: https://rstudio-conf-2020.github.io/r-for-excel/ 12.17 R for SEO by François Joly Even though R is a terrific option for SEO, there are simply not enough resources out there. This guide is not here to deliver a course about R, there are plenty already. This guide is meant to be as practical as possible. How things should be done in an R-ish way is not the purpose of this guide. Grab what you want to grab and feel free to submit your own solution. 
Link: https://www.rforseo.com/ 12.18 R for Water Resources Data Science by Ryan Peek, Rich Pauloo Consists of 2 courses. Introductory: This course is most relevant and targeted at folks who work with data, from analysts and program staff to engineers and scientists. This course provides an introduction to the power and possibility of a reproducible programming language (R) by demonstrating how to import, explore, visualize, analyze, and communicate different types of data. Using water resources based examples, this course guides participants through basic data science skills and strategies for continued learning and use of R. Intermediate: In this course, we will move more quickly, assume familiarity with basic R skills, and also assume that the participant has working experience with more complex workflows, operations, and code-bases. Each module in this course functions as a stand-alone lesson, and can be read linearly, or out of order according to your needs and interests. Each module doesn't necessarily require familiarity with the previous module. This course emphasizes intermediate scripting skills like iteration, functional programming, writing functions, and controlling project workflows for better reproducibility and efficiency. It also covers approaches to working with more complex data structures like lists and timeseries data, the fundamentals of building Shiny Apps, pulling water resources data from APIs, intermediate mapmaking and spatial data processing, and integrating version control in projects with git. Link: https://www.r4wrds.com/ 12.19 R Programming with Minecraft by Brooke Anderson, Karl Broman, Gergely Daróczi, Mario Inchiosa, David Smith, Ali Zaidi Minecraft is awesome fun, especially in creative mode, where you can build all sorts of crazy stuff. But ambitious building projects can be really tedious to create by hand. With the miner R package, you can write R code to manipulate your Minecraft world and create even more awesome stuff. 
Here's an introductory Rstats NYC conference talk on it: https://www.youtube.com/watch?v=r_JgPF8MJpY Link: https://kbroman.org/miner_book/?s=09 12.20 Technical Foundations of Informatics by Michael Freeman, Joel Ross This book covers the foundation skills necessary to start writing computer programs to work with data using modern and reproducible techniques. It requires no technical background. These materials were developed for the INFO 201: Technical Foundations of Informatics course taught at the University of Washington Information School; however they have been structured to be an online resource for anyone hoping to learn to work with information using programmatic approaches. Link: https://info201.github.io/ "],["geospatial.html", "13 Geospatial 13.1 Geocomputation with R 13.2 Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny 13.3 Introduction to Spatial Data Programming with R 13.4 Predictive Soil Mapping with R 13.5 Spatial Data Science 13.6 Spatial Microsimulation with R 13.7 Spatial Modelling for Data Scientists 13.8 Using R for Digital Soil Mapping", " 13 Geospatial 13.1 Geocomputation with R by Robin Lovelace, Jakub Nowosad, Jannes Muenchow This is the online home of Geocomputation with R, a book on geographic data analysis, visualization and modeling. Link: https://geocompr.robinlovelace.net/ 13.2 Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny by Paula Moraga This book describes spatial and spatio-temporal statistical methods and visualization techniques to analyze georeferenced health data in R. After a detailed introduction of geospatial data, the book shows how to develop Bayesian hierarchical models for disease mapping and apply computational approaches such as the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) to analyze areal and geostatistical data. 
Link: https://www.paulamoraga.com/book-geospatial/ 13.3 Introduction to Spatial Data Programming with R by Michael Dorman This book introduces processing and analysis methods for working with spatial data in R. The book is composed of two parts. The first part gives an overview of the basic syntax and usage of the R language, required before we can start working with spatial data. The second part then covers spatial data workflows, including how to process rasters, vector layers, and both of them together, as well as two selected advanced topics: spatio-temporal data and spatial interpolation. Link: https://geobgu.xyz/r 13.4 Predictive Soil Mapping with R by Tom Hengl, Robert A. MacMillan Predictive Soil Mapping (PSM) with R explains how to import, process and analyze soil data in R using the state-of-the-art soil and Machine Learning packages with the ultimate objective of producing the most objective spatial predictions of soil numeric and factor-type variables. Special focus has been put on using R in combination with Open Source GIS such as GDAL, SAGA GIS and similar, and on using Machine Learning packages ranger, xgboost, SuperLearner and similar. This book is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contributions of new chapters are welcome. Link: https://soilmapper.org 13.5 Spatial Data Science by Edzer Pebesma, Roger Bivand This book introduces and explains the concepts underlying spatial data: points, lines, polygons, rasters, coverages, geometry attributes, data cubes, reference systems, as well as higher-level concepts including how attributes relate to geometries and how this affects analysis. Link: https://keen-swartz-3146c4.netlify.app/ 13.6 Spatial Microsimulation with R by Robin Lovelace, Morgane Dumont Imagine a world in which data on companies, households and governments were widely available. 
Imagine, further, that researchers and decision-makers acting in the public interest had tools enabling them to test and model such data to explore different scenarios of the future. People would be able to make more informed decisions, based on the best available evidence. In this technocratic dreamland pressing problems such as climate change, inequality and poor human health could be solved. These are the types of real-world issues that we hope the methods in this book will help to address. Spatial microsimulation can provide new insights into complex problems and, ultimately, lead to better decision-making. By shedding new light on existing information, the methods can help shift decision-making processes away from ideological bias and towards evidence-based policy. Link: https://spatial-microsim-book.robinlovelace.net/index.html 13.7 Spatial Modelling for Data Scientists by Francisco Rowe, Dani Arribas-Bel This is the website for Spatial Modeling for Data Scientists. This is a course taught by Dr. Francisco Rowe and Dr. Dani Arribas-Bel in the Second Semester of 2020/21 at the University of Liverpool, United Kingdom. You will learn how to analyse and model different types of spatial data as well as gaining an understanding of the various challenges arising from manipulating such data. 
Link: https://gdsl-ul.github.io/san/ 13.8 Using R for Digital Soil Mapping by Malone, Brendan P., Minasny, Budiman, McBratney, Alex B Describes in detail, with ample exercises, how digital soil mapping is done. This work includes a number of work-flows that direct users how to create digital soil maps for their own projects. This work includes tutorials for users to learn the fundamentals of R, but with a focus on how to use it for digital soil mapping. Paid: $90 Link: https://www.springer.com/gp/book/9783319443256 "],["getting-cleaning-and-wrangling-data.html", "14 Getting, cleaning and wrangling data 14.1 21 Recipes for Mining Twitter Data with rtweet 14.2 A Beginner's Guide to Clean Data 14.3 Spreadsheet Munging Strategies 14.4 Text Mining with R 14.5 Text Mining With Tidy Data Principles", " 14 Getting, cleaning and wrangling data 14.1 21 Recipes for Mining Twitter Data with rtweet by Bob Rudis The recipes contained in this book use the rtweet package by Michael W. Kearney. Link: https://rud.is/books/21-recipes/ 14.2 A Beginner's Guide to Clean Data by Benjamin Greve This book will help you to become a better data scientist by showing you the things that can go wrong when working with data - particularly low-quality data. A key difference between a junior and a senior data scientist is the awareness of potential pitfalls. The experienced data scientist will expect them, navigate around them and avoid costly iteration cycles. After reading this book, you will be able to spot data quality problems and deal with them before they can break your work, saving yourself a lot of time. Link: https://b-greve.gitbook.io/beginners-guide-to-clean-data/ 14.3 Spreadsheet Munging Strategies by Duncan Garmonsway This is a work-in-progress book about getting data out of spreadsheets, no matter how peculiar. The book is designed primarily for R users who have to extract data from spreadsheets and who are already familiar with the tidyverse. 
It has a cookbook structure, and can be used as a reference, but readers who begin in the middle might have to work backwards from time to time. Link: https://nacnudus.github.io/spreadsheet-munging-strategies/ 14.4 Text Mining with R by Julia Silge, David Robinson This book serves as an introduction to text mining using the tidytext package and other tidy tools in R. The functions provided by the tidytext package are relatively simple; what is important are the possible applications. Thus, this book provides compelling examples of real text mining problems. Link: https://www.tidytextmining.com/ 14.5 Text Mining With Tidy Data Principles by Julia Silge Text data sets are diverse and ubiquitous, and tidy data principles provide an approach to make text mining easier, more effective, and consistent with tools already in wide use. In this tutorial, you will develop your text mining skills using the tidytext package in R, along with other tidyverse tools. Link: https://juliasilge.shinyapps.io/learntidytext/ "],["journalism.html", "15 Journalism 15.1 Practical R for Mass Communication and Journalism 15.2 Using R for Data Journalism", " 15 Journalism 15.1 Practical R for Mass Communication and Journalism by Sharon Machlis Welcome to this excerpt from Practical R for Mass Communication and Journalism. In these sample chapters, you'll: learn how to find your way around R and RStudio, see how much you can do in just a few lines of code, start doing some basic data exploration, and get some ideas and sample code for using R in analyzing election results. I hope you find this excerpt useful! If you do and would like to read more, you can order the complete book from CRC Press or Amazon. Paid: Free samples $55 Link: http://www.machlis.com/R4Journalists/index.html 15.2 Using R for Data Journalism by Andrew Ba Tran This site will help you learn how to use the statistical computing and graphics language R to enhance your data analysis and reporting process. 
It was originally part of a free MOOC offered by the Knight Center at the University of Texas. Link: https://learn.r-journalism.com/en/ "],["life-sciences.html", "16 Life Sciences 16.1 An Open Compendium of Soil Datasets 16.2 Assigning cell types with SingleR 16.3 Computational Genomics with R 16.4 Data Analysis and Visualization in R for Ecologists 16.5 Data Analysis for the Life Sciences 16.6 Data Science for the Biomedical Sciences 16.7 Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility 16.8 Git and Github for Advanced Ecological Data Analysis 16.9 Hydroinformatics at VT 16.10 Introduction to Data Analysis with R 16.11 Modern Statistics for Modern Biology 16.12 Numerical Ecology with R 16.13 Orchestrating Single-Cell Analysis with Bioconductor 16.14 R for applied epidemiology and public health 16.15 R for Conservation and Development Projects: A Primer for Practitioners 16.16 R for Health Data Science 16.17 Reproducible Medical Research with R 16.18 Statistics in R for Biodiversity Conservation Paperback 16.19 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling 16.20 WEHI Intro to Tidy R Course", " 16 Life Sciences 16.1 An Open Compendium of Soil Datasets by Tomislav Hengl (Not R specific but looks really relevant) This is a public compendium of global, regional, national and sub-national soil samples and/or soil profile datasets (points with Observations and Measurements of soil properties and characteristics). Datasets listed here, assuming compatible open license, are afterwards imported into the Global compilation of soil chemical and physical properties and soil classes and eventually used to create better open soil information across countries. 
The specific objectives of this initiative are: To enable data digitization, import and binding + harmonization, To accelerate research collaboration and networking, To enable development of more accurate / more usable global and regional soil property and class maps (typically published via https://OpenLandMap.org), Link: https://opengeohub.github.io/SoilSamples/ 16.2 Assigning cell types with SingleR by Aaron Lun and contributors This book covers the use of SingleR, one implementation of an automated annotation method for cell type annotation. Link: https://bioconductor.org/books/3.12/SingleRBook/ 16.3 Computational Genomics with R by Altuna Akalin The aim of this book is to provide the fundamentals for data analysis for genomics. We developed this book based on the computational genomics courses we are giving every year. Link: http://compgenomr.github.io/book/ 16.4 Data Analysis and Visualization in R for Ecologists by François Michonneau, Auriel Fournier Data Carpentrys aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about R syntax, the RStudio interface, and move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R. This lesson assumes no prior knowledge of R or RStudio and no programming experience. 
Link: https://datacarpentry.org/R-ecology-lesson/ 16.5 Data Analysis for the Life Sciences by Rafael A Irizarry, Michael I Love Data analysis is now part of practically every research project in the life sciences. In this book we use data and computer code to teach the necessary statistical concepts and programming skills to become a data analyst. Instead of showing theory first and then applying it to toy examples, we start with actual applications. http://genomicsclass.github.io/book/ Paid: Free or pay what you want $40 Link: https://leanpub.com/dataanalysisforthelifesciences 16.6 Data Science for the Biomedical Sciences by Daniel Chen, Anne Brown We hope this book provides a gentle introduction to data science. The main goal is to understand how to work with spreadsheet data and how data can be manipulated for multiple purposes. If nothing else, the book hopes to help you plan how to structure your own datasets for your own analysis. Even if you never go on to program on your own, understanding the way data can be manipulated and having a plan for your own dataset in the processing pipeline will go a long way when learning and doing the analysis on your own, and/or working with colleagues and collaborators on a project. Link: https://ds4biomed.tech/ 16.7 Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility by Stanley E. Lazic This practical guide shows biologists how to design reproducible experiments that have low bias, high precision, and results that are widely applicable. With specific examples using both cell cultures and model organisms, it shows how to plan a successful experiment. It demonstrates how to control biological and technical factors that can introduce bias or add noise, and covers rarely discussed topics such as graphical data exploration, choosing outcome variables, data quality control checks, and data pre-processing. 
It also shows how to use R for analysis, and is designed for those with no prior experience. This is an ideal guide for anyone conducting lab-based biological research. Paid: $52 Link: https://stanlazic.github.io/EDLB.html 16.8 Git and Github for Advanced Ecological Data Analysis by Alexa Fredston This material was prepared for a three-hour virtual session to teach Git and Github to a graduate-level course on Advanced Ecological Data Analysis taught at Rutgers University by Malin Pinsky and Rachael Winfree. (However, the only course-specific material is Section 4; the rest should be applicable to any reader.) Link: https://afredston.github.io/learn-git/learn-git.htm 16.9 Hydroinformatics at VT by JP Gannon This bookdown contains the notes and most exercises for a course on data analysis techniques in hydrology using the programming language R. The material will be updated each time the course is taught. If new topics are added, the topics they replace will be left, in case they are useful to others. Link: https://vt-hydroinformatics.github.io/ 16.10 Introduction to Data Analysis with R by Jannik Buhr This is a video lecture series with accompanying lecture script that is designed to read much like a book. The lecture is held in English for biochemists at Heidelberg University, Germany, but the examples covered are not specific to life sciences in order to enable a focus on learning the techniques with R. Link: https://jmbuhr.de/dataIntro20 16.11 Modern Statistics for Modern Biology by Susan Holmes, Wolfgang Huber The aim of this book is to enable scientists working in biological research to quickly learn many of the important ideas and methods that they need to make the best of their experiments and of other available data. 
Link: https://www.huber.embl.de/msmb/ 16.12 Numerical Ecology with R by Daniel Borcard, François Gillet, Pierre Legendre This new edition of Numerical Ecology with R guides readers through an applied exploration of the major methods of multivariate data analysis, as seen through the eyes of three ecologists. It provides a bridge between a textbook of numerical ecology and the implementation of this discipline in the R language. The book begins by examining some exploratory approaches. Paid: $60 Link: https://www.springer.com/us/book/9783319714035 16.13 Orchestrating Single-Cell Analysis with Bioconductor by Aaron Lun, Robert Amezquita, Stephanie Hicks, Raphael Gottardo This is the website for Orchestrating Single-Cell Analysis with Bioconductor, a book that teaches users some common workflows for the analysis of single-cell RNA-seq data (scRNA-seq). Link: https://osca.bioconductor.org/ 16.14 R for applied epidemiology and public health by EpiR authors This handbook is produced by a collaboration of epidemiologists from around the world drawing upon experience with organizations including local, state, provincial, and national health agencies, the World Health Organization (WHO), Médecins Sans Frontières / Doctors without Borders (MSF), hospital systems, and academic institutions. Written by epidemiologists, for epidemiologists. Link: https://epirhandbook.com/ 16.15 R for Conservation and Development Projects: A Primer for Practitioners by Nathan Whitmore This book is aimed at conservation and development practitioners who need to learn and use R in a part-time professional context. It gives people with a non-technical background a set of skills to graph, map, and model in R. It also provides background on data integration in project management and covers fundamental statistical concepts. The book aims to demystify R and give practitioners the confidence to use it. 
Key Features: Viewing data science as part of a greater knowledge and decision making system Foundation sections on inference, evidence, and data integration Plain English explanations of R functions Relatable examples which are typical of activities undertaken by conservation and development organisations in the developing world Worked examples showing how data analysis can be incorporated into project reports Paid: $60 Link: https://www.routledge.com/R-for-Conservation-and-Development-Projects-A-Primer-for-Practitioners/Whitmore/p/book/9780367205485 16.16 R for Health Data Science by Ewan Harrison, Riinu Pius In this age of information, the manipulation, analysis and interpretation of data have become a fundamental part of professional life. Nowhere more so than in the delivery of healthcare. From the understanding of disease and the development of new treatments, to the diagnosis and management of individual patients, the use of data and technology are now an integral part of the business of healthcare. Those working in healthcare interact daily with data, often without realising it. The conversion of this avalanche of information to useful knowledge is essential for high-quality patient care. An important part of this information revolution is the opportunity for everybody to become involved in data analysis. This democratisation is driven in part by the open source software movement no longer do we require expensive specialised software to do this. The statistical programming language, R, is firmly at the heart of this. This book will take an individual with little or no experience in data science all the way through to the execution of sophisticated analyses. We emphasise the importance of truly understanding the underlying data with liberal use of plotting, rather than relying on opaque and possibly poorly understood statistical tests. 
There are numerous examples included that can be adapted for your own data, together with our own R packages with easy-to-use functions. Link: https://argoshare.is.ed.ac.uk/healthyr_book/ 16.17 Reproducible Medical Research with R by Peter D.R. Higgins, MD, PhD, MSc This is a book for anyone in the medical field interested in analyzing the data available to them to better understand health, disease, or the delivery of care. This could include nurses, dieticians, psychologists, and PhDs in related fields, as well as medical students, residents, fellows, or doctors in practice. I expect that most learners will be using this book in their spare time at night and on weekends, as the health training curricula are already packed full of information, and there is no room to add skills in reproducible research to the standard curriculum. This book is designed for self-teaching, and many hints and solutions will be provided to avoid roadblocks and frustration. Many learners find themselves wanting to develop reproducible research skills after they have finished their training, and after they have become comfortable with their clinical role. This is the time when they identify and want to address problems faced by patients in their practice with the data they have before them. This book is for you. Link: https://bookdown.org/pdr_higgins/rmrwr/ 16.18 Statistics in R for Biodiversity Conservation Paperback by Carl Smith, Antonio Uzal, Mark Warren A practical handbook to introduce data analysis and model fitting using R to ecologists and conservation biologists. The book is aimed at undergraduate and post-graduate students and provides access to datasets and RScript. Paid: $10 Link: https://www.amazon.co.uk/dp/B08HBLYHQL/ref=cm_sw_r_cp_apa_i_g0luFb86PXJ9Z 16.19 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling by Andrew B. Lawson Progressively more and more attention has been paid to how location affects health outcomes. 
The area of disease mapping focusses on these problems, and the Bayesian paradigm has a major role to play in the understanding of the complex interplay of context and individual predisposition in such studies of disease. Using R for Bayesian Spatial and Spatio-Temporal Health Modeling provides a major resource for those interested in applying Bayesian methodology in small area health data studies. Paid: $100 Link: https://www.routledge.com/Using-R-for-Bayesian-Spatial-and-Spatio-Temporal-Health-Modeling/Lawson/p/book/9780367490126 16.20 WEHI Intro to Tidy R Course by Brendan Ansell A complete beginners introduction to tidy R for data transformation, visualization and analysis automation with applications in experimental biology. This book is based on a short course developed for biomedical scientists at the WEHI Medical Research Institute. The content is designed to make learners comfortable with using R for exploratory analysis of large data sets, but does not cover statistics. The material and teaching examples draw on popular (non-biological) data sets, as well as gene expression and drug screening data types. Link: https://bookdown.org/ansellbr/WEHI_tidyR_course_book/ "],["machine-learning.html", "17 Machine Learning 17.1 A Minimal rTorch Book 17.2 Explanatory Model Analysis 17.3 Feature Engineering and Selection: A Practical Approach for Predictive Models 17.4 Hands-On Machine Learning with R 17.5 Interpretable Machine Learning 17.6 Lightweight Machine Learning Classics with R Marek Gagolewski 17.7 Machine Learning for Factor Investing 17.8 Mathematics and Programming for Machine Learning with R: From the Ground Up 1st Edition, Kindle 17.9 mlr3 book 17.10 Supervised Machine Learning for Text Analysis in R 17.11 The caret Package 17.12 Tidy Modeling with R", " 17 Machine Learning 17.1 A Minimal rTorch Book by Alfonso R. Reyes Practically, you can do everything you could with PyTorch within the R ecosystem. 
Link: https://f0nzie.github.io/rtorch-minimal-book/ 17.2 Explanatory Model Analysis by Przemyslaw Biecek, Tomasz Burzykowski Responsible, Fair and Explainable Predictive Modeling with examples in R and Python Link: https://pbiecek.github.io/ema/ 17.3 Feature Engineering and Selection: A Practical Approach for Predictive Models by Max Kuhn, Kjell Johnson The goals of Feature Engineering and Selection are to provide tools for re-representing predictors, to place these tools in the context of a good predictive modeling framework, and to convey our experience of utilizing these tools in practice. Link: http://www.feat.engineering/index.html 17.4 Hands-On Machine Learning with R by Bradley Boehmke, Brandon Greenwell This book provides hands-on modules for many of the most common machine learning methods to include: Generalized low rank models, Clustering algorithms, Autoencoders, Regularized models, Random forests, Gradient boosting machines, Deep neural networks, Stacking / super learners and more! Link: https://bradleyboehmke.github.io/HOML/ 17.5 Interpretable Machine Learning by Christoph Molnar A Guide for Making Black Box Models Explainable Online book Paid: Free or pay what you want $42 Link: https://leanpub.com/interpretable-machine-learning 17.6 Lightweight Machine Learning Classics with R Marek Gagolewski In this book we will take an unpretentious glance at the most fundamental algorithms that have stood the test of time and which form the basis for state-of-the-art solutions of modern AI, which is principally (big) data-driven. Link: https://lmlcr.gagolewski.com/ 17.7 Machine Learning for Factor Investing by Guillaume Coqueret, Tony Guida This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics. Link: http://www.mlfactor.com/ 17.8 Mathematics and Programming for Machine Learning with R: From the Ground Up 1st Edition, Kindle by William B. 
Claster Based on the authors experience in teaching data science for more than 10 years, Mathematics and Programming for Machine Learning with R: From the Ground Up reveals how machine learning algorithms do their magic and explains how these algorithms can be implemented in code. It is designed to provide readers with an understanding of the reasoning behind machine learning algorithms as well as how to program them. Written for novice programmers, the book progresses step-by-step, providing the coding skills needed to implement machine learning algorithms in R. Paid: $40 Link: https://www.amazon.com/Mathematics-Programming-Machine-Learning-Ground-ebook-dp-B08JHDCX9Y/dp/B08JHDCX9Y 17.9 mlr3 book by Michel Lang The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R. Link: https://mlr3book.mlr-org.com/ 17.10 Supervised Machine Learning for Text Analysis in R by Emil Hvitfeldt, Julia Silge Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us. We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice. Link: https://smltar.com/ 17.11 The caret Package by Max Kuhn The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. 
Link: https://topepo.github.io/caret/index.html 17.12 Tidy Modeling with R by Max Kuhn, Julia Silge This book provides an introduction to how to use the tidymodels suite of packages to create models using a tidyverse approach and encourages good methodology and statistical practice throughout demonstrated using series of applied examples. Link: https://www.tmwr.org/ "],["network-analysis.html", "18 Network analysis 18.1 Awesome network analysis 18.2 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python 18.3 Network Analysis in R Cookbook 18.4 Statistical Analysis of Network Data with R", " 18 Network analysis 18.1 Awesome network analysis Not a book, but a compendium of resources that look really valuable. Link: https://github.com/briatte/awesome-network-analysis 18.2 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python by Keith McNulty The technology of graphs is all around us, and enables so many of the ways in which we live our lives today. That same technology is also available to us at no cost as an analytic tool to allow us to better understand network structures and dynamics in the fields of science, technology, economics, sociology and psychology to name just a few. It is available to academics and practitioners alike, and can be used on problems ranging from a very small network analysis which takes a few minutes on a laptop, to massive scale network mining requiring days or weeks of processing time. But heres the problem: few people really know how to do network analysis. It is still considered by many as a deep specialism or even a dark art. It shouldnt be. This book aims to make the field of graph and network analysis more approachable to students and professionals by explaining the most important elements of theory and sharing common methodologies using open source programming languages like R and Python. 
It does so by explaining theory in as much detail as is necessary to support analytical curiosity and interpretation, and by using a wide array of example data sets and code snippets to demonstrate the specific implementation and interpretation of methodologies. Link: https://ona-book.org/ 18.3 Network Analysis in R Cookbook by Sacha Epskamp [Oscar Baruffa: Note this resource is a bit out of date, but because there are so few available on this topic, and it might still be good as a reference, itll stay in Big Book of R for now.] Link: https://web.archive.org/web/20210414173702/http://sachaepskamp.com/files/Cookbook.html 18.4 Statistical Analysis of Network Data with R by Kolaczyk, Eric D., Csárdi, Gábor This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. Paid: $65 Link: https://www.springer.com/us/book/9781493909834#otherversion=9781493909827 "],["packages.html", "19 Packages 19.1 A Minimal Book Example 19.2 A Minimal rTorch Book 19.3 ComplexHeatmap Complete Reference 19.4 Create, Publish, and Analyze Personal Websites Using R and RStudio 19.5 data.table in R The Complete Beginners Guide 19.6 ggplot2: Elegant Graphics for Data Analysis 19.7 GT Cookbook 19.8 Highcharter Cookbook 19.9 knitr 19.10 mlr3 book 19.11 The caret Package 19.12 The Data Validation Cookbook 19.13 The lidR package 19.14 The targets R Package User Manual 19.15 The Tidyverse Cookbook", " 19 Packages 19.1 A Minimal Book Example This is a sample book written in Markdown. Link: https://benmarwick.github.io/bookdown-ort/ 19.2 A Minimal rTorch Book by Alfonso R. Reyes Practically, you can do everything you could with PyTorch within the R ecosystem. 
Link: https://f0nzie.github.io/rtorch-minimal-book/ 19.3 ComplexHeatmap Complete Reference by Zuguang Gu The ComplexHeatmap package is used to generate heatmap visualizations. It is a highly flexible tool to arrange multiple heatmaps and supports various annotation graphics for high-dimensional data. These heatmaps are an efficient way to visualize associations between different sources of data sets and reveal potential patterns. This book contains the full documentation for using the ComplexHeatmap package effectively, with plenty of small and complex examples to help you create your own complex heatmap data visualization. Link: https://jokergoo.github.io/ComplexHeatmap-reference/book/ 19.4 Create, Publish, and Analyze Personal Websites Using R and RStudio by Danny Morris A free, digital handbook with step-by-step instructions for launching your own personal website using R, RStudio, and other freely available technologies including GitHub, Hugo, Netlify, and Google Analytics. Link: https://r4sites-book.netlify.app/ 19.5 data.table in R The Complete Beginners Guide by Selva Prabhakaran data.table is a package used for working with tabular data in R. It provides the efficient data.table object which is a much improved version of the default data.frame. It is super fast and has intuitive and terse syntax. If you know the R language and haven't picked up the data.table package yet, then this tutorial guide is a great place to start. Link: https://www.machinelearningplus.com/data-manipulation/datatable-in-r-complete-guide/ 19.6 ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham ggplot2 is an R package for producing statistical, or data, graphics. Unlike most other graphics packages, ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. This makes ggplot2 powerful. 
Rather than being limited to sets of pre-defined graphics, you can create novel graphics that are tailored to your specific problem. Link: https://ggplot2-book.org/ 19.7 GT Cookbook by Thomas Mock This cookbook attempts to walk through many of the example usecases for gt, and provide useful commentary around the use of the various gt functions. The full gt documentation has other more succinct examples and full function arguments. For advanced use cases, make sure to check out the Advanced Cookbook Link: https://themockup.blog/static/gt-cookbook.html 19.8 Highcharter Cookbook by Tom Bishop Highcharter is an R implementation of the highcharts javascript library, enabled by Rs htmlwidgets package. Most of the highcharts functionality is implemented through highcharter however the documentation is a little light. This guide will provide examples on how to create and customise various graphs whilst providing some tips on how to think about the package that will help you build and debug your more ambitious charts. Link: https://www.tmbish.me/lab/highcharter-cookbook/ 19.9 knitr by Yihui Xie Dynamic documents with R and knitr! The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package. Link: https://yihui.org/knitr/ 19.10 mlr3 book by Michel Lang The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R. Link: https://mlr3book.mlr-org.com/ 19.11 The caret Package by Max Kuhn The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. 
Link: https://topepo.github.io/caret/index.html 19.12 The Data Validation Cookbook by Mark P.J. van der Loo The purposes of this book include demonstrating the main tools and workflows of the validate package, giving examples of common data validation tasks, and showing how to analyze data validation results. Link: https://data-cleaning.github.io/validate/ 19.13 The lidR package by Jean-Romain Roussel, Tristan R.H. Goodbody, Piotr Tompalski lidR is an R package for manipulating and visualizing airborne laser scanning (ALS) data with an emphasis on forestry applications. The package is entirely open source and is integrated within the geospatial R ecosystem (i.e. raster, sp, sf, rgdal etc.). This guide has been written to help both the ALS novice, as well as seasoned point cloud processing veterans. Link: https://jean-romain.github.io/lidRbook/ 19.14 The targets R Package User Manual by Will Landau The targets package is a Make-like pipeline toolkit for Statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data. Link: https://books.ropensci.org/targets/ 19.15 The Tidyverse Cookbook edited by Garrett Grolemund This book collects code recipes for doing data science with R's tidyverse. Each recipe solves a single common task, with a minimum of discussion. 
Link: https://rstudio-education.github.io/tidyverse-cookbook/ "],["r-package-development.html", "20 R package development 20.1 HTTP testing in R 20.2 R packages 20.3 rOpenSci Packages: Development, Maintenance, and Peer Review", " 20 R package development 20.1 HTTP testing in R by Scott Chamberlain, Maëlle Salmon This book is meant to be a free, central reference for developers of R packages accessing web resources, to help them have a faster and more robust development. Our aim is to develop useful guidance to go with the great recent tools that vcr, webmockr, httptest and presser are. Link: https://books.ropensci.org/http-testing/ 20.2 R packages by Hadley Wickham, Jenny Bryan Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this section you'll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first. So start with the basics and improve it over time. It doesn't matter if your first version isn't perfect as long as the next version is better. Link: https://r-pkgs.org/ 20.3 rOpenSci Packages: Development, Maintenance, and Peer Review by rOpenSci software review editorial team This book is a package development guide for authors, maintainers, reviewers and editors of rOpenSci. 
Link: https://devguide.ropensci.org/index.html "],["r-programming.html", "21 R programming 21.1 A sufficient Introduction to R 21.2 Advanced Object-Oriented Programming in R 21.3 Advanced R 21.4 Advanced R Solutions 21.5 An Introduction to Data Analysis 21.6 An Introduction to R 21.7 Another Book on Data Science : Learn R and Python in Parallel 21.8 Best Coding Practices for R 21.9 Book of R: A First Course in Programming and Statistics 21.10 Cookbook for R 21.11 Data Analytics with R: A Recipe book 21.12 Domain-Specific Languages in R 21.13 Efficient R programming 21.14 Field Guide to the R Ecosystem 21.15 Functional Data Structures in R 21.16 Functional Programming 21.17 Functional Programming in R 21.18 Hands-On Programming with R 21.19 Introduction to Programming with R 21.20 Introduction to R - R spatial 21.21 Mastering Software Development in R 21.22 Metaprogramming in R 21.23 Modern R with the tidyverse 21.24 R Cookbook - 2nd edition 21.25 R Development Guide 21.26 R for Excel users 21.27 R for Graduate Students 21.28 R language for programmers 21.29 Rcpp for everyone 21.30 stats545 Data wrangling, exploration, and analysis with R 21.31 The R Inferno 21.32 The R Language 21.33 The Tidyverse Cookbook 21.34 The tidyverse style guide 21.35 Tidy evaluation 21.36 Tidyverse design guide 21.37 Tidyverse Skills for Data Science 21.38 What They Forgot to Teach You About R 21.39 YaRrr! The Pirates Guide to R", " 21 R programming 21.1 A sufficient Introduction to R by Derek L. Sonderegger This book is intended to guide people that are completely new to programming along a path towards a useful skill level using R. I believe that while people can get by with just copying code chunks, that doesn't give them the background information to modify the code in non-trivial ways. Therefore we will spend more time on foundational details than a crash-course would. 
Link: https://dereksonderegger.github.io/570L/ 21.2 Advanced Object-Oriented Programming in R by Thomas Mailund Learn how to write object-oriented programs in R and how to construct classes and class hierarchies in the three object-oriented systems available in R. This book gives an introduction to object-oriented programming in the R programming language and shows you how to use and apply R in an object-oriented manner. You will then be able to use this powerful programming style in your own statistical programming projects to write flexible and extendable software. Paid: $20 Link: https://amzn.to/2wZnBbp 21.3 Advanced R by Hadley Wickham This is the companion website for Advanced R, a book in Chapman & Hall's R Series. The book is designed primarily for R users who want to improve their programming skills and understanding of the language. It should also be useful for programmers coming to R from other languages, as it explains some of R's quirks and shows how some parts that seem horrible do have a positive side. The book is free online. (Ignore the message redirecting you to the 2nd edition, this is the latest edition) Link: http://adv-r.had.co.nz/ 21.4 Advanced R Solutions by Malte Grosser, Henning Bumann, Hadley Wickham This book offers solutions to the exercises from Hadley Wickham's book Advanced R (Edition 2). It is work in progress and under active development. The 2nd edition of Advanced R has been published and we are currently working towards completion. Link: https://advanced-r-solutions.rbind.io/ 21.5 An Introduction to Data Analysis by Michael Franke This book provides basic reading material for an introduction to data analysis. It uses R to handle, plot and analyze data. After covering the use of R for data wrangling and plotting, the book introduces key concepts of data analysis from a Bayesian and a frequentist tradition. 
This text is intended for use as a first introduction to statistics for an audience with some affinity towards programming, but no prior exposure to R. Link: https://michael-franke.github.io/intro-data-analysis/index.html 21.6 An Introduction to R by Alex Douglas, Deon Roos, Ana Couto, Francesca Mancini, David Lusseau The aim of this book is to introduce you to using R, a powerful and flexible interactive environment for statistical computing and research. R in itself is not difficult to learn, but as with learning any new language (spoken or computer) the initial learning curve can be a little steep and somewhat daunting. We have tried to simplify the content of this book as much as possible and have based it on our own personal experience of teaching (and learning) R over the last 15 years. It is not intended to cover everything there is to know about R - that would be an impossible task. Neither is it intended to be an introductory statistics course, although you will be using some simple statistics to highlight some of R's capabilities. The main aim of this book is to help you climb the initial learning curve and provide you with the basic skills and experience (and confidence!) to enable you to further your experience in using R. Link: https://intro2r.com/ 21.7 Another Book on Data Science : Learn R and Python in Parallel by Nailong Zhang There has been considerable debate over choosing R vs. Python for Data Science. Based on my limited knowledge/experience, both R and Python are great languages and are worth learning; so why not learn them together? Besides the side-by-side comparison of the two popular languages used in Data Science, this book also focuses on the translation from mathematical models to codes. 
In the book, the audience could find the applications/implementations of some important algorithms from scratch, such as maximum likelihood estimation, inversion sampling, copula simulation, simulated annealing, bootstrapping, linear regression (lasso/ridge regression), logistic regression, gradient boosting trees, etc. Link: https://www.anotherbookondatascience.com/ 21.8 Best Coding Practices for R by Vikram Singh Rawat R is a huge language and I would like to share the little knowledge I have in the subject. I dont claim to be an expert but this book will guide you in the right path wherever possible. Most of the books about R programming language will tell you what are the possible ways to do one thing in R. This book will only tell you one way to do that thing correctly. Link: https://bookdown.org/content/d1e53ac9-28ce-472f-bc2c-f499f18264a3/ 21.9 Book of R: A First Course in Programming and Statistics by Tilman M. Davies The Book of R is a comprehensive, beginner-friendly guide to R, the worlds most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, youll find everything you need to begin using R effectively for statistical analysis. Youll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. Youll even learn how to create impressive data visualizations with Rs basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package. Paid: $40 Link: https://nostarch.com/bookofr 21.10 Cookbook for R by Winston Chang The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data. 
Not to be confused with R Cookbook Link: http://www.cookbook-r.com/ 21.11 Data Analytics with R: A Recipe book by Ryan Garnett The structure and design of this book is based on iterative learning, starting with the most basic concept and building by adding one new element at a time. The book has been structured into small, easily consumable chunks, similar to a recipe card. The concept for a recipe card is that they are self contained, providing all the ingredients, preparation, and instructions required to create a meal. While a cookbook may consist of many recipes, there is no expectation to read, understand, and master all the recipes in order to prepare a meal. Following this as the central theme, the book has been designed as a number of data analytics recipes focusing on the R language. Link: https://ryangarnett.github.io/r-recipe-book 21.12 Domain-Specific Languages in R by Thomas Mailund Gain an accelerated introduction to domain-specific languages in R, including coverage of regular expressions. This compact, in-depth book shows you how DSLs are programming languages specialized for a particular purpose, as opposed to general purpose programming languages. Along the way, youll learn to specify tasks you want to do in a precise way and achieve programming goals within a domain-specific context. Domain-Specific Languages in R includes examples of DSLs including large data sets or matrix multiplication; pattern matching DSLs for application in computer vision; and DSLs for continuous time Markov chains and their applications in data science. After reading and using this book, youll understand how to write DSLs in R and have skills you can extrapolate to other programming languages. Paid: $25 Link: https://amzn.to/2CDqhAU 21.13 Efficient R programming by Colin Gillespie, Robin Lovelace This book is for anyone who wants to make their R code faster to type, faster to run and more scalable. 
These considerations generally come after learning the very basics of R for data analysis. Link: https://csgillespie.github.io/efficientR/ 21.14 Field Guide to the R Ecosystem by Mark Sellors This field guide aims to introduce the reader to the main components of the R ecosystem that may be encountered in the field. Whatever the reason, whilst there is a wealth of in-depth information for people actually using the language, I could find precious little information that provided the sort of overview of the ecosystem that I know Id have appreciated when I first came to the language. And with that thought, a field guide is born Link: https://fg2re.sellorm.com/ 21.15 Functional Data Structures in R by Thomas Mailund Get an introduction to functional data structures using R and write more effective code and gain performance for your programs. This book teaches you workarounds because data in functional languages is not mutable: for example youll learn how to change variable-value bindings by modifying environments, which can be exploited to emulate pointers and implement traditional data structures. Youll also see how, by abandoning traditional data structures, you can manipulate structures by building new versions rather than modifying them. Youll discover how these so-called functional data structures are different from the traditional data structures you might know, but are worth understanding to do serious algorithmic programming in a functional language such as R. Paid: $20 Link: https://amzn.to/2oUG2cP 21.16 Functional Programming by Sara Altman, Bill Behrman, Hadley Wickham This book is a practical introduction to functional programming using the tidyverse. Link: https://dcl-prog.stanford.edu/ 21.17 Functional Programming in R by Thomas Mailund Master functions and discover how to write functional programs in R. 
In this concise book, youll make your functions pure by avoiding side-effects; youll write functions that manipulate other functions, and youll construct complex functions using simpler functions as building blocks. Paid: $20 Link: https://amzn.to/2wY4m11 21.18 Hands-On Programming with R by Garrett Grolemund This book will teach you how to program in R, with hands-on examples. I wrote it for non-programmers to provide a friendly introduction to the R language. Youll learn how to load data, assemble and disassemble data objects, navigate Rs environment system, write your own functions, and use all of Rs programming tools. Throughout the book, youll use your newfound skills to solve practical data science problems. Link: https://rstudio-education.github.io/hopr/ 21.19 Introduction to Programming with R by Reto Stauffer, Joanna Chimiak-Opoka, Thorsten Simon, Achim Zeileis A learning resource for programming novices who want to learn programming using the statistical programming language R. While one of the major strengths of R is the broad variety of packages for statistics and data science, this resource focuses on learning and understanding basic programming concepts using base R. Only a couple of additional packages are used and/or briefly discussed for special tasks. This online book is specifically written for participants of the course Introduction to Programming: Programming in R offered by the Digital Science Center at Universität Innsbruck. Link: https://eeecon.uibk.ac.at/~discdown/rprogramming/index.html 21.20 Introduction to R - R spatial by R Spatial This document provides a concise introduction to R. It emphasizes what you need to know to be able to use the language in any context. There is no fancy statistical analysis here. We just present the basics of the R language itself. We do not assume that you have done any computer programming before (but we do assume that you think it is about time you did). Experienced R users obviously need not read this. 
But the material may be useful if you want to refresh your memory, if you have not used R much, or if you feel confused. Link: https://rspatial.org/intr/index.html 21.21 Mastering Software Development in R by Roger D. Peng, Sean Kross, Brooke Anderson This book covers R software development for building data science tools. This book provides rigorous training in the R language and covers modern software development practices for building tools that are highly reusable, modular, and suitable for use in a team-based environment or a community of developers. Paid: Free or pay what you want $20 Link: https://leanpub.com/msdr 21.22 Metaprogramming in R by Thomas Mailund Learn how to manipulate functions and expressions to modify how the R language interprets itself. This book is an introduction to metaprogramming in the R language, so you will write programs to manipulate other programs. Metaprogramming in R shows you how to treat code as data that you can generate, analyze, or modify. Paid: $20 Link: https://amzn.to/2x1cYUR 21.23 Modern R with the tidyverse by Bruno Rodrigues This book can be useful to different audiences. If you have never used R in your life, and want to start, start with Chapter 1 of this book. Chapter 1 to 3 are the very basics, and should be easy to follow up to Chapter 9. Starting with Chapter 9, it gets more technical, and will be harder to follow. But I suggest you keep on going, and do not hesitate to contact me for help if you struggle! Chapter 9 is also where you can start if you are already familiar with R and the {tidyverse}, but not functional programming. If you are familiar with R but not the {tidyverse} (or have no clue what the {tidyverse} is), then you can start with Chapter 4. If you are familiar with R, the {tidyverse} and functional programming, you might still be interested in this book, especially Chapter 9 and 10, which deal with package development and further advanced topics respectively. 
Link: https://b-rodrigues.github.io/modern_R/ 21.24 R Cookbook - 2nd edition by JD Long, Paul Teetor I have written software professionally in perhaps a dozen programming languages, and the hardest language for me to learn has been R. The language is actually fairly simple, but it is unconventional. These notes are intended to make the language easier to learn for someone used to more commonly used languages such as C++, Java, Perl, etc. Not to be confused with Cookbook for R Link: https://rc2e.com/index.html 21.25 R Development Guide by R Contribution Working Group This guide is heavily influenced by the Python Developer Guide, and is a comprehensive resource for contributing to R Core for both new and experienced contributors. It is maintained by the R Contribution Working Group. We welcome your contributions to R Core! Link: https://forwards.github.io/rdevguide/ 21.26 R for Excel users by Julie Lowndes, Allison Horst This course is for Excel users who want to add or integrate R and RStudio into their existing data analysis toolkit. It is a friendly intro to becoming a modern R user, full of tidyverse, RMarkdown, GitHub, collaboration & reproducibility. Link: https://rstudio-conf-2020.github.io/r-for-excel/ 21.27 R for Graduate Students by Y. Wendy Huynh Hello! My name is Wendy Huynh and I am a current PhD student working in the behavioral neurosciences. I began my R journey at the end of my first year of graduate school, slowly and painfully piecing together code. Although programming was never really part of my program, I now see it as an integral part of my work. Many fellow graduate students expressed interest in learning R, but didnt know where to begin. Programming with R is still relatively niche among my cohort and there are very few formal classes teaching this subject. Although there are many amazing guides/textbooks for R out there, very few of them featured examples relevant for my specific needs and were user-friendly enough for a true beginner. 
In the Fall of my second year, I began teaching a new graduate student in my lab everything I knew about R. However, I quickly found that teaching R even just to one person was very time consuming. I decided to write up assignments as a short guide to R. After writing a short 11 page first assignment and receiving positive feedback, I began writing up a second assignment. Then a third. Soon enough, I had written enough pages that I couldnt deny that this short guide had turned into a book. Link: https://bookdown.org/yih_huynh/Guide-to-R-Book/ 21.28 R language for programmers by John D Cook I have written software professionally in perhaps a dozen programming languages, and the hardest language for me to learn has been R. The language is actually fairly simple, but it is unconventional. These notes are intended to make the language easier to learn for someone used to more commonly used languages such as C++, Java, Perl, etc. Link: https://www.johndcook.com/blog/r_language_for_programmers/ 21.29 Rcpp for everyone by Masaki E. Tsuda Rcpp is a package that enables you to implement R functions in C++. It is easy to use even without deep knowledge of C++, because it is implemented so as to write your C++ code in a style similar to R. And Rcpp does not sacrifice execution speed for the ease of use, anyone can get high performance outcome. This document focuses on providing necessary information to users who are not familiar with C++. Therefore, in some cases, I explain usage of Rcpp conceptually rather than describing accurately from the viewpoint of C++, so that I hope readers can easily understand it. Link: https://teuder.github.io/rcpp4everyone_en/ 21.30 stats545 Data wrangling, exploration, and analysis with R by Jenny Bryan Learn how to: Explore, groom, visualize, and analyze data, make all of that reproducible, reusable, and shareable, using R. This site is about everything that comes up during data analysis except for statistical modelling and inference. 
Link: https://stat545.com/ 21.31 The R Inferno by Patrick Burns If Rs behaviour has ever surprised you, then this book is a guide to many more surprises, written in the style of Dante. Its a concise report on a number of common errors and unexpected behaviours in R. This book will make more sense if you have been programming and are familiar with such behaviours (not all though), as there is little time spent on explaining the why part of the behaviour. As mentioned, its a concise book, 126 pages only. Link: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf 21.32 The R Language by R Core team A collection of manuals: 1. An Introduction to R 2. The R Language Definition 3. Writing R Extensions 4. R Installation and Administration 5. R Data Import/Export 6. R Internals Link: https://stat.ethz.ch/R-manual/R-patched/doc/html/ 21.33 The Tidyverse Cookbook edited by Garrett Grolemund This book collects code recipes for doing data science with Rs tidyverse. Each recipe solves a single common task, with a minimum of discussion. Link: https://rstudio-education.github.io/tidyverse-cookbook/ 21.34 The tidyverse style guide by Hadley Wickham Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread. This site describes the style used throughout the tidyverse. It was derived from Googles original R Style Guide - but Googles current guide is derived from the tidyverse style guide. Link: https://style.tidyverse.org/ 21.35 Tidy evaluation by Lionel Henry, Hadley Wickham This guide is now superseded by more recent efforts at documenting tidy evaluation in a user-friendly way. We now recommend reading: The new Programming with dplyr vignette. The Using ggplot2 in packages vignette. (Oscars note: Im keeping this in for my own reference) Link: https://tidyeval.tidyverse.org/ 21.36 Tidyverse design guide by Tidyverse team The goal of this book is to help you write better R code. 
It has four main components: Design problems which lead to suboptimal outcomes. Useful patterns that help solve common problems. Key principles that help you balance conflicting patterns. Selected case studies that help you see how all the pieces fit together with real code. It is used by the tidyverse team to promote consistency across packages in the core tidyverse. Link: https://design.tidyverse.org/ 21.37 Tidyverse Skills for Data Science by Carrie Wright, Shannon E. Ellis, Stephanie C. Hicks, Roger D. Peng Book and Course formats This course introduces a powerful set of data science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of tidy data and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed to tidy data, the data science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a data science project. Book format https://jhudatascience.org/tidyversecourse/ Ebook: https://leanpub.com/tidyverseskillsdatascience Course format https://www.coursera.org/specializations/tidyverse-data-science-r Link: https://jhudatascience.org/tidyversecourse/ 21.38 What They Forgot to Teach You About R by Jenny Bryan, Jim Hester The initial impetus for creating these materials is a two-day hands-on workshop. The target learner: Has a moderate amount of R and RStudio experience. Is largely self-taught. Suspects they have drifted into some idiosyncratic habits that may slow them down or make their work products more brittle. Is interested in (re)designing their R lifestyle, to be more effective and more self-sufficient. Link: https://rstats.wtf/ 21.39 YaRrr! The Pirates Guide to R by Nathaniel D. Phillips Learn R from the ground up. Let me make something very, very clear: I did not write this book. This whole story started in the Summer of 2015. 
I was taking a late night swim on the Bodensee in Konstanz and saw a rusty object sticking out of the water. Upon digging it out, I realized it was an ancient usb-stick with the word YaRrr inscribed on the side. Intrigued, I brought it home and plugged it into my laptop. Inside the stick, I found a single pdf file written entirely in pirate-speak. After watching several pirate movies, I learned enough pirate-speak to begin translating the text to English. Sure enough, the book turned out to be an introduction to R called The Pirates Guide to R. Link: https://bookdown.org/ndphillips/YaRrr/ "],["reports-r-markdown-and-knitr.html", "22 Reports: R Markdown and knitr 22.1 Getting used to R, RStudio, and R Markdown 22.2 Introduction to R Markdown 22.3 knitr 22.4 Pimp my RMD: a few tips for R Markdown 22.5 R Markdown Cookbook 22.6 R Markdown: The Definitive Guide 22.7 Report Writing for Data Science in R 22.8 Reproducible Research with R and RStudio 22.9 RMarkdown for Scientists", " 22 Reports: R Markdown and knitr 22.1 Getting used to R, RStudio, and R Markdown by Chester Ismay This resource is designed to provide new users to R, RStudio, and R Markdown with the introductory steps needed to begin their own reproducible research. A review of many of the common R errors encountered (and what they mean in laymans terms) will also be provided. Link: https://bookdown.org/chesterismay/rbasics/ 22.2 Introduction to R Markdown by Michael Clark The goal is for you to be able to get quickly started with your own document, and understand the possibilities available to you. You will get a feel for the basic mechanics at play, as well as have ideas on how to customize the result to your own tastes. Link: https://m-clark.github.io/Introduction-to-Rmarkdown/ 22.3 knitr by Yihui Xie Dynamic documents with R and knitr! 
The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package. Link: https://yihui.org/knitr/ 22.4 Pimp my RMD: a few tips for R Markdown by Yan Holtz R markdown creates interactive reports from R code. This post provides a few tips I use on a daily basis to improve the appearance of output documents. Link: https://holtzy.github.io/Pimp-my-rmd/ 22.5 R Markdown Cookbook by Yihui Xie, Christophe Dervieux, Emily Riederer This book showcases short, practical examples of lesser-known tips and tricks to help users get the most out of these tools. After reading this book, you will understand how R Markdown documents are transformed from plain text and how you may customize nearly every step of this processing. For example, you will learn how to dynamically create content from R code, reference code in other documents or chunks, control the formatting with custom templates, fine-tune how your code is processed, and incorporate multiple languages into your analysis. Link: https://bookdown.org/yihui/rmarkdown-cookbook/ 22.6 R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, Garrett Grolemund The first official book authored by the core R Markdown developers that provides a comprehensive and accurate reference to the R Markdown ecosystem. With R Markdown, you can easily create reproducible data analysis reports, presentations, dashboards, interactive applications, books, dissertations, websites, and journal articles, while enjoying the simplicity of Markdown and the great power of R and other languages. Link: https://bookdown.org/yihui/rmarkdown/ 22.7 Report Writing for Data Science in R by Roger D. Peng This book teaches the fundamental concepts and tools behind reporting modern data analyses in a reproducible manner. 
As data analyses become increasingly complex, the need for clear and reproducible report writing is greater than ever. Paid: Free or pay what you want $10 Link: https://leanpub.com/reportwriting 22.8 Reproducible Research with R and RStudio by Christopher Gandrud This book presents all the tools for gathering and analyzing data and presenting results for reproducible research with R and RStudio through practical examples. The book can be reproduced by using the R package bookdown. You can buy a copy at: https://www.routledge.com/Reproducible-Research-with-R-and-RStudio/Gandrud/p/book/9780367143985 Link: https://github.com/christophergandrud/Rep-Res-Book 22.9 RMarkdown for Scientists by Nicholas Tierney This is a book on rmarkdown, aimed at scientists. It was initially developed as a 3 hour workshop, but is now developed into a resource that will grow and change over time as a living book. Link: https://rmd4sci.njtierney.com/ "],["shiny.html", "23 Shiny 23.1 A gRadual intRoduction to Shiny 23.2 Engineering Production-Grade Shiny Apps 23.3 JavaScript 4 Shiny - Field Notes 23.4 JavaScript for R 23.5 Mastering Shiny 23.6 Mastering Shiny Solutions 23.7 Outstanding User Interfaces with Shiny 23.8 Shiny Production with AWS Book 23.9 Supplement to Shiny in Production", " 23 Shiny 23.1 A gRadual intRoduction to Shiny by Ted Laderas, Jessica Minnier By the end of this workshop, you should be able to: Browse examples in the shiny gallery and understand how they work. Understand the components of a Shiny app and how they communicate. Learn three basic design patterns to the shiny apps. Link: https://laderast.github.io/gradual_shiny/ 23.2 Engineering Production-Grade Shiny Apps by Colin Fay, Sébastien Rochette, Vincent Guyader, Cervan Girard This book will not get you started with Shiny, nor talk how to work with Shiny once it is sent to production. What well see is the process of building an application that will later be sent to production. 
Link: https://engineering-shiny.org/ 23.3 JavaScript 4 Shiny - Field Notes by Colin Fay JavaScript in practice for Shiny users. Link: https://connect.thinkr.fr/js4shinyfieldnotes/ 23.4 JavaScript for R by John Coene Learn how to build your own data visualisation packages, improve shiny with JavaScript, and use JavaScript for computations. Link: https://javascript-for-r.com 23.5 Mastering Shiny by Hadley Wickham This book complements Shinys online documentation and is intended to help app authors develop a deeper understanding of Shiny. After reading this book, youll be able to write apps that have more customized UI, more maintainable code, and better performance and scalability. Link: https://mastering-shiny.org/ 23.6 Mastering Shiny Solutions by Maya Gans, Marly Gotti This book offers solutions to the exercises from Hadley Wickhams book Mastering Shiny. It is a work in progress and under active development. Link: https://mastering-shiny-solutions.org 23.7 Outstanding User Interfaces with Shiny by David Granjon This book will help you to: Manipulate Shiny tags from R to create custom layouts. Harness the power of CSS and JavaScript to quickly design apps standing out from the pack. Discover the steps to import and convert existing web frameworks like Bootstrap 4, framework7 and more. Learn how Shiny internally deals with inputs. Learn more about less documented Shiny mechanisms (websockets, sessions, ) Link: https://divadnojnarg.github.io/outstanding-shiny-ui/ 23.8 Shiny Production with AWS Book by Matt Dancho A big problem exists: no one teaches Data Scientists how to deploy web applications. You spend all of this time building Shiny web applications. And then [silence]. This book alongside the Shiny Developer with AWS Course (DS4B 202A-R) solves this problem - teaching Data Scientists how to deploy, host, and maintain web applications. 
Link: https://business-science.github.io/shiny-production-with-aws-book/ 23.9 Supplement to Shiny in Production This document is full of supplemental resources and content from the Shiny in Production Workshop delivered at rstudio::conf 2019. Link: https://kellobri.github.io/shiny-prod-book/ "],["social-science.html", "24 Social Science 24.1 Analyzing US Census Data: Methods, Maps, and Models in R 24.2 Composite Indicator Development and Analysis in R with COINr 24.3 Computing for the Social Sciences 24.4 Crime by the Numbers: A Criminologists Guide to R 24.5 Crime by the Numbers: A Criminologists Guide to R 24.6 Introduction to R for Social Scientists:A Tidy Programming Approach 24.7 Public Policy Analytics: Code & Context for Data Science in Government 24.8 Social Data Science with R 24.9 The Plain Persons Guide to Plain Text Social Science 24.10 Using R for Data Analysis in Social Sciences: A Research Project-Oriented Approach", " 24 Social Science 24.1 Analyzing US Census Data: Methods, Maps, and Models in R by Kyle Walker Census data are widely used in the United States across numerous research and applied fields, including education, business, journalism, and many others. Until recently, the process of working with US Census data has required the use of a wide array of web interfaces and software platforms to prepare, map, and present data products. The goal of this book is to illustrate the utility of the R programming language for handling these tasks, allowing Census data users to manage their projects in a single computing environment. Link: https://walker-data.com/census-r/ 24.2 Composite Indicator Development and Analysis in R with COINr by William Becker Composite indicators are aggregations of indicators which aim to measure (usually socio-economic) complex and multidimensional concepts which are difficult to define, and cannot be measured directly. Examples include innovation, human development, environmental performance, and so on. 
This book gives a detailed guide on building composite indicators in R, focusing on the recent COINr package, which is an end-to-end development environment for composite indicators. Although COINr is the main tool used in the book, it also gives general explanation and guidance on composite indicator construction and analysis in R, ranging from normalisation, aggregation, multivariate analysis and global sensitivity analysis. Link: https://bluefoxr.github.io/COINrDoc/ 24.3 Computing for the Social Sciences by Dr. Benjamin Soltoff The goal of this course is to teach you basic computational skills and provide you with the means to learn what you need to know for your own research. I start from the perspective that you want to analyze data, and programming is a means to that end. You will not become an expert programmer - that is a given. But you will learn the basic skills and techniques necessary to conduct computational social science, and gain the confidence necessary to learn new techniques as you encounter them in your research. We will cover many different topics in this course, including: Elementary programming techniques (e.g. loops, conditional statements, functions) Writing reusable, interpretable code Problem-solving - debugging programs for errors Obtaining, importing, and munging data from a variety of sources Performing statistical analysis Visualizing information Creating interactive reports Generating reproducible research Link: https://cfss.uchicago.edu/notes/intro-to-course/ 24.4 Crime by the Numbers: A Criminologists Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. 
While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com 24.5 Crime by the Numbers: A Criminologists Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com/ 24.6 Introduction to R for Social Scientists:A Tidy Programming Approach by Ryan Kennedy, Philip Waggoner Introduction to R for Social Scientists: A Tidy Programming Approach introduces the Tidy approach to programming in R for social science research to help quantitative researchers develop a modern technical toolbox. The Tidy approach is built around consistent syntax, common grammar, and stacked code, which contribute to clear, efficient programming. The authors include hundreds of lines of code to demonstrate a suite of techniques for developing and debugging an efficient social science research workflow. Link: https://i2rss.weebly.com/# 24.7 Public Policy Analytics: Code & Context for Data Science in Government by Ken Steif, Ph.D The goal of this book is to make data science accessible to social scientists and City Planners, in particular. I hope to convince readers that one with strong domain expertise plus intermediate data skills can have a greater impact in government than the sharpest computer scientist who has never studied economics, sociology, public health, political science, criminology etc. 
Link: https://urbanspatial.github.io/PublicPolicyAnalytics/ 24.8 Social Data Science with R by Daniel Anderson, Brendan Cullen, Ouafaa Hmaddi Heres an intro about why R is great and the cool things you can do with it and new problems you can address. Link: https://www.sds.pub/index.html 24.9 The Plain Persons Guide to Plain Text Social Science by Kieran Healy As a beginning graduate student in the social sciences, what sort of software should you use to do your work? More importantly, what principles should guide your choices? I offer some general considerations and specific answers. Link: https://plain-text.co/index.html#introduction 24.10 Using R for Data Analysis in Social Sciences: A Research Project-Oriented Approach by Quan Li This book seeks to teach undergraduate and graduate students in social sciences how to use R to manage, visualize, and analyze data in order to answer substantive questions and replicate published findings. This book distinguishes itself from other introductory R or statistics books in three ways. First, targeting an audience rarely exposed to statistical programming, it adopts a minimalist approach and covers only the most important functions and skills in R that one will need for conducting reproducible research projects. Second, it emphasizes meeting the practical needs of students using R in research projects. Specifically, it teaches students how to import, inspect, and manage data; understand the logic of statistical inference; visualize data and findings via histograms, boxplots, scatterplots, and diagnostic plots; and analyze data using one-sample t-test, difference-of-means test, covariance, correlation, ordinary least squares (OLS) regression, and model assumption diagnostics. Third, it teaches students how to replicate the findings in published journal articles and diagnose model assumption violations. 
Paid: Incl listing of library availability $40 Link: https://www.worldcat.org/title/using-r-for-data-analysis-in-social-sciences-a-research-project-oriented-approach/oclc/1048009316 "],["sport-analytics.html", "25 Sport analytics 25.1 Basketball Data Science with Applications in R 25.2 Coding for sports analytics: get started resources 25.3 Exploring Baseball Data with R 25.4 Visualising WRC Rally Stages With rayshader and R: A RallyDataJunkie Adventure 25.5 Visualising WRC Rally Timing and Results Data: A RallyDataJunkie Adventure 25.6 Wrangling F1 Data With R: A Data Junkie's Guide", " 25 Sport analytics 25.1 Basketball Data Science with Applications in R by Paola Zuccolotto, Marica Manisera Using data from one season of NBA games, Basketball Data Science: With Applications in R is the perfect book for anyone interested in learning and applying data analytics in basketball. Whether assessing the spatial performance of an NBA player's shots or doing an analysis of the impact of high pressure game situations on the probability of scoring, this book discusses a variety of case studies and hands-on examples using a custom R package. The codes are supplied so readers can reproduce the analyses themselves or create their own. Assuming a basic statistical knowledge, Basketball Data Science with R is suitable for students, technicians, coaches, data analysts and applied researchers. Paid: $35 Link: https://www.routledge.com/Basketball-Data-Science-With-Applications-in-R/Zuccolotto-Manisera/p/book/9781138600799 25.2 Coding for sports analytics: get started resources Given the lack of sport-focussed R books, I've added this collection of blog posts. Link: https://brendankent.com/2020/09/15/coding-for-sports-analytics-resources-to-get-started/ 25.3 Exploring Baseball Data with R by Max Marchi, Jim Albert, Benjamin S. Baumer This book introduces R to sabermetricians, baseball enthusiasts, and students interested in exploring the richness of baseball data.
It equips you with the necessary skills and software tools to perform all the analysis steps, from importing the data to transforming them into an appropriate format to visualizing the data via graphs to performing a statistical analysis. Paid: $50 Link: https://baseballwithr.wordpress.com/about/ 25.4 Visualising WRC Rally Stages With rayshader and R: A RallyDataJunkie Adventure by Tony Hirst Taking a simple rally route dataset, what can we do with it? This book describes a wide range of techniques for working with geodata, including routes and elevation rasters. From 2D and 3D mapping, to a wide range of route analysis techniques, the techniques described are also relevant to a wide range of other route analysis contexts, including ecological trail analysis. Link: https://rallydatajunkie.com/visualising-rally-stages 25.5 Visualising WRC Rally Timing and Results Data: A RallyDataJunkie Adventure by Tony Hirst A handy guide to visualising a wide range of motorsport timing and results data, concentrating on rally data associated with the FIA World Rally Championship (WRC). Link: https://rallydatajunkie.com/visualising-wrc-rally-results/ 25.6 Wrangling F1 Data With R: A Data Junkie's Guide by Tony Hirst Taking a simple rally route dataset, what can we do with it? This book describes a wide range of techniques for working with geodata, including routes and elevation rasters. From 2D and 3D mapping, to a wide range of route analysis techniques, the techniques described are also relevant to a wide range of other route analysis contexts, including ecological trail analysis. Link: https://rallydatajunkie.com/visualising-rally-stages/ "],["statistics.html", "26 Statistics 26.1 A Business Analyst's Introduction to Business Analytics 26.2 An Introduction to Statistical and Data Sciences via R 26.3 An Introduction to Statistical Learning 26.4 Answering questions with data 26.5 Bayes rules!
26.6 Common statistical tests are linear models: a work through 26.7 Doing meta-analysis with R: A hands-on guide 26.8 End-to-End Solved Problems With R: a catalog of 26 examples using statistical inference 26.9 Foundations of Statistics with R 26.10 Foundations of Statistics with R 26.11 Handbook of Regression Modeling in People Analytics 26.12 Introduction to Modern Statistics 26.13 ISLR tidymodels Labs 26.14 Learning statistics with R: A tutorial for psychology students and other beginners 26.15 Mixed Models with R : Getting started with random effects 26.16 Model Estimation by Example: Demonstrations with R 26.17 Modern Statistics with R 26.18 One Way ANOVA with R: Completely Randomized Design - Between Groups 26.19 OpenIntro Statistics 26.20 Statistical inference for data science 26.21 Statistical Rethinking 26.22 Statistical Rethinking with brms, ggplot2, and the tidyverse: Second edition 26.23 Statistical Thinking in the 21st Century 26.24 Statistics (The Easier Way) With R, 3rd. Ed. (TIDYVERSION) 26.25 Statistics and Data with R: An Applied Approach Through Examples 26.26 Teacups, Giraffes and Statistics 26.27 The Effect: An Introduction to Research Design and Causality 26.28 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling", " 26 Statistics 26.1 A Business Analyst's Introduction to Business Analytics by Adam Fleischhacker This textbook goes farther than just teaching you to make computational models using software or mathematical models using statistics. It teaches you how to align computational and mathematical models with real-world scenarios; empowering you to communicate with and leverage the expertise of business stakeholders while using modern software stacks and statistical workflows. In this book, you do not learn business analytics to make models; you learn business analytics to add tangible value in the real world.
Link: https://www.causact.com/ 26.2 An Introduction to Statistical and Data Sciences via R by Chester Ismay, Albert Kim An incredibly beginner-friendly introduction to both data science and statistics concepts as well as R. Link: https://moderndive.com/ 26.3 An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani As the scale and scope of data collection continue to increase across virtually all fields, statistical learning has become a critical toolkit for anyone who wishes to understand data. An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. Each chapter includes an R lab. This book is appropriate for anyone who wishes to use contemporary tools for data analysis. Link: https://www.statlearning.com/ 26.4 Answering questions with data by Matthew J. Crump This is a free textbook teaching introductory statistics for undergraduates in Psychology. This textbook is part of a larger OER course package for teaching undergraduate statistics in Psychology, including this textbook, a lab manual, and a course website. (Oscar's note: Looks like a comprehensive stats resource!) Link: https://crumplab.github.io/statistics/ 26.5 Bayes rules! by Alicia A. Johnson, Miles Ott, Mine Dogucu The primary goal of Bayes Rules! is to make modern Bayesian thinking, modeling, and computing accessible to a broad audience. Bayes Rules! empowers readers to weave Bayesian approaches into an everyday modern practice of statistics and data science. The overall spirit is very applied: the book utilizes modern computing resources and a reproducible pipeline; the discussion emphasizes conceptual understanding; the material is motivated by data-driven inquiry; and the delivery blends traditional content with activity.
Link: https://www.bayesrulesbook.com/ 26.6 Common statistical tests are linear models: a work through by Steve Doogue This is a reworking of the book Common statistical tests are linear models (or: how to teach stats), written by Jonas Lindeløv. The book beautifully demonstrates how many common statistical tests (such as the t-test, ANOVA and chi-squared) are special cases of the linear model. The book also demonstrates that many non-parametric tests, which are needed when certain test assumptions do not hold, can be approximated by linear models using the rank of values. Link: https://steverxd.github.io/Stat_tests/ 26.7 Doing meta-analysis with R: A hands-on guide by Mathias Harrer, Pim Cuijpers, Toshi A. Furukawa, David D. Ebert This book serves as an accessible introduction into how meta-analyses can be conducted in R. Essential steps for meta-analysis are covered, including pooling of outcome measures, forest plots, heterogeneity diagnostics, subgroup analyses, meta-regression, methods to control for publication bias, risk of bias assessments and plotting tools. Advanced, but highly relevant topics such as network meta-analysis, multi-/three-level meta-analyses, Bayesian meta-analysis approaches, SEM meta-analysis are also covered. Link: https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/ 26.8 End-to-End Solved Problems With R: a catalog of 26 examples using statistical inference by Nicole Radziwill Lots of worked problems, analytically and in R! Useful supplement for an introductory applied stats class. https://amzn.to/2EREAn2 - used for $4-18, new $19-20 https://www.e-junkie.com/ecom/gb.php?c=single&cl=147256&i=1548704 - $10 for PDF only Paid: $15 Link: https://amzn.to/2EREAn2 26.9 Foundations of Statistics with R by Darrin Speegle, Bryan Clair This book represents a fundamental rethinking of a calculus based first course in probability and statistics. 
We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well. This book is an excellent choice for students studying data science, statistics, engineering, computer science, mathematics, science, business, or any field which requires the two semesters of calculus needed to read this book. Link: https://mathstat.slu.edu/~speegle/_book/preface.html 26.10 Foundations of Statistics with R by Darrin Speegle This book represents a fundamental rethinking of a calculus based first course in probability and statistics. We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well. Link: https://mathstat.slu.edu/~speegle/_book/preface.html 26.11 Handbook of Regression Modeling in People Analytics by Keith McNulty It is the author's firm belief that all people analytics professionals should have a strong understanding of regression models and how to implement and interpret them in practice, and the aim with this book is to provide those who need it with help in getting there.
For accompanying solutions to some of the questions: https://keithmcnulty.github.io/peopleanalytics-regression-book/solutions/ Link: http://peopleanalytics-regression-book.org/index.html 26.12 Introduction to Modern Statistics by Mine Çetinkaya-Rundel, Johanna Hardin We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods. Statistics is an applied field with a wide range of practical applications. You don't have to be a math guru to learn from interesting, real data. Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the world. Link: https://openintro-ims.netlify.app/ 26.13 ISLR tidymodels Labs by Emil Hvitfeldt This book aims to be a complement to the 1st version An Introduction to Statistical Learning book with translations of the labs into using the tidymodels set of packages. The labs will be mirrored quite closely to stay true to the original material. Link: https://emilhvitfeldt.github.io/ISLR-tidymodels-labs/index.html 26.14 Learning statistics with R: A tutorial for psychology students and other beginners by Danielle Navarro Learning Statistics with R covers the contents of an introductory statistics class, as typically taught to undergraduate psychology students, focusing on the use of the R statistical software. The book discusses how to get started in R as well as giving an introduction to data manipulation and writing scripts. From a statistical perspective, the book discusses descriptive statistics and graphing first, followed by chapters on probability theory, sampling and estimation, and null hypothesis testing. After introducing the theory, the book covers the analysis of contingency tables, t-tests, ANOVAs and regression. Bayesian statistics are covered at the end of the book.
Link: https://learningstatisticswithr-bookdown.netlify.app/ 26.15 Mixed Models with R : Getting started with random effects by Michael Clark Mixed models are an extremely useful modeling tool for situations in which there is some dependency among observations in the data, where the correlation typically arises from the observations being clustered in some way. Link: https://m-clark.github.io/mixed-models-with-R/ 26.16 Model Estimation by Example: Demonstrations with R by Michael Clark This document provides by-hand demonstrations of various models and algorithms. The goal is to take away some of the mystery of them by providing clean code examples that are easy to run and compare with other tools. The code was collected over several years, so is not exactly consistent in style, but now has been cleaned up to make it more so. Within each demo, you will generally find some imported/simulated data, a primary estimating function, a comparison of results with some R package, and a link to the old code that was the initial demonstration. Link: https://m-clark.github.io/models-by-example/ 26.17 Modern Statistics with R by Måns Thulin This book covers the fundamentals of data science and statistics. The first half deals with the basics of R and R coding, data wrangling, exploratory data analysis and more advanced programming. The second half deals with modern statistics (favouring permutation tests, the bootstrap and Bayesian methods over traditional asymptotic methods), regression models and predictive modelling. It also contains information about debugging and explanations of 25 commonly encountered error messages in R. In addition, there are 170 or so exercises with fully worked solutions. Link: http://www.modernstatisticswithr.com/ 26.18 One Way ANOVA with R: Completely Randomized Design - Between Groups by Bruce Dudek This document can be a standalone how-to document for R users.
However, it is primarily intended for students in the APSY510/511 statistics sequence at the University at Albany. It is a fairly thorough treatment of graphical and inferential evaluation of one-factor designs. It presumes prior background coverage of the ANOVA logic from standard textbooks such as Howell or Maxwell, Delaney and Kelley (2017). The analyses are intended to parallel and exhaust the methods already covered with SPSS, and to extend them to additional topics. Link: https://bcdudek.net/anova/oneway_anova_basics.pdf 26.19 OpenIntro Statistics by David Diez, Mine Cetinkaya-Rundel, Christopher Barr, OpenIntro A complete foundation for Statistics, also serving as a foundation for Data Science. Leanpub revenue supports OpenIntro (US-based nonprofit) so we can provide free desk copies to teachers interested in using OpenIntro Statistics in the classroom and expand the project to support free textbooks in other subjects. More resources: openintro.org. Paid: Pay what you want for the ebook, minimum $0.00, however if you are able to, please consider the cause above. Thanks! $15 Link: https://leanpub.com/openintro-statistics 26.20 Statistical inference for data science by Brian Caffo This book gives a brief, but rigorous, treatment of statistical inference intended for practicing Data Scientists. Paid: Free or pay what you want $15 Link: https://leanpub.com/LittleInferenceBook 26.21 Statistical Rethinking A Bayesian Course with Examples in R and Stan Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds your knowledge of and confidence in making inferences from data. Reflecting the need for scripting in today's model-based statistics, the book pushes you to perform step-by-step calculations that are usually automated. This unique computational approach ensures that you understand enough of the details to make reasonable choices and interpretations in your own modeling work.
Link: https://xcelab.net/rm/statistical-rethinking/ 26.22 Statistical Rethinking with brms, ggplot2, and the tidyverse: Second edition by A Solomon Kurz This ebook is based on the second edition of Richard McElreath's (2020) text, Statistical rethinking: A Bayesian course with examples in R and Stan. My contributions show how to fit the models he covered with Paul Bürkner's brms package, which makes it easy to fit Bayesian regression models in R using Hamiltonian Monte Carlo. I also prefer plotting and data wrangling with the packages from the tidyverse. So we'll be using those methods, too. Link: https://bookdown.org/content/4857/ 26.23 Statistical Thinking in the 21st Century by Russell Poldrack This textbook aims to cover modern methods that take advantage of today's increased computing power, while also balancing the accessibility of the material for students not wanting to wade through a lot of story to get to the statistical knowledge while reading Andy Field's graphic novel statistics books, An Adventure in Statistics. The main site below has companion sites in R and Python: R companion https://statsthinking21.github.io/statsthinking21-R-site/ Python companion https://statsthinking21.github.io/statsthinking21-python/ Link: https://statsthinking21.github.io/statsthinking21-core-site/ 26.24 Statistics (The Easier Way) With R, 3rd. Ed. (TIDYVERSION) by Nicole Radziwill This introductory applied statistics handbook shows you how to run tests analytically, and then how to run exactly the same steps using R. No steps are skipped, making this particularly well suited for beginners or people who need a quick lookup. Used at 30+ universities around the globe. https://amzn.to/3b9ha8s - varies between $37-43 https://www.e-junkie.com/ecom/gb.php?&c=single&cl=147256&i=1614407 - $25 for PDF only Paid: $37 Link: https://amzn.to/3b9ha8s 26.25 Statistics and Data with R: An Applied Approach Through Examples by Yosef Cohen, Jeremiah Y.
Cohen R, an Open Source software, has become the de facto statistical computing environment. It has an excellent collection of data manipulation and graphics capabilities. It is extensible and comes with a large number of packages that allow statistical analysis at all levels from simple to advanced and in numerous fields including Medicine, Genetics, Biology, Environmental Sciences, Geology, Social Sciences and much more. The software is maintained and developed by academicians and professionals and as such, is continuously evolving and up to date. Statistics and Data with R presents an accessible guide to data manipulations, statistical analysis and graphics using R. Paid: The E-Book costs $97.00 while the print version costs $121.75 $97 Link: https://www.wiley.com/en-us/Statistics+and+Data+with+R%3A+An+Applied+Approach+Through+Examples-p-9780470758052 26.26 Teacups, Giraffes and Statistics by Hasse Walum, Desirée De Leon A delightful series of beautifully illustrated modules to learn statistics and R coding for students, scientists, and stats-enthusiasts. Link: https://tinystats.github.io/teacups-giraffes-and-statistics/index.html 26.27 The Effect: An Introduction to Research Design and Causality by Nick Huntington-Klein The Effect is a book intended to introduce students (and non-students) to the concepts of research design and causality in the context of observational data. The book is written in an intuitive and approachable way and doesn't overload on technical detail. Why teach regression and research design at the same time when they are fundamentally different things? First learn why you want to structure a design in a certain way, and what it is you want to do to the data, and then afterwards learn the technical details of how to run the appropriate model. Link: https://theeffectbook.net/ 26.28 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling by Andrew B.
Lawson Progressively more and more attention has been paid to how location affects health outcomes. The area of disease mapping focusses on these problems, and the Bayesian paradigm has a major role to play in the understanding of the complex interplay of context and individual predisposition in such studies of disease. Using R for Bayesian Spatial and Spatio-Temporal Health Modeling provides a major resource for those interested in applying Bayesian methodology in small area health data studies. Paid: $100 Link: https://www.routledge.com/Using-R-for-Bayesian-Spatial-and-Spatio-Temporal-Health-Modeling/Lawson/p/book/9780367490126 "],["teaching.html", "27 Teaching 27.1 Data Science in a Box 27.2 rstudio4edu 27.3 Teaching Tech Together 27.4 What they forgot to teach you about teaching R", " 27 Teaching 27.1 Data Science in a Box by Mine Çetinkaya-Rundel This book focuses on how to efficiently teach data science to students with little to no background in computing and statistical thinking. The core content of the course focuses on data acquisition and wrangling, exploratory data analysis, data visualization, inference, modelling, and effective communication of results. Link: https://datasciencebox.org/ 27.2 rstudio4edu by Desirée De Leon, Alison Hill A book for educators in the data science space who wish to create educational materials that are engaging for students and inspiring to other educators. This book is a cookbook for generating materials for R Markdown lessons, R packages, R Markdown websites, Distill sites, Bookdown books, and Blogdown sites. Link: https://rstudio4edu.github.io/rstudio4edu-book/ 27.3 Teaching Tech Together by Greg Wilson (Oscar's note: Not an R book per se, but comes highly recommended about how to teach programming.) Grassroots groups have sprung up around the world to teach programming, web design, robotics, and other skills to free-range learners.
These groups exist so that people don't have to learn these things on their own, but ironically, their founders and teachers are often teaching themselves how to teach. There's a better way. Just as knowing a few basic facts about germs and nutrition can help you stay healthy, knowing a few things about cognitive psychology, instructional design, inclusivity, and community organization can help you be a more effective teacher. This book presents key ideas you can use right now, explains why we believe they are true, and points you at other resources that will help you go further. Link: http://teachtogether.tech/en/index.html 27.4 What they forgot to teach you about teaching R by Desiree de Leon This book is offered at rstudio::global(2021), as part of the Diversity Scholars program. In this workshop, you will learn about using the RStudio IDE to its full potential for teaching R. Whether you're an educator by profession, or you do education as part of collaborations or outreach, or you want to improve your workflow for giving talks, demos, and workshops, there is something for you in this workshop. During the workshop we will cover live coding best practices, tips for using RStudio Cloud for teaching and building learnr tutorials, and R Markdown based tools for developing instructor and student facing teaching materials. Link: https://wtf-teach.netlify.app/ "],["text-analysis.html", "28 Text analysis 28.1 Supervised Machine Learning for Text Analysis in R 28.2 Text Mining with R 28.3 Text Mining With Tidy Data Principles", " 28 Text analysis 28.1 Supervised Machine Learning for Text Analysis in R by Emil Hvitfeldt, Julia Silge Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us.
We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice. Link: https://smltar.com/ 28.2 Text Mining with R by Julia Silge, David Robinson This book serves as an introduction of text mining using the tidytext package and other tidy tools in R. The functions provided by the tidytext package are relatively simple; what is important are the possible applications. Thus, this book provides compelling examples of real text mining problems. Link: https://www.tidytextmining.com/ 28.3 Text Mining With Tidy Data Principles by Julia Silge Text data sets are diverse and ubiquitous, and tidy data principles provide an approach to make text mining easier, more effective, and consistent with tools already in wide use. In this tutorial, you will develop your text mining skills using the tidytext package in R, along with other tidyverse tools. Link: https://juliasilge.shinyapps.io/learntidytext/ "],["time-series-analysis-and-forecasting.html", "29 Time Series Analysis and Forecasting 29.1 Applied Time Series Analysis for Fisheries and Environmental Sciences 29.2 Fisheries Catch Forecasting 29.3 Forecasting: Principles and Practice 29.4 Hands-On Time Series Analysis with R 29.5 Practical Time Series Forecasting with R: A Hands-On Guide 29.6 Time Series - A Data Analysis Approach Using R 29.7 Time Series Analysis and Its Applications", " 29 Time Series Analysis and Forecasting 29.1 Applied Time Series Analysis for Fisheries and Environmental Sciences by E. E. Holmes, M. D. Scheuerell, E. J. Ward This is material that was developed as part of a course we teach at the University of Washington on applied time series analysis for fisheries and environmental data. Link: https://atsa-es.github.io/atsa-labs/ 29.2 Fisheries Catch Forecasting by Elizabeth Holmes The focus of this book is on analysis of univariate time series. 
However, multivariate regression with autocorrelated errors and multivariate autoregressive models (MAR) will be covered more briefly. For an in-depth discussion of multivariate autoregressive models and multivariate autoregressive state-space models, see Holmes, Ward and Scheuerell (2018). Link: https://fish-forecast.github.io/Fish-Forecast-Bookdown/index.html 29.3 Forecasting: Principles and Practice by Rob J Hyndman, George Athanasopoulos This textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly. The book is written for three audiences: (1) people finding themselves doing forecasting in business when they may not have had any formal training in the area; (2) undergraduate students studying business; (3) MBA students doing a forecasting elective. Second edition supporting the forecast package: https://otexts.com/fpp2/ Third edition supporting the fable package: https://otexts.com/fpp3/ Link: https://otexts.com/fpp3/ 29.4 Hands-On Time Series Analysis with R by Rami Krispin The book provides an introduction for time series analysis with R. It covers the general workflow of time series analysis - working and handling time series data, descriptive analysis, predictive analysis, modeling strategies, etc. This book is designed for data scientists who wish to learn time series analysis and forecasting or data analysts who use Excel-based forecasting methods and wish to use more robust methods. Paid: $30 Link: https://www.packtpub.com/product/hands-on-time-series-analysis-with-r/9781788629157 29.5 Practical Time Series Forecasting with R: A Hands-On Guide by Galit Shmueli, Kenneth C. Lichtendahl, Jr Practical Time Series Forecasting with R provides an applied approach to time-series forecasting. Forecasting is an essential component of predictive analytics.
Balancing theory and practice, the books introduce popular forecasting methods and approaches used in a variety of business applications, and are ideal for Business Analytics, MBA, Executive MBA, and Data Analytics programs in business schools. Paid: $30 Link: http://www.forecastingbook.com/ 29.6 Time Series - A Data Analysis Approach Using R by Robert H. Shumway, David S. Stoffer The goals of this text are to develop the skills and an appreciation for the richness and versatility of modern time series analysis as a tool for analyzing dependent data. A useful feature of the presentation is the inclusion of nontrivial data sets illustrating the richness of potential applications to problems in the biological, physical, and social sciences as well as medicine. The text presents a balanced and comprehensive treatment of both time and frequency domain methods with an emphasis on data analysis. Paid: $40 Link: https://www.routledge.com/Time-Series-A-Data-Analysis-Approach-Using-R/Shumway-Stoffer/p/book/9780367221096 29.7 Time Series Analysis and Its Applications by Robert H. Shumway, David S. Stoffer The book is designed as a textbook for graduate level students in the physical, biological, and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. In addition to coverage of classical methods of time series regression, ARIMA models, spectral analysis and state-space models, the text includes modern developments including categorical time series analysis, multivariate spectral methods, long memory series, nonlinear models, resampling techniques, GARCH models, ARMAX models, stochastic volatility, wavelets, and Markov chain Monte Carlo integration methods. 
Link: https://www.stat.pitt.edu/stoffer/tsa4/index.html "],["version-control.html", "30 Version control 30.1 Git and Github for Advanced Ecological Data Analysis 30.2 Github actions with R 30.3 Github learning lab 30.4 Happy Git and GitHub for the useR 30.5 The Beginner's Guide to Git and GitHub", " 30 Version control 30.1 Git and Github for Advanced Ecological Data Analysis by Alexa Fredston This material was prepared for a three-hour virtual session to teach Git and Github to a graduate-level course on Advanced Ecological Data Analysis taught at Rutgers University by Malin Pinsky and Rachael Winfree. (However, the only course-specific material is Section 4; the rest should be applicable to any reader.) Link: https://afredston.github.io/learn-git/learn-git.htm 30.2 Github actions with R by Chris Brown, Murray Cadzow, Paula A Martinez, Rhydwyn McGuire, David Neuzerling, David Wilkinson, Saras Windecker GitHub actions allow us to trigger automated steps after we launch GitHub interactions such as when we push, pull, submit a pull request, or write an issue. Link: https://ropenscilabs.github.io/actions_sandbox/ 30.3 Github learning lab Not R specific or even a book, but looks like a good resource to learn git. Link: https://lab.github.com/ 30.4 Happy Git and GitHub for the useR by Jenny Bryan, Jim Hester, the STAT 545 TAs Happy Git provides opinionated instructions on how to: Install Git and get it working smoothly with GitHub, in the shell and in the RStudio IDE. Develop a few key workflows that cover your most common tasks. Integrate Git and GitHub into your daily work with R and R Markdown. The target reader is someone who uses R for data analysis or who works on R packages, although some of the content may be useful to those working in adjacent areas.
Link: https://happygitwithr.com/ 30.5 The Beginners Guide to Git and GitHub by Thomas Mailund A quick beginner's guide to using Git and GitHub. You have heard about git and GitHub and want to know what the buzz is about. That is what I am here to tell you. Or, at least, I am here to give you a quick overview of what you can do with git and GitHub. I won't be able, in the space here, to give you an exhaustive list of features; in all honesty, I don't know enough myself to be able to claim expertise with these tools. I am only a frequent user, but I can get you started and give you some pointers for where to learn more. That is what this booklet is for. Paid: $5 Link: https://amzn.to/2Nt0rDY "],["workflow.html", "31 Workflow 31.1 Agile Data Science with R 31.2 Github actions with R 31.3 How I Use R 31.4 The Data Validation Cookbook 31.5 The targets R Package User Manual", " 31 Workflow 31.1 Agile Data Science with R by Edwin Thoen I joined a Scrum team (frontend, backend, ux designer, product owner, second data scientist) to create a machine learning model that we brought to production using the Agile principles. It was an inspiring experience from which I learned a great deal. My colleagues patiently explained the principles of Agile software development and together we applied them to the data science context. All these experiences culminated in the workflow that we now adhere to at work and I think it is worthwhile to share it. It is heavily based on the principles of Agile software production, hence the title. We have explored which of the concepts from Agile did and did not work for data science and we got hands-on experience in working from these principles in an R project that actually got to production. 
Link: https://edwinth.github.io/ADSwR/ 31.2 Github actions with R by Chris Brown, Murray Cadzow, Paula A Martinez, Rhydwyn McGuire, David Neuzerling, David Wilkinson, Saras Windecker GitHub actions allow us to trigger automated steps after we launch GitHub interactions such as when we push, pull, submit a pull request, or write an issue. Link: https://ropenscilabs.github.io/actions_sandbox/ 31.3 How I Use R by David Keyes There are many great learning resources at the beginner stage and some incredible tutorials to master complex tasks in R. But, drawing from a concept in urban planning, there are far fewer resources in the middle. Stretching the metaphor perhaps to its breaking point, new R users at the detached single-family home stage can't get to the advanced mid-rise level without going through the middle stage. The missing middle in the R neighborhood is the lack of resources that answer the types of nuts and bolts questions that new R users often have. Things like: How should I organize my file structure when creating a new project? Should I do data cleaning in an RMarkdown file or an R script file? How do I find packages? How do I know if the packages I find are high quality? This book is my attempt to provide answers to these types of questions. Link: https://howiuser.com/ 31.4 The Data Validation Cookbook by Mark P.J. van der Loo The purposes of this book include demonstrating the main tools and workflows of the validate package, giving examples of common data validation tasks, and showing how to analyze data validation results. Link: https://data-cleaning.github.io/validate/ 31.5 The targets R Package User Manual by Will Landau The targets package is a Make-like pipeline toolkit for Statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. 
targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data. Link: https://books.ropensci.org/targets/ "],["other-compendiums.html", "32 Other compendiums 32.1 Awesome network analysis 32.2 Bookdown archive 32.3 CRAN doc collections 32.4 Data Science with R: A Resource Compendium 32.5 R on the web 32.6 R project book compendium 32.7 Use R! Springer series", " 32 Other compendiums 32.1 Awesome network analysis Not a book, but a compendium of resources that look really valuable. Link: https://github.com/briatte/awesome-network-analysis 32.2 Bookdown archive An archive of all books published via bookdown.org. It's a very, very big repo. Link: https://bookdown.org/home/archive/ 32.3 CRAN doc collections Note these projects are frozen, but they do contain a lot of resources in multiple languages. Many of these are quite old publications, but it doesn't mean they're outdated or not useful. If you're really digging for a specific resource that you can't find anywhere else, it may be here. Good luck! https://cran.r-project.org/other-docs.html Link: https://www.r-project.org/doc/bib/R-books.html 32.4 Data Science with R: A Resource Compendium by Martin Monkman This book grew out of my ever-growing collection of reference materials that was saved as an expanding array of markdown files in a github repo. By assembling it as a book, I hope that it will be more accessible and useful to other R users. Link: https://bookdown.org/martin_monkman/DataScienceResources_book/ 32.5 R on the web by Guillaume Coquere Useful links for people interested in R. Link: https://github.com/shokru/rstats/blob/master/material/R_links.md 32.6 R project book compendium A searchable archive of 180+ books. Link: https://www.r-project.org/doc/bib/R-jabref.html 32.7 Use R! 
Springer series This is a collection of some 70+ books. This series of inexpensive and focused books on R will publish shorter books aimed at practitioners. Books can discuss the use of R in a particular subject area (e.g., epidemiology, econometrics, psychometrics) or as it relates to statistical topics (e.g., missing data, longitudinal data). Paid: All are paid products Link: https://www.springer.com/series/6991?detailsPage=titles "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] +[["index.html", "Big Book of R 1 Welcome :) 1.1 Your last-ever bookmark 1.2 Searching 1.3 Contributing 1.4 Contributors 1.5 Licence 1.6 Live stats 1.7 About me", " Big Book of R Oscar Baruffa 02 January, 2022 1 Welcome :) 1.1 Your last-ever bookmark Thanks for stopping by. If youre like me, you cant help but bookmark every R-related programming book you find in the hopes that one day you, or someone you know, might find it useful. Hopefully this is the only bookmark youll need in future ;). When I initially released this collection in late August 2020, it contained about 100 books that Id been collecting over the previous two years. Since then Ive found a few more and there have been contributions from many people. The collection now stands at about 250 books. Most of these are free. Some are paid but usually quite affordable. 1.2 Searching If theres something specific youre looking for, use the menu or search using the magnifying glass icon at the top of the screen. 1.3 Contributing Please feel free to contribute paid and free books - see GitHub. 1.4 Contributors If youve contributed, add your name and Twitter / blog link below! 
Oscar Baruffa, Mohit Sharma, Vebash Naidoo, Julia Silge, Erik Gahner Larsen, Nicole Radziwill, Nistara Randhawa, Antoine Fabri, Jon Calder, Mike Smith, Ben Bolker, Maëlle Salmon, Laura Ellis, Bryan Shalloway, Antonio Uzal, Louis Aslett, Lluís Revilla Sancho, Brendan Cullen, Rami Krispin, Michael Dorman, Ezekiel Adebayo Ogundepo, Shamsuddeen Hassan Muhammad, Eric Leung, Isabella Velásquez, Matt Roumaya, Legana Fingerhut, Robert D. Brown III 1.5 Licence This website/book is free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. 1.6 Live stats Who says you can't have privacy AND transparency?? I'm guessing that if you're interested in R then you also like data ;). I initially used Google Analytics for this site but as I'm keen to enhance user privacy I switched to Plausible Analytics from 30 December 2020 onward. You can view the old Google Analytics summary report PDF here. TLDR 22k unique visitors and 33k sessions between August 2020 and December 2020 :D!! Note that unique visits will be higher in Plausible than you'd find with Google Analytics. Because Plausible is GDPR compliant and privacy focused, each user is identified for only 1 day. If someone visits the site 2 days in a row, that's counted as 2 uniques whereas in Google Analytics it would only be counted as 1 unique visitor because of the presence of persistent cookies and such that allow for tracking of users. From now on, you can view the LIVE site stats right here. 1.7 About me I'm Oscar. Fairly new to R and loving it. If you like this book, feel free to say Hi! on Twitter. If you want to stay in the loop on other data-related products I create, or major updates to this book, sign up to my newsletter. "],["new-to-r-start-here.html", "2 New to R? Start here", " 2 New to R? Start here If you're new to R and want to learn how to use it, this library might be a little daunting. There's so much choice! 
If you aren't sure where to start, then try one of these two options: 2.0.1 Book: R for Data Science This book is an excellent introduction to R programming and gets you started with visualizing data so you see some exciting stuff, and the power of R, right away. The book is free to read at https://r4ds.had.co.nz/ There's an accompanying exercise solution book at https://jrnold.github.io/r4ds-exercise-solutions/ For a different take on the solutions, check out Yet another R for Data Science study guide which can also be found at https://brshallo.github.io/r4ds_solutions/ If you'd like more of a roadmap which incorporates this book, have a look at my blogpost: https://oscarbaruffa.com/a-roadmap-for-getting-started-with-r/ 2.0.2 Video Course: Getting started with R If you prefer video instruction with progress tracking, check out this course from R for the Rest of Us called Getting Started with R. https://rfortherestofus.com/courses/getting-started/ "],["book-clubs.html", "3 Book Clubs 3.1 NHS-R community 3.2 R4DS Slack Community 3.3 R-ladies Netherlands - Advanced R by Hadley Wickham", " 3 Book Clubs Just like the book clubs you know and love, except that people actually talk about the book they're busy reading! R book clubs are usually a group of people who follow along together in working through the same book, with some sort of periodic check-in (often weekly, often via video) discussing the text, exercises and solutions. Below is a list of book clubs. These usually have a specific start and end date, so it may happen that a book club has already ended even though it's listed here. If you are running a book club, feel free to add it. 3.1 NHS-R community If you're one of the estimated 10 000 data analysts working in the NHS or someone who works closely with the NHS or health data, here's a blog post introducing the NHS-R Community book club. The book club is coordinated through the NHS-R Slack Group and the specific channel is #book-club. 
Certain email addresses can just join the Slack group (like @nhs.net) but if you have an email address that needs approval please contact NHS-R Community through their contact details on the website. The book club has covered statistics books like The Art of Statistics by David Spiegelhalter and The Book of Why by Judea Pearl and presentations given at the meetings can be found on the GitHub repository. The Community will be coordinating another book club for the R4DS book and the channel for that is #r4ds-book-club. 3.2 R4DS Slack Community The R4DS Slack Community has a number of running book clubs. Once you've joined the Slack group, you can search for channels. They also have a channel specifically for book club facilitators! They've recorded the sessions of cohorts so you can pick your way through one, or catch up on the current one! 3.3 R-ladies Netherlands - Advanced R by Hadley Wickham A collaboration of multiple Netherlands-based R-ladies groups ran a club on Hadley Wickham's Advanced R book. The github repo contains all the slides from the sessions. "],["career-community.html", "4 Career & Community 4.1 Ace The Data Science Interview 4.2 Build Your Career in Data Science 4.3 Conversations On Data Science 4.4 Essays on Data Analysis 4.5 Executive Data Science 4.6 Getting Started in Data Science 4.7 Hiring Data Scientists and Machine Learning Engineers 4.8 Introduction to Machine Learning Interviews Book 4.9 Project Management Fundamentals for Data Analysts 4.10 Telling Stories With Data 4.11 The Programmers Brain : What every programmer needs to know about cognition 4.12 Twitter for R Programmers 4.13 Twitter for Scientists", " 4 Career & Community These books aren't all strictly R focussed, but they do have a lot of relevance for many R programmers. 
4.1 Ace The Data Science Interview by Kevin Huo, Nick Singh Authored by two Ex-Facebook employees, Ace the Data Science Interview is the best way to prepare for Data Science, Data Analyst, and Machine Learning interviews, so that you can land your dream job at FAANG, tech startups, or Wall Street. Paid: $30 Link: https://www.acethedatascienceinterview.com/ 4.2 Build Your Career in Data Science by Emily Robinson, Jacqueline Nolis You are going to need more than technical knowledge to succeed as a data scientist. Build a Career in Data Science teaches you what school leaves out, from how to land your first job to the lifecycle of a data science project, and even how to become a manager. Paid: Lots of free preview available $20 Link: https://www.manning.com/books/build-a-career-in-data-science 4.3 Conversations On Data Science by Roger Peng, Hilary Parker This book collects many of their discussions from the podcast Not So Standard Deviations and distills them into a readable format. Paid: Pay what you want for the ebook, minimum $0 Link: https://leanpub.com/conversationsondatascience 4.4 Essays on Data Analysis by Roger Peng This book draws a complete picture of the data analysis process, filling out many details that are missing from previous presentations. It presents a new perspective on what makes for a successful data analysis and how the quality of data analyses can be judged. Paid: Pay what you want for the ebook, minimum $0 Link: https://leanpub.com/dataanalysisessays 4.5 Executive Data Science by Brian Caffo, Roger D. Peng, Jeffrey Leek A Guide to Training and Managing the Best Data Scientists. Learn what you need to know to begin assembling and leading a data science enterprise. Paid: Pay what you want for the PDF, minimum $0 Link: https://leanpub.com/eds 4.6 Getting Started in Data Science by Ayodele Odubela This book is for anyone who is interested in Data Science but is unsure where to start. 
Cut through the noise and learn my best tips for understanding Machine Learning with insight from my 4 years of industry experience. Learn the math as it applies to real-life data projects and get an understanding of fairness, ethics, and accountability in AI. Paid: $20 Link: https://gumroad.com/l/getting-started-in-data-science 4.7 Hiring Data Scientists and Machine Learning Engineers by Roy Keyes It's quite possible that the only thing more confusing than defining data science is actually hiring data scientists. Hiring Data Scientists and Machine Learning Engineers is a concise, practical guide to cut through the confusion. Whether you're the founder of a brand new startup, the senior vice president in charge of digital transformation at a global industrial company, the leader of a new analytics effort at a non-profit, or a junior manager of a machine learning team at a tech giant, this book will help walk you through the important questions you need to answer to determine what role and which skills you should hire for, how to source applicants, how to assess those applicants' skills, and how to set your new hires up for success. Special emphasis is placed on in-office vs remote hiring situations. Paid: varies $25 Link: https://dshiring.com 4.8 Introduction to Machine Learning Interviews Book by Chip Huyen This book is the result of the collective wisdom of many people who have sat on both sides of the table and who have spent a lot of time thinking about the hiring process. It was written with candidates in mind, but hiring managers who saw the early drafts told me that they found it helpful to learn how other companies are hiring, and to rethink their own process. The book consists of two parts. The first part provides an overview of the machine learning interview process, what types of machine learning roles are available, what skills each role requires, what kinds of questions are often asked, and how to prepare for them. 
This part also explains the interviewer's mindset and what kind of signals they look for. The second part consists of over 200 knowledge questions, each noted with its level of difficulty (interviews for more senior roles should expect harder questions), that cover important concepts and common misconceptions in machine learning. Link: https://huyenchip.com/ml-interviews-book/ 4.9 Project Management Fundamentals for Data Analysts by Oscar Baruffa In Project Management Fundamentals for Data Analysts, I've boiled the concepts down to the bare essentials which can be read in under 15 minutes; you can certainly fit that into your crazy schedule (and it will help your future schedule not be so chaotic!). These concepts can be used to great effect on their own if you wish to never read another word on the topic. It'll also provide a solid foundation if you want to dive deeper into more formal courses or sophisticated theory. Paid: $12 Link: https://oscarbaruffa.com/pm/ 4.10 Telling Stories With Data by Rohan Alexander The aim of this book is to help you learn how to tell stories with data. It establishes a foundation on which you can build and share knowledge, based on data, about an aspect of the world of interest to you. In this book we explore, prod, push, manipulate, knead, and ultimately, try to understand the implications of, data. The motto of the university from which I took my PhD is Naturam primum cognoscere rerum or roughly first to learn the nature of things, and we will indeed attempt to do that. But the original quote continues temporis aeterni quoniam, or roughly for eternal time, and it is tools, approaches, and workflows that enable you to establish lasting knowledge that I focus on in this book. Link: https://www.tellingstorieswithdata.com/ 4.11 The Programmers Brain : What every programmer needs to know about cognition by Felienne Hermans Explores the way your brain works when it's thinking about code. 
In it, you'll master practical ways to apply these cognitive principles to your daily programming life. You'll improve your code comprehension by turning confusion into a learning tool, and pick up awesome techniques for reading code and quickly memorizing syntax. This practical guide includes tips for creating your own flashcards and study resources that can be applied to any new language you want to master. By the time you're done, you'll not only be better at teaching yourself; you'll be an expert at bringing new colleagues and junior programmers up to speed. Paid: Free preview $30 Link: https://www.manning.com/books/the-programmers-brain 4.12 Twitter for R Programmers by Oscar Baruffa, Veerle van Son The R community is very active on Twitter. You can learn a lot about the language, about new approaches to problems, make friends and even land a job or next contract. It's a real-time pulse of the R community. What can you gain from becoming active on Twitter? This book will talk about the benefits and it will show you how to use Twitter. Link: https://www.t4rstats.com 4.13 Twitter for Scientists by Daniel S. Quintana Paid: I believe that Twitter can provide extraordinary opportunities for scientists, regardless of their seniority, mentors, or institution. By actively contributing to Twitter, I've kept up-to-date with emerging methods, several doors have opened for research collaborations, and I've been introduced to a supportive community of like-minded scientists. Most important, I've received valuable feedback on my work and been able to share my research to people that would have not otherwise seen it. In fact, if it wasn't for Twitter I don't think I'd still be in academia. 
Link: https://t4scientists.com/ "],["archeology.html", "5 Archeology 5.1 How To Do Archaeological Science Using R 5.2 Quantitative Methods in Archaeology Using R", " 5 Archeology 5.1 How To Do Archaeological Science Using R by Ben Marwick (editor) Archaeological science is becoming increasingly complex, and progress in this area is slowed by critical limitation of journal articles lacking the space to communicate new methods in enough detail to allow others to reproduce and reuse new research. One solution to this is to use a programming language such as R to analyse archaeological data, with authors sharing their R code with their publications to communicate our methods. This practice is becoming widespread in many other disciplines, but few archaeologists currently know how to use R or have an opportunity to learn during their training. In this forum we tackle this problem by discussing ubiquitous research methods of immediate relevance to most archaeologists, by using interactive, live-coded demonstrations of R code by archaeologists who program with R. Topics include getting data into R, working with C14 dates, spatial analysis and map-making, conducting simulations, and exploratory data visualizations. Link: https://benmarwick.github.io/How-To-Do-Archaeological-Science-Using-R/ 5.2 Quantitative Methods in Archaeology Using R The first hands-on guide to using the R statistical computing system written specifically for archaeologists. It shows how to use the system to analyze many types of archaeological data. Part I includes tutorials on R, with applications to real archaeological data showing how to compute descriptive statistics, create tables, and produce a wide variety of charts and graphs. 
Part II addresses the major multivariate approaches used by archaeologists, including multiple regression (and the generalized linear model); multiple analysis of variance and discriminant analysis; principal components analysis; correspondence analysis; distances and scaling; and cluster analysis. Part III covers specialized topics in archaeology, including intra-site spatial analysis, seriation, and assemblage diversity. Paid: Loan or buy $100 Link: https://www.cambridge.org/core/books/quantitative-methods-in-archaeology-using-r/DEAE593FA2418EA3B8ECD538C34ED2D5?fbclid=IwAR0guclfEtttfDkVKNUJWfhQ1wgUlXSKAIA3f_6D3hS_9EkUKivSY9AyFD8 "],["art.html", "6 Art 6.1 Thinking Outside The Grid - A bare bones intro to Rtistry concepts in R using ggplot.", " 6 Art There are no books available covering art, but there are some blog posts available. This first one is a good intro. 6.1 Thinking Outside The Grid - A bare bones intro to Rtistry concepts in R using ggplot. by Megan Harris Recently I've discovered the courage to dive into creative coding and generative aRt in R. Something that the R community calls Rtistry. My Rtistry journey so far has been an amazing and tranquil expedition into a world that seemed intimidating and scary on the outside but is honestly just a bottomless pit of fun and creativity on the inside. I'm going to talk about some very basic concepts and perspectives you can think about while starting your own Rtistry journey in ggplot. This includes basics on geoms, aesthetics, layering, etc. But then I'm also going to walk you through two of my Rtistry examples and code to get you started. This article is intended for those who have some experience with ggplot building in R but may not have realized how to transition from making regular visuals to Rtistry. This article goes over basic concepts that more seasoned users may already know. 
Link: https://www.thetidytrekker.com/post/thinking-outside-the-grid "],["big-data.html", "7 Big Data 7.1 Exploring, Visualizing, and Modeling Big Data with R 7.2 Mastering Spark with R", " 7 Big Data 7.1 Exploring, Visualizing, and Modeling Big Data with R by Okan Bulut, Christopher Desjardins Working with BIG DATA requires a particular suite of data analytics tools and advanced techniques, such as machine learning (ML). Many of these tools are readily and freely available in R. This full-day session will provide participants with a hands-on training on how to use data analytics tools and machine learning methods available in R to explore, visualize, and model big data. Link: https://okanbulut.github.io/bigdata/ 7.2 Mastering Spark with R by Javier Luraschi, Kevin Kuo, Edgar Ruiz In this book you will learn how to use Apache Spark with R. The book intends to take someone unfamiliar with Spark or R and help you become proficient by teaching you a set of tools, skills and practices applicable to large-scale data science. PS the first chapter has a Jon Snow quote ;) Link: https://therinspark.com/ "],["blogdown.html", "8 Blogdown 8.1 blogdown: Creating Websites with R Markdown 8.2 Create, Publish, and Analyze Personal Websites Using R and RStudio 8.3 https://r4sites-book.netlify.app/", " 8 Blogdown 8.1 blogdown: Creating Websites with R Markdown by Yihui Xie, Amber Thomas, Alison Presmanes Hill We introduce an R package, blogdown, in this short book, to teach you how to create websites using R Markdown and Hugo. Link: https://bookdown.org/yihui/blogdown/ 8.2 Create, Publish, and Analyze Personal Websites Using R and RStudio by Danny Morris A free, digital handbook with step-by-step instructions for launching your own personal website using R, RStudio, and other freely available technologies including GitHub, Hugo, Netlify, and Google Analytics. 
Link: https://r4sites-book.netlify.app/ 8.3 https://r4sites-book.netlify.app/ by Yihui Xie This short book introduces an R package, bookdown, to change your workflow of writing books. It should be technically easy to write a book, visually pleasant to view the book, fun to interact with the book, convenient to navigate through the book, straightforward for readers to contribute or leave feedback to the book author(s), and more importantly, authors should not always be distracted by typesetting details. Link: https://bookdown.org/yihui/bookdown/ "],["bookdown.html", "9 Bookdown 9.1 A Minimal Book Example", " 9 Bookdown 9.1 A Minimal Book Example This is a sample book written in Markdown. Link: https://benmarwick.github.io/bookdown-ort/ "],["data-science.html", "10 Data Science 10.1 A Business Analysts Introduction to Business Analytics 10.2 An Introduction to Data Analysis 10.3 APS 135: Introduction to Exploratory Data Analysis with R 10.4 Beginning Data Science in R 10.5 Business Case Analysis with R - Simulation Tutorials to Support Complex Business Decisions 10.6 Business Intelligence with R 10.7 Data Science at the Command Line, 2e 10.8 Data Science: A First Introduction 10.9 DevOps for Data Science 10.10 edav.info/ 10.11 Everyday Data Science 10.12 Exploratory Data Analysis with R 10.13 Introduction to Data Science 10.14 Model-Based Clustering and Classification for Data Science 10.15 Modern Data Science with R 10.16 Modern Statistics with R 10.17 Practical Data Science with R, Second Edition 10.18 R Data Science Quick Reference 10.19 R for Data Science 10.20 R for Data Science Solutions 10.21 R Programming for Data Science 10.22 The Art of Data Science 10.23 The Elements of Data Analytic Style 10.24 Yet another R for Data Science study guide", " 10 Data Science 10.1 A Business Analysts Introduction to Business Analytics by Adam Fleischhacker This textbook goes farther than just teaching you to make computational models using software or mathematical models 
using statistics. It teaches you how to align computational and mathematical models with real-world scenarios, empowering you to communicate with and leverage the expertise of business stakeholders while using modern software stacks and statistical workflows. In this book, you do not learn business analytics to make models; you learn business analytics to add tangible value in the real world. Link: https://www.causact.com/ 10.2 An Introduction to Data Analysis by Michael Franke This book provides basic reading material for an introduction to data analysis. It uses R to handle, plot and analyze data. After covering the use of R for data wrangling and plotting, the book introduces key concepts of data analysis from a Bayesian and a frequentist tradition. This text is intended for use as a first introduction to statistics for an audience with some affinity towards programming, but no prior exposure to R. Link: https://michael-franke.github.io/intro-data-analysis/index.html 10.3 APS 135: Introduction to Exploratory Data Analysis with R by Dylan Z. Childs This is the online course book for the Introduction to Exploratory Data Analysis with R component of APS 135, a module taught by the Department of Animal and Plant Sciences at the University of Sheffield. You will be introduced to the R ecosystem. You will learn how to use R to carry out data manipulation and visualisation. This book provides a foundation for learning statistics later on. Link: https://dzchilds.github.io/eda-for-bio/ 10.4 Beginning Data Science in R by Thomas Mailund Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. 
Those with some data science or analytics background, but not necessarily experience with the R programming language Paid: $40 Link: https://amzn.to/2Ns1HHi 10.5 Business Case Analysis with R - Simulation Tutorials to Support Complex Business Decisions by Robert D. Brown III Business case analysis, often conducted in spreadsheets, exposes decision makers to additional risks that arise just from the use of the spreadsheet environment. This book discusses how to use the statistical programming language R to develop a business case simulation and analysis. It presents a methodology that minimizes decision delay by focusing stakeholders on what matters most and suggests pathways for minimizing the risk in strategic and capital allocation decisions. Paid: Apress/Springer-Nature eBook $24.99, Softcover $34.99 $25 Link: https://www.apress.com/us/book/9781484234945# 10.6 Business Intelligence with R by Dwight Barry A desktop reference for busy professionals, giving you fingertip access to a variety of BI analytic methods done in R as simply as possible. All proceeds will support mitochondrial disorder research at Seattle Children's Hospital. Paid: Free or up to $20 for a good cause! $20 Link: https://leanpub.com/businessintelligencewithr 10.7 Data Science at the Command Line, 2e by Jeroen Janssens This book is about doing data science at the command line. Our aim is to make you a more efficient and productive data scientist by teaching you how to leverage the power of the command line. Link: https://www.datascienceatthecommandline.com/2e/ 10.8 Data Science: A First Introduction by Tiffany-Anne Timbers, Trevor Campbell, Melissa Lee This is an open source textbook aimed at introducing undergraduate students to data science. It was originally written for the University of British Columbia's DSCI 100 - Introduction to Data Science course. 
In this book, we define data science as the study and development of reproducible, auditable processes to obtain value (i.e., insight) from data. Link: https://ubc-dsci.github.io/introduction-to-datascience/ 10.9 DevOps for Data Science by Alex Gold At some point, most data scientists reach the point where they want to show their work to others. But the skills and tools to deploy data science are completely different from the skills and tools needed to do data science. If you're a data scientist who wants to get your work in front of the right people, this book aims to equip you with all the technical things you need to know that aren't data science. Hopefully, once you've read this book, you'll understand how to deploy your data science, whether you're building a DIY deployment system or trying to work with your organization's IT/DevOps/SysAdmin/SRE group to make that happen. Link: https://akgold.github.io/do4ds/index.html 10.10 edav.info/ by Zach Bogart, Joyce Robbins With this resource, we try to give you a curated collection of tools and references that will make it easier to learn how to work with data in R. In addition, we include sections on basic chart types/tools so you can learn by doing. There are also several walkthroughs where we work with data and discuss problems as well as some tips/tricks that will help you. Link: https://edav.info/ 10.11 Everyday Data Science by Andrew Carr Everyday data science is a collection of tools and techniques you can use to master data science in your day-to-day life. There are case studies, tutorials, code snippets, pictures, math, and jokes. All designed as a fun introduction to the world of data science. Some example chapters include, A/B testing to make perfect lemonade, word vectors to improve your resume, differential equations for weight loss, and how a man used statistics to qualify for the Olympics. Life is full of decisions. We, as people, have the remarkable ability to make decisions in the face of uncertainty. 
We, as humans, have only recently developed the ability to use computers to process vast amounts of data to improve our decision making. This innovation has led to the development of the field of Data Science. This book is written to give tools and inspiration to aspiring decision makers. You make decisions daily and the methodology of data science can help. Paid: $8 Link: https://gumroad.com/l/everydaydata 10.12 Exploratory Data Analysis with R by Roger Peng This book teaches you to use R to effectively visualize and explore complex datasets. Exploratory data analysis is a key part of the data science process because it allows you to sharpen your question and refine your modeling strategies. This book is based on the industry-leading Johns Hopkins Data Science Specialization Paid: Free or Pay what you want $15 Link: https://leanpub.com/exdata 10.13 Introduction to Data Science by Rafael A Irizarry The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, algorithm building with caret, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation with knitr and R markdown. Bookdown version https://rafalab.github.io/dsbook/ Paid: Free or pay what you want $50 Link: https://leanpub.com/datasciencebook 10.14 Model-Based Clustering and Classification for Data Science by Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery Among the broad field of statistical and machine learning, model-based techniques for clustering and classification have a central position for anyone interested in exploiting those data. 
This textbook focuses on the recent developments in model-based clustering and classification while providing a comprehensive introduction to the field. It is aimed at advanced undergraduates, graduates or first year PhD students in data science, as well as researchers and practitioners. Link: https://math.unice.fr/~cbouveyr/MBCbook/ 10.15 Modern Data Science with R by Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton This book is intended for readers who want to develop the appropriate skills to tackle complex data science projects and think with data (as coined by Diane Lambert of Google). The desire to solve problems using data is at the heart of our approach. We acknowledge that it is impossible to cover all these topics in any level of detail within a single book: Many of the chapters could productively form the basis for a course or series of courses. Instead, our goal is to lay a foundation for analysis of real-world data and to ensure that analysts see the power of statistics and data analysis. After reading this book, readers will have greatly expanded their skill set for working with these data, and should have a newfound confidence about their ability to learn new technologies on-the-fly. This book was originally conceived to support a one-semester, 13-week undergraduate course in data science. We have found that the book will be useful for more advanced students in related disciplines, or analysts who want to bolster their data science skills. At the same time, Part I of the book is accessible to a general audience with no programming or statistics experience. Link: https://mdsr-book.github.io/mdsr2e/ 10.16 Modern Statistics with R by Måns Thulin This book covers the fundamentals of data science and statistics. The first half deals with the basics of R and R coding, data wrangling, exploratory data analysis and more advanced programming.
The second half deals with modern statistics (favouring permutation tests, the bootstrap and Bayesian methods over traditional asymptotic methods), regression models and predictive modelling. It also contains information about debugging and explanations of 25 commonly encountered error messages in R. In addition, there are 170 or so exercises with fully worked solutions. Link: http://www.modernstatisticswithr.com/ 10.17 Practical Data Science with R, Second Edition by Nina Zumel, John Mount Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You'll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support. Paid: Free preview $25 Link: https://www.manning.com/books/practical-data-science-with-r-second-edition#toc 10.18 R Data Science Quick Reference by Thomas Mailund In this book, you'll learn about the following APIs and packages that deal specifically with data science applications: readr, tibble, forcats, lubridate, stringr, tidyr, magrittr, dplyr, purrr, ggplot2, modelr, and more. Paid: $30 Link: https://amzn.to/2WN1mQy 10.19 R for Data Science by Hadley Wickham, Garrett Grolemund This book will teach you how to do data science with R: You'll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you'll learn how to clean data and draw plots, and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You'll learn how to use the grammar of graphics, literate programming, and reproducible research to save time.
You'll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data. Link: https://r4ds.had.co.nz/ 10.20 R for Data Science Solutions by Jeffrey B. Arnold Solutions for the Wickham and Grolemund R4DS book Link: https://jrnold.github.io/r4ds-exercise-solutions/ 10.21 R Programming for Data Science by Roger Peng This book is about the fundamentals of R programming. You will get started with the basics of the language, learn how to manipulate datasets, how to write functions, and how to debug and optimize code. With the fundamentals provided in this book, you will have a solid foundation on which to build your data science toolbox. Link: https://bookdown.org/rdpeng/rprogdatascience/ 10.22 The Art of Data Science by Roger D. Peng, Elizabeth Matsui A Guide for Anyone Who Works with Data This book describes the process of analyzing data. The authors have extensive experience both managing data analysts and conducting their own data analyses, and this book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science. Paid: Free (excl lecture videos) or pay what you want $15 Link: https://leanpub.com/artofdatascience 10.23 The Elements of Data Analytic Style by Jeffrey Leek Data analysis is at least as much art as it is science. This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the author's blog posts, lecture materials, and tutorials. Paid: Free or pay what you want $10 Link: https://leanpub.com/datastyle 10.24 Yet another R for Data Science study guide by Bryan Shalloway This book contains my solutions and notes to Garrett Grolemund and Hadley Wickham's excellent book, R for Data Science (Grolemund and Wickham 2017). R for Data Science (R4DS) is my go-to recommendation for people getting started in R programming, data science, or the tidyverse.
Link: https://brshallo.github.io/r4ds_solutions/ "],["data-visualization.html", "11 Data Visualization 11.1 A ggplot2 Tutorial for Beautiful Plotting in R 11.2 BBC Visual and Data Journalism cookbook for R graphics 11.3 Data Processing & Visualization 11.4 Data visualisation using R, for researchers who don't use R 11.5 Data Visualization - A practical introduction 11.6 Data Visualization in R 11.7 Data Visualization with R 11.8 Fundamentals of Data Visualization 11.9 ggplot2 in 2 11.10 ggplot2: Elegant Graphics for Data Analysis 11.11 Graphical Data Analysis with R 11.12 Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code 11.13 JavaScript for R 11.14 plotly Interactive web-based data visualization with R, plotly, and shiny 11.15 R Graphics Cookbook, 2nd edition 11.16 Solutions to ggplot2: Elegant Graphics for Data Analysis", " 11 Data Visualization 11.1 A ggplot2 Tutorial for Beautiful Plotting in R by Cédric Scherer (Oscar: Not a book per se, but it should be, so I'm adding it!) A mega tutorial of creating great ggplot2 visuals. Link: https://cedricscherer.netlify.app/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ 11.2 BBC Visual and Data Journalism cookbook for R graphics At the BBC data team, we have developed an R package and an R cookbook to make the process of creating publication-ready graphics in our in-house style using R's ggplot2 library a more reproducible process, as well as making it easier for people new to R to create graphics. Link: https://bbc.github.io/rcookbook/ 11.3 Data Processing & Visualization by Michael Clark This document provides some tools, demonstrations, and more to make data processing, programming, modeling, visualization, and presentation easier. While the programming language focus is on R, where applicable (which is most of the time), Python notebooks are also available.
Link: https://m-clark.github.io/data-processing-and-visualization/ 11.4 Data visualisation using R, for researchers who don't use R by Emily Nordmann, Phil McAleer, Wilhelmiina Toivo, Helena Paterson, Lisa DeBruine In this tutorial, we aim to provide a practical introduction to data visualisation using R, specifically aimed at researchers who have little to no prior experience of using R. First we detail the rationale for using R for data visualisation and introduce the grammar of graphics that underlies data visualisation using the ggplot package. The tutorial then walks the reader through how to replicate plots that are commonly available in point-and-click software such as histograms and boxplots, as well as showing how the code for these basic plots can be easily extended to less commonly available options such as violin-boxplots. Link: https://psyteachr.github.io/introdataviz/ 11.5 Data Visualization - A practical introduction by Kieran Healy This book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot. Link: https://socviz.co/ 11.6 Data Visualization in R by Brooke Anderson Workshop for the 2019 Navy and Marine Corps Public Health Conference. I have based this workshop on examples for you to try yourself, because you won't be able to learn how to program unless you try it out. I've picked example data that I hope will be interesting to Navy and Marine Corps public health researchers and practitioners. Link: https://geanders.github.io/navy_public_health/index.html#prerequisites 11.7 Data Visualization with R by Rob Kabacoff This book helps you create the most popular visualizations - from quick and dirty plots to publication-ready graphs. The text relies heavily on the ggplot2 package for graphics, but other approaches are covered as well.
Link: https://rkabacoff.github.io/datavis/ 11.8 Fundamentals of Data Visualization by Claus Wilke The book is meant as a guide to making visualizations that accurately reflect the data, tell a story, and look professional. Link: https://clauswilke.com/dataviz/ 11.9 ggplot2 in 2 by Lucy D'Agostino McGowan Really good overview of ggplot2. The premise is that you'll cover the fundamentals in 2 hours. Oscar Baruffa made a sped-up screencast while working through it. It did take 2 hours :). Paid: Pay what you want, minimum $4.99 $5 Link: https://leanpub.com/ggplot2in2 11.10 ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham ggplot2 is an R package for producing statistical, or data, graphics. Unlike most other graphics packages, ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. This makes ggplot2 powerful. Rather than being limited to sets of pre-defined graphics, you can create novel graphics that are tailored to your specific problem. Link: https://ggplot2-book.org/ 11.11 Graphical Data Analysis with R by Antony Unwin The main aim of the book is to show, using real datasets, what information graphical displays can reveal in data. The target readership includes anyone carrying out data analyses who wants to understand their data using graphics. The book is published by CRC Press and available to purchase, but all the examples and code are freely available on a comprehensive website accompanying the text at http://www.gradaanwr.net/ Link: http://www.gradaanwr.net/ 11.12 Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code by Jack Dougherty, Ilya Ilyankou (Oscar: looks like an amazing resource and includes code templates!)
In this book, youll learn how to create true and meaningful data visualizations through chapters that blend design principles and step-by-step tutorials, in order to make your information-based analysis and arguments more insightful and compelling. Just as sentences become more persuasive with supporting evidence and source notes, your data-driven writing becomes more powerful when paired with appropriate tables, charts, or maps. Words tell us stories, but visualizations show us data stories by transforming quantitative, relational, or spatial patterns into images. When visualizations are well-designed, they draw our attention to what is most important in the data in ways that would be difficult to communicate through text alone. Link: https://handsondataviz.org/ 11.13 JavaScript for R by John Coene Learn how to build your own data visualisation packages, improve shiny with JavaScript, and use JavaScript for computations. Link: https://javascript-for-r.com 11.14 plotly Interactive web-based data visualization with R, plotly, and shiny by Carson Sievert In this book, youll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R. It makes heavy use of plotly for rendering graphics, but youll also learn about other R packages that augment a data science workflow, such as the tidyverse and shiny. Along the way, youll gain insight into best practices for visualization of high-dimensional data, statistical graphics, and graphical perception. Link: https://plotly-r.com/ 11.15 R Graphics Cookbook, 2nd edition by Winston Chang The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data. 
Link: https://r-graphics.org/ 11.16 Solutions to ggplot2: Elegant Graphics for Data Analysis by Howard Baek This is the website for Solutions to ggplot2: Elegant Graphics for Data Analysis, a solution manual to the exercises in the 3rd edition of ggplot2: Elegant Graphics for Data Analysis, written by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen. While there are bookdown solution manuals to Hadley Wickhams Advanced R and Mastering Shiny, there is no such thing for the ggplot2 book. This website is an attempt to fill this missing void. Link: https://ggplot2-book-solutions-3ed.netlify.app/index.html "],["field-specific.html", "12 Field specific 12.1 An introduction to quantitative analysis of political data in R 12.2 Analyzing Financial and Economic Data with R 12.3 Computer-age Calculus with R 12.4 Crime by the Numbers: A Criminologists Guide to R 12.5 Cryptocurrency Research: Open Source R Tutorial 12.6 Data Science in Education Using R 12.7 Data Skills for Reproducible Science 12.8 Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data 12.9 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python 12.10 Handbook of Regression Modeling in People Analytics 12.11 How to be a modern scientist 12.12 Introduction to Econometrics with R 12.13 Learning Microeconometrics with R 12.14 Machine Learning for Factor Investing 12.15 Public Policy Analytics: Code & Context for Data Science in Government 12.16 R for Excel users 12.17 R for SEO 12.18 R for Water Resources Data Science 12.19 R Programming with Minecraft 12.20 Technical Foundations of Informatics", " 12 Field specific 12.1 An introduction to quantitative analysis of political data in R by Erik Gahner Larsen, Zoltán Fazekas In this book, we aim to provide an easily accessible introduction to R for the collection, study and presentation of different types of political data. 
Specifically, the book will teach you how to get different types of political data into R and manipulate, analyze and visualize the output. In doing this, we will not only teach you how to get existing data into R, but also how to collect your own data. Link: http://qpolr.com/ 12.2 Analyzing Financial and Economic Data with R by Marcelo S. Perlin Not surprisingly, fields with abundant access to data and practical applications, such as economics and finance, it is expected that a graduate student or a data analyst has learned at least one programming language that allows him/her to do his work efficiently. Learning how to program is becoming a requisite for the job market. Link: https://www.msperlin.com/afedR/ 12.3 Computer-age Calculus with R by Daniel Kaplan R is closely associated with statistics, but not with calculus. It turns out that R is an excellent language for doing calculus. This book shows how to do common calculus calculations using R. Link: https://dtkaplan.github.io/RforCalculus/ 12.4 Crime by the Numbers: A Criminologists Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com 12.5 Cryptocurrency Research: Open Source R Tutorial by Riccardo (Ricky) Esclapon, John Chandler Johnson, Kai R. Larsen The tutorial is in R. For those without experience programming in R we have a high-level version to help you learn before attempting the full version. Scroll down for a breakdown of the individual sections for an overview of what you will learn throughout. 
You will get more familiar with tools from the tidyverse, including dplyr, ggplot2, tibble and purrr. These tools provide an excellent complete ecosystem to do data science in R. You will learn to create machine learning models and how to fairly assess their performance. Cryptocurrency Data: You will learn these tools analyzing the latest cryptocurrency data. The tutorial automatically refreshes every 12 hours and the data is publicly available and refreshed hourly. Link: https://cryptocurrencyresearch.org/ 12.6 Data Science in Education Using R by Ryan A. Estrellado, Emily A. Bovee, Jesse Mostipak, Isabella C. Velásquez Dear Data Scientists, Educators, and Data Scientists who are Educators: This book is a warm welcome and an invitation. If youre a data scientist in education or an educator in data science, your role isnt exactly straightforward. This book is our contribution to a growing movement to merge the paths of data analysis and education. We wrote this book to make your first step on that path a little clearer and a little less scary. Link: https://datascienceineducation.com/ 12.7 Data Skills for Reproducible Science by PsyTeachR team, University of Glasgow This course provides an overview of skills needed for reproducible research and open science using the statistical programming language R. Students will learn about data visualisation, data tidying and wrangling, archiving, iteration and functions, probability and data simulations, general linear models, and reproducible workflows. Learning is reinforced through weekly assignments that involve working with different types of data. Link: https://psyteachr.github.io/msc-data-skills/ 12.8 Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data by Michael Friendly, David Meyer Presents an applied treatment of modern methods for the analysis of categorical data, both discrete response data and frequency data. 
It explains how to use graphical methods for exploring data, spotting unusual features, visualizing fitted models, and presenting results. Paid: $80 Link: http://ddar.datavis.ca/ 12.9 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python by Keith McNulty The technology of graphs is all around us, and enables so many of the ways in which we live our lives today. That same technology is also available to us at no cost as an analytic tool to allow us to better understand network structures and dynamics in the fields of science, technology, economics, sociology and psychology to name just a few. It is available to academics and practitioners alike, and can be used on problems ranging from a very small network analysis which takes a few minutes on a laptop, to massive scale network mining requiring days or weeks of processing time. But heres the problem: few people really know how to do network analysis. It is still considered by many as a deep specialism or even a dark art. It shouldnt be. This book aims to make the field of graph and network analysis more approachable to students and professionals by explaining the most important elements of theory and sharing common methodologies using open source programming languages like R and Python. It does so by explaining theory in as much detail as is necessary to support analytical curiosity and interpretation, and by using a wide array of example data sets and code snippets to demonstrate the specific implementation and interpretation of methodologies. Link: https://ona-book.org/ 12.10 Handbook of Regression Modeling in People Analytics by Keith McNulty It is the authors firm belief that all people analytics professionals should have a strong understanding of regression models and how to implement and interpret them in practice, and the aim with this book is to provide those who need it with help in getting there. 
For accompanying solutions to some of the questions: https://keithmcnulty.github.io/peopleanalytics-regression-book/solutions/ Link: http://peopleanalytics-regression-book.org/index.html 12.11 How to be a modern scientist by Jeffrey Leek A book about how to be a scientist the modern, open-source way. The face of academia is changing. It is no longer sufficient to just publish or perish. We are now in an era where Twitter, Github, Figshare, and Alt Metrics are regular parts of the scientific workflow. Here I give high level advice about which tools to use, how to use them, and what to look out for. This book is appropriate for scientists at all levels who want to stay on top of the current technological developments affecting modern scientific careers. Paid: Free or pay what you want $10 Link: https://leanpub.com/modernscientist 12.12 Introduction to Econometrics with R by Christoph Hanck, Martin Arnold, Alexander Gerber, Martin Schmelzer Instead of confronting students with pure coding exercises and complementary classic literature like the book by Venables & Smith (2010), we figured it would be better to provide interactive learning material that blends R code with the contents of the well-received textbook Introduction to Econometrics by Stock & Watson (2015) which serves as a basis for the lecture. Link: https://www.econometrics-with-r.org/ 12.13 Learning Microeconometrics with R by Christopher P. Adams This book provides an introduction to the field of microeconometrics through the use of R. The focus is on applying current learning from the field to real world problems. It uses R to both teach the concepts of the field and show the reader how the techniques can be used. It is aimed at the general reader with the equivalent of a bachelors degree in economics, statistics or some more technical field. It covers the standard tools of microeconometrics, OLS, instrumental variables, Heckman selection and difference in difference. 
In addition, it introduces bounds, factor models, mixture models and empirical Bayesian analysis. Paid: $100 Link: https://www.routledge.com/Learning-Microeconometrics-with-R/Adams/p/book/9780367255381 12.14 Machine Learning for Factor Investing by Guillaume Coqueret, Tony Guida This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics. Link: http://www.mlfactor.com/ 12.15 Public Policy Analytics: Code & Context for Data Science in Government by Ken Steif, Ph.D The goal of this book is to make data science accessible to social scientists and City Planners, in particular. I hope to convince readers that one with strong domain expertise plus intermediate data skills can have a greater impact in government than the sharpest computer scientist who has never studied economics, sociology, public health, political science, criminology etc. Link: https://urbanspatial.github.io/PublicPolicyAnalytics/ 12.16 R for Excel users by Julie Lowndes, Allison Horst This course is for Excel users who want to add or integrate R and RStudio into their existing data analysis toolkit. It is a friendly intro to becoming a modern R user, full of tidyverse, RMarkdown, GitHub, collaboration & reproducibility. Link: https://rstudio-conf-2020.github.io/r-for-excel/ 12.17 R for SEO by François Joly Even though R is a terrific option for SEO, there are simply not enough resources out there. This guide is not here to deliver a course about R, there are plenty already. This guide is meant to be as practical as possible. How things should be done in an R-ish way is not the purpose of this guide. Grab what you want to grab and feel free to submit your own solution. 
Link: https://www.rforseo.com/ 12.18 R for Water Resources Data Science by Ryan Peek, Rich Pauloo Consists of 2 courses Introductory: This course is most relevant and targeted at folks who work with data, from analysts and program staff to engineers and scientists. This course provides an introduction to the power and possibility of a reproducible programming language (R) by demonstrating how to import, explore, visualize, analyze, and communicate different types of data. Using water resources based examples, this course guides participants through basic data science skills and strategies for continued learning and use of R. Intermediate: In this course, we will move more quickly, assume familiarity with basic R skills, and also assume that the participant has working experience with more complex workflows, operations, and code-bases. Each module in this course functions as a stand-alone lesson, and can be read linearly, or out of order according to your needs and interests. Each module doesnt necessarily require familiarity with the previous module. This course emphasizes intermediate scripting skills like iteration, functional programming, writing functions, and controlling project workflows for better reproducibility and efficiency. Approaches to working with more complex data structures like lists and timeseries data, the fundamentals of building Shiny Apps, pulling water resources data from APIs, intermediate mapmaking and spatial data processing, integrating version control in projects with git. Link: https://www.r4wrds.com/ 12.19 R Programming with Minecraft by Brooke Anderson, Karl Broman, Gergely Daróczi, Mario Inchiosa, David Smith, Ali Zaidi Minecraft is awesome fun, especially in creative mode, where you can build all sorts of crazy stuff. But ambitious building projects can be really tedious to create by hand. With the miner R package, you can write R code to manipulate your Minecraft world and create even more awesome stuff. 
Heres an introduction Rstats NYC conference talk on it: https://www.youtube.com/watch?v=r_JgPF8MJpY Link: https://kbroman.org/miner_book/?s=09 12.20 Technical Foundations of Informatics by Michael Freeman, Joel Ross This book covers the foundation skills necessary to start writing computer programs to work with data using modern and reproducible techniques. It requires no technical background. These materials were developed for the INFO 201: Technical Foundations of Informatics course taught at the University of Washington Information School; however they have been structured to be an online resource for anyone hoping to learn to work with information using programmatic approaches. Link: https://info201.github.io/ "],["geospatial.html", "13 Geospatial 13.1 Geocomputation with R 13.2 Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny 13.3 Introduction to Spatial Data Programming with R 13.4 Predictive Soil Mapping with R 13.5 Spatial Data Science 13.6 Spatial Microsimulation with R 13.7 Spatial Modelling for Data Scientists 13.8 Using R for Digital Soil Mapping", " 13 Geospatial 13.1 Geocomputation with R by Robin Lovelace, Jakub Nowosad, Jannes Muenchow This is the online home of Geocomputation with R, a book on geographic data analysis, visualization and modeling. Link: https://geocompr.robinlovelace.net/ 13.2 Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny by Paula Moraga This book describes spatial and spatio-temporal statistical methods and visualization techniques to analyze georeferenced health data in R. After a detailed introduction of geospatial data, the book shows how to develop Bayesian hierarchical models for disease mapping and apply computational approaches such as the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) to analyze areal and geostatistical data. 
Link: https://www.paulamoraga.com/book-geospatial/ 13.3 Introduction to Spatial Data Programming with R by Michael Dorman This book introduces processing and analysis methods for working with spatial data in R. The book is composed of two parts. The first part gives an overview of the basic syntax and usage of the R language, required before we can start working with spatial data. The second part then covers spatial data workflows, including how to process rasters, vector layers, and both of them together, as well as two selected advanced topics: spatio-temporal data and spatial interpolation. Link: https://geobgu.xyz/r 13.4 Predictive Soil Mapping with R by Tom Hengl, Robert A. MacMillan Predictive Soil Mapping (PSM) with R explains how to import, process and analyze soil data in R using the state-of-the-art soil and Machine Learning packages with ultimate objective to produce most objective spatial predictions of soil numeric and factor-type variables. Especial focus has been put on using R in combination with the Open Source GIS such as GDAL, SAGA GIS and similar, and on using Machine Learning packages ranger, xgboost, SuperLearner and similar. This book is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contributions of new chapters are welcome. Link: https://soilmapper.org 13.5 Spatial Data Science by Edzer Pebesma, Roger Bivand This book introduces and explains the concepts underlying spatial data: points, lines, polygons, rasters, coverages, geometry attributes, data cubes, reference systems, as well as higher-level concepts including how attributes relate to geometries and how this affects analysis. Link: https://keen-swartz-3146c4.netlify.app/ 13.6 Spatial Microsimulation with R by Robin Lovelace, Morgane Dumont Imagine a world in which data on companies, households and governments were widely available.
Imagine, further, that researchers and decision-makers acting in the public interest had tools enabling them to test and model such data to explore different scenarios of the future. People would be able to make more informed decisions, based on the best available evidence. In this technocratic dreamland pressing problems such as climate change, inequality and poor human health could be solved. These are the types of real-world issues that we hope the methods in this book will help to address. Spatial microsimulation can provide new insights into complex problems and, ultimately, lead to better decision-making. By shedding new light on existing information, the methods can help shift decision-making processes away from ideological bias and towards evidence-based policy. Link: https://spatial-microsim-book.robinlovelace.net/index.html 13.7 Spatial Modelling for Data Scientists by Francisco Rowe, Dani Arribas-Bel This is the website for Spatial Modeling for Data Scientists. This is a course taught by Dr. Francisco Rowe and Dr. Dani Arribas-Bel in the Second Semester of 2020/21 at the University of Liverpool, United Kingdom. You will learn how to analyse and model different types of spatial data as well as gaining an understanding of the various challenges arising from manipulating such data. 
Link: https://gdsl-ul.github.io/san/ 13.8 Using R for Digital Soil Mapping by Malone, Brendan P., Minasny, Budiman, McBratney, Alex B Describes in detail, with ample exercises, how digital soil mapping is done This work includes a number of work-flows that direct users how to create digital soil maps for their own projects This work includes tutorials for users to learn the fundamentals of R, but with a focus on how to use it for digital soil mapping Paid: $90 Link: https://www.springer.com/gp/book/9783319443256 "],["getting-cleaning-and-wrangling-data.html", "14 Getting, cleaning and wrangling data 14.1 21 Recipes for Mining Twitter Data with rtweet 14.2 A Beginners Guide to Clean Data 14.3 Spreadsheet Munging Strategies 14.4 Text Mining with R 14.5 Text Mining With Tidy Data Principles", " 14 Getting, cleaning and wrangling data 14.1 21 Recipes for Mining Twitter Data with rtweet by Bob Rudis The recipes contained in this book use the rtweet package by Michael W. Kearney. Link: https://rud.is/books/21-recipes/ 14.2 A Beginners Guide to Clean Data by Benjamin Greve This book will help you to become a better data scientist by showing you the things that can go wrong when working with data - particularly low-quality data. A key difference between a junior and a senior data scientist is the awareness of potential pitfalls. The experienced data scientist will expect them, navigate around them and avoid costly iteration cycles. After reading this book, you will be able to spot data quality problems and deal with them before they can break your work, saving yourself a lot of time. Link: https://b-greve.gitbook.io/beginners-guide-to-clean-data/ 14.3 Spreadsheet Munging Strategies by Duncan Garmonsway This is a work-in-progress book about getting data out of spreadsheets, no matter how peculiar. The book is designed primarily for R users who have to extract data from spreadsheets and who are already familiar with the tidyverse. 
It has a cookbook structure, and can be used as a reference, but readers who begin in the middle might have to work backwards from time to time. Link: https://nacnudus.github.io/spreadsheet-munging-strategies/ 14.4 Text Mining with R by Julia Silge, David Robinson This book serves as an introduction of text mining using the tidytext package and other tidy tools in R. The functions provided by the tidytext package are relatively simple; what is important are the possible applications. Thus, this book provides compelling examples of real text mining problems. Link: https://www.tidytextmining.com/ 14.5 Text Mining With Tidy Data Principles by Julia Silge Text data sets are diverse and ubiquitous, and tidy data principles provide an approach to make text mining easier, more effective, and consistent with tools already in wide use. In this tutorial, you will develop your text mining skills using the tidytext package in R, along with other tidyverse tools. Link: https://juliasilge.shinyapps.io/learntidytext/ "],["journalism.html", "15 Journalism 15.1 Practical R for Mass Communication and Journalism 15.2 Using R for Data Journalism", " 15 Journalism 15.1 Practical R for Mass Communication and Journalism by Sharon Machlis Welcome to this excerpt from Practical R for Mass Communication and Journalism. In these sample chapters, youll: learn how to find your way around R and RStudio, see how much you can do in just a few lines of code, start doing some basic data exploration, and get some ideas and sample code for using R in analyzing election results. I hope you find this excerpt useful! If you do and would like to read more, you can order the complete book from CRC Press or Amazon. Paid: Free samples $55 Link: http://www.machlis.com/R4Journalists/index.html 15.2 Using R for Data Journalism by Andrew Ba Tran This site will help you learn how to use the statistical computing and graphics language R to enhance your data analysis and reporting process. 
It was originally part of a free MOOC offered by the Knight Center at the University of Texas Link: https://learn.r-journalism.com/en/ "],["life-sciences.html", "16 Life Sciences 16.1 An Open Compendium of Soil Datasets 16.2 Assigning cell types with SingleR 16.3 Computational Genomics with R 16.4 Data Analysis and Visualization in R for Ecologists 16.5 Data Analysis for the Life Sciences 16.6 Data Science for the Biomedical Sciences 16.7 Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility 16.8 Git and Github for Advanced Ecological Data Analysis 16.9 Hydroinformatics at VT 16.10 Introduction to Data Analysis with R 16.11 Modern Statistics for Modern Biology 16.12 Numerical Ecology with R 16.13 Orchestrating Single-Cell Analysis with Bioconductor 16.14 R for applied epidemiology and public health 16.15 R for Conservation and Development Projects: A Primer for Practitioners 16.16 R for Health Data Science 16.17 Reproducible Medical Research with R 16.18 Statistics in R for Biodiversity Conservation Paperback 16.19 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling 16.20 WEHI Intro to Tidy R Course", " 16 Life Sciences 16.1 An Open Compendium of Soil Datasets by Tomislav Hengl (Not R specific but looks really relevant) This is a public compendium of global, regional, national and sub-national soil samples and/or soil profile datasets (points with Observations and Measurements of soil properties and characteristics). Datasets listed here, assuming compatible open license, are afterwards imported into the Global compilation of soil chemical and physical properties and soil classes and eventually used to create a better open soil information across countries. 
The specific objectives of this initiative are: To enable data digitization, import and binding + harmonization, To accelerate research collaboration and networking, To enable development of more accurate / more usable global and regional soil property and class maps (typically published via https://OpenLandMap.org), Link: https://opengeohub.github.io/SoilSamples/ 16.2 Assigning cell types with SingleR by Aaron Lun and contributors This book covers the use of SingleR, one implementation of an automated annotation method for cell type annotation. Link: https://bioconductor.org/books/3.12/SingleRBook/ 16.3 Computational Genomics with R by Altuna Akalin The aim of this book is to provide the fundamentals for data analysis for genomics. We developed this book based on the computational genomics courses we are giving every year. Link: http://compgenomr.github.io/book/ 16.4 Data Analysis and Visualization in R for Ecologists by François Michonneau, Auriel Fournier Data Carpentrys aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about R syntax, the RStudio interface, and move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R. This lesson assumes no prior knowledge of R or RStudio and no programming experience. 
Link: https://datacarpentry.org/R-ecology-lesson/ 16.5 Data Analysis for the Life Sciences by Rafael A Irizarry, Michael I Love Data analysis is now part of practically every research project in the life sciences. In this book we use data and computer code to teach the necessary statistical concepts and programming skills to become a data analyst. Instead of showing theory first and then applying it to toy examples, we start with actual applications. http://genomicsclass.github.io/book/ Paid: Free or pay what you want $40 Link: https://leanpub.com/dataanalysisforthelifesciences 16.6 Data Science for the Biomedical Sciences by Daniel Chen, Anne Brown We hope this book provides a gentle introduction to data science. The main goal is to understand how to work with spreadsheet data and how data can be manipulated for multiple purposes. If nothing else, the book hopes to help you plan how to structure your own datasets for your own analysis. Even if you never go on to program on your own, understanding the way data can be manipulated and having a plan for your own dataset in the processing pipeline will go a long way when learning and doing the analysis on your own, and/or working with colleagues and collaborators on a project. Link: https://ds4biomed.tech/ 16.7 Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility by Stanley E. Lazic This practical guide shows biologists how to design reproducible experiments that have low bias, high precision, and results that are widely applicable. With specific examples using both cell cultures and model organisms, it shows how to plan a successful experiment. It demonstrates how to control biological and technical factors that can introduce bias or add noise, and covers rarely discussed topics such as graphical data exploration, choosing outcome variables, data quality control checks, and data pre-processing. 
It also shows how to use R for analysis, and is designed for those with no prior experience. This is an ideal guide for anyone conducting lab-based biological research. Paid: $52 Link: https://stanlazic.github.io/EDLB.html 16.8 Git and Github for Advanced Ecological Data Analysis by Alexa Fredston This material was prepared for a three-hour virtual session to teach Git and Github to a graduate-level course on Advanced Ecological Data Analysis taught at Rutgers University by Malin Pinsky and Rachael Winfree. (However, the only course-specific material is Section 4; the rest should be applicable to any reader.) Link: https://afredston.github.io/learn-git/learn-git.htm 16.9 Hydroinformatics at VT by JP Gannon This bookdown contains the notes and most exercises for a course on data analysis techniques in hydrology using the programming language R. The material will be updated each time the course is taught. If new topics are added, the topics they replace will be left, in case they are useful to others. Link: https://vt-hydroinformatics.github.io/ 16.10 Introduction to Data Analysis with R by Jannik Buhr This is a video lecture series with an accompanying lecture script that is designed to read much like a book. The lecture is held in English for biochemists at Heidelberg University, Germany, but the examples covered are not specific to life sciences in order to enable a focus on learning the techniques with R. Link: https://jmbuhr.de/dataIntro20 16.11 Modern Statistics for Modern Biology by Susan Holmes, Wolfgang Huber The aim of this book is to enable scientists working in biological research to quickly learn many of the important ideas and methods that they need to make the best of their experiments and of other available data. 
Link: https://www.huber.embl.de/msmb/ 16.12 Numerical Ecology with R by Daniel Borcard, François Gillet, Pierre Legendre This new edition of Numerical Ecology with R guides readers through an applied exploration of the major methods of multivariate data analysis, as seen through the eyes of three ecologists. It provides a bridge between a textbook of numerical ecology and the implementation of this discipline in the R language. The book begins by examining some exploratory approaches. Paid: $60 Link: https://www.springer.com/us/book/9783319714035 16.13 Orchestrating Single-Cell Analysis with Bioconductor by Aaron Lun, Robert Amezquita, Stephanie Hicks, Raphael Gottardo This is the website for Orchestrating Single-Cell Analysis with Bioconductor, a book that teaches users some common workflows for the analysis of single-cell RNA-seq data (scRNA-seq). Link: https://osca.bioconductor.org/ 16.14 R for applied epidemiology and public health by EpiR authors This handbook is produced by a collaboration of epidemiologists from around the world drawing upon experience with organizations including local, state, provincial, and national health agencies, the World Health Organization (WHO), Médecins Sans Frontières / Doctors without Borders (MSF), hospital systems, and academic institutions. Written by epidemiologists, for epidemiologists. Link: https://epirhandbook.com/ 16.15 R for Conservation and Development Projects: A Primer for Practitioners by Nathan Whitmore This book is aimed at conservation and development practitioners who need to learn and use R in a part-time professional context. It gives people with a non-technical background a set of skills to graph, map, and model in R. It also provides background on data integration in project management and covers fundamental statistical concepts. The book aims to demystify R and give practitioners the confidence to use it. 
Key Features: Viewing data science as part of a greater knowledge and decision making system; Foundation sections on inference, evidence, and data integration; Plain English explanations of R functions; Relatable examples which are typical of activities undertaken by conservation and development organisations in the developing world; Worked examples showing how data analysis can be incorporated into project reports. Paid: $60 Link: https://www.routledge.com/R-for-Conservation-and-Development-Projects-A-Primer-for-Practitioners/Whitmore/p/book/9780367205485 16.16 R for Health Data Science by Ewan Harrison, Riinu Pius In this age of information, the manipulation, analysis and interpretation of data have become a fundamental part of professional life. Nowhere more so than in the delivery of healthcare. From the understanding of disease and the development of new treatments, to the diagnosis and management of individual patients, the use of data and technology are now an integral part of the business of healthcare. Those working in healthcare interact daily with data, often without realising it. The conversion of this avalanche of information to useful knowledge is essential for high-quality patient care. An important part of this information revolution is the opportunity for everybody to become involved in data analysis. This democratisation is driven in part by the open source software movement: no longer do we require expensive specialised software to do this. The statistical programming language, R, is firmly at the heart of this. This book will take an individual with little or no experience in data science all the way through to the execution of sophisticated analyses. We emphasise the importance of truly understanding the underlying data with liberal use of plotting, rather than relying on opaque and possibly poorly understood statistical tests. 
There are numerous examples included that can be adapted for your own data, together with our own R packages with easy-to-use functions. Link: https://argoshare.is.ed.ac.uk/healthyr_book/ 16.17 Reproducible Medical Research with R by Peter D.R. Higgins, MD, PhD, MSc This is a book for anyone in the medical field interested in analyzing the data available to them to better understand health, disease, or the delivery of care. This could include nurses, dieticians, psychologists, and PhDs in related fields, as well as medical students, residents, fellows, or doctors in practice. I expect that most learners will be using this book in their spare time at night and on weekends, as the health training curricula are already packed full of information, and there is no room to add skills in reproducible research to the standard curriculum. This book is designed for self-teaching, and many hints and solutions will be provided to avoid roadblocks and frustration. Many learners find themselves wanting to develop reproducible research skills after they have finished their training, and after they have become comfortable with their clinical role. This is the time when they identify and want to address problems faced by patients in their practice with the data they have before them. This book is for you. Link: https://bookdown.org/pdr_higgins/rmrwr/ 16.18 Statistics in R for Biodiversity Conservation Paperback by Carl Smith, Antonio Uzal, Mark Warren A practical handbook to introduce data analysis and model fitting using R to ecologists and conservation biologists. The book is aimed at undergraduate and post-graduate students and provides access to datasets and RScript. Paid: $10 Link: https://www.amazon.co.uk/dp/B08HBLYHQL/ref=cm_sw_r_cp_apa_i_g0luFb86PXJ9Z 16.19 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling by Andrew B. Lawson Progressively more and more attention has been paid to how location affects health outcomes. 
The area of disease mapping focusses on these problems, and the Bayesian paradigm has a major role to play in the understanding of the complex interplay of context and individual predisposition in such studies of disease. Using R for Bayesian Spatial and Spatio-Temporal Health Modeling provides a major resource for those interested in applying Bayesian methodology in small area health data studies. Paid: $100 Link: https://www.routledge.com/Using-R-for-Bayesian-Spatial-and-Spatio-Temporal-Health-Modeling/Lawson/p/book/9780367490126 16.20 WEHI Intro to Tidy R Course by Brendan Ansell A complete beginners introduction to tidy R for data transformation, visualization and analysis automation with applications in experimental biology. This book is based on a short course developed for biomedical scientists at the WEHI Medical Research Institute. The content is designed to make learners comfortable with using R for exploratory analysis of large data sets, but does not cover statistics. The material and teaching examples draw on popular (non-biological) data sets, as well as gene expression and drug screening data types. Link: https://bookdown.org/ansellbr/WEHI_tidyR_course_book/ "],["machine-learning.html", "17 Machine Learning 17.1 A Minimal rTorch Book 17.2 Explanatory Model Analysis 17.3 Feature Engineering and Selection: A Practical Approach for Predictive Models 17.4 Hands-On Machine Learning with R 17.5 Interpretable Machine Learning 17.6 Lightweight Machine Learning Classics with R Marek Gagolewski 17.7 Machine Learning for Factor Investing 17.8 Mathematics and Programming for Machine Learning with R: From the Ground Up 1st Edition, Kindle 17.9 mlr3 book 17.10 Supervised Machine Learning for Text Analysis in R 17.11 The caret Package 17.12 Tidy Modeling with R", " 17 Machine Learning 17.1 A Minimal rTorch Book by Alfonso R. Reyes Practically, you can do everything you could with PyTorch within the R ecosystem. 
Link: https://f0nzie.github.io/rtorch-minimal-book/ 17.2 Explanatory Model Analysis by Przemyslaw Biecek, Tomasz Burzykowski Responsible, Fair and Explainable Predictive Modeling with examples in R and Python Link: https://pbiecek.github.io/ema/ 17.3 Feature Engineering and Selection: A Practical Approach for Predictive Models by Max Kuhn, Kjell Johnson The goals of Feature Engineering and Selection are to provide tools for re-representing predictors, to place these tools in the context of a good predictive modeling framework, and to convey our experience of utilizing these tools in practice. Link: http://www.feat.engineering/index.html 17.4 Hands-On Machine Learning with R by Bradley Boehmke, Brandon Greenwell This book provides hands-on modules for many of the most common machine learning methods to include: Generalized low rank models, Clustering algorithms, Autoencoders, Regularized models, Random forests, Gradient boosting machines, Deep neural networks, Stacking / super learners and more! Link: https://bradleyboehmke.github.io/HOML/ 17.5 Interpretable Machine Learning by Christoph Molnar A Guide for Making Black Box Models Explainable Online book Paid: Free or pay what you want $42 Link: https://leanpub.com/interpretable-machine-learning 17.6 Lightweight Machine Learning Classics with R Marek Gagolewski In this book we will take an unpretentious glance at the most fundamental algorithms that have stood the test of time and which form the basis for state-of-the-art solutions of modern AI, which is principally (big) data-driven. Link: https://lmlcr.gagolewski.com/ 17.7 Machine Learning for Factor Investing by Guillaume Coqueret, Tony Guida This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics. Link: http://www.mlfactor.com/ 17.8 Mathematics and Programming for Machine Learning with R: From the Ground Up 1st Edition, Kindle by William B. 
Claster Based on the authors experience in teaching data science for more than 10 years, Mathematics and Programming for Machine Learning with R: From the Ground Up reveals how machine learning algorithms do their magic and explains how these algorithms can be implemented in code. It is designed to provide readers with an understanding of the reasoning behind machine learning algorithms as well as how to program them. Written for novice programmers, the book progresses step-by-step, providing the coding skills needed to implement machine learning algorithms in R. Paid: $40 Link: https://www.amazon.com/Mathematics-Programming-Machine-Learning-Ground-ebook-dp-B08JHDCX9Y/dp/B08JHDCX9Y 17.9 mlr3 book by Michel Lang The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R. Link: https://mlr3book.mlr-org.com/ 17.10 Supervised Machine Learning for Text Analysis in R by Emil Hvitfeldt, Julia Silge Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us. We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice. Link: https://smltar.com/ 17.11 The caret Package by Max Kuhn The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. 
Link: https://topepo.github.io/caret/index.html 17.12 Tidy Modeling with R by Max Kuhn, Julia Silge This book provides an introduction to how to use the tidymodels suite of packages to create models using a tidyverse approach and encourages good methodology and statistical practice throughout, demonstrated using a series of applied examples. Link: https://www.tmwr.org/ "],["network-analysis.html", "18 Network analysis 18.1 Awesome network analysis 18.2 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python 18.3 Network Analysis in R Cookbook 18.4 Statistical Analysis of Network Data with R", " 18 Network analysis 18.1 Awesome network analysis Not a book, but a compendium of resources that look really valuable. Link: https://github.com/briatte/awesome-network-analysis 18.2 Handbook of Graphs and Networks in People Analytics: With Examples in R and Python by Keith McNulty The technology of graphs is all around us, and enables so many of the ways in which we live our lives today. That same technology is also available to us at no cost as an analytic tool to allow us to better understand network structures and dynamics in the fields of science, technology, economics, sociology and psychology to name just a few. It is available to academics and practitioners alike, and can be used on problems ranging from a very small network analysis which takes a few minutes on a laptop, to massive scale network mining requiring days or weeks of processing time. But here's the problem: few people really know how to do network analysis. It is still considered by many as a deep specialism or even a dark art. It shouldn't be. This book aims to make the field of graph and network analysis more approachable to students and professionals by explaining the most important elements of theory and sharing common methodologies using open source programming languages like R and Python. 
It does so by explaining theory in as much detail as is necessary to support analytical curiosity and interpretation, and by using a wide array of example data sets and code snippets to demonstrate the specific implementation and interpretation of methodologies. Link: https://ona-book.org/ 18.3 Network Analysis in R Cookbook by Sacha Epskamp [Oscar Baruffa: Note this resource is a bit out of date, but because there are so few available on this topic, and it might still be good as a reference, itll stay in Big Book of R for now.] Link: https://web.archive.org/web/20210414173702/http://sachaepskamp.com/files/Cookbook.html 18.4 Statistical Analysis of Network Data with R by Kolaczyk, Eric D., Csárdi, Gábor This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. Paid: $65 Link: https://www.springer.com/us/book/9781493909834#otherversion=9781493909827 "],["packages.html", "19 Packages 19.1 A Minimal Book Example 19.2 A Minimal rTorch Book 19.3 ComplexHeatmap Complete Reference 19.4 Create, Publish, and Analyze Personal Websites Using R and RStudio 19.5 data.table in R The Complete Beginners Guide 19.6 ggplot2: Elegant Graphics for Data Analysis 19.7 GT Cookbook 19.8 Highcharter Cookbook 19.9 knitr 19.10 mlr3 book 19.11 The caret Package 19.12 The Data Validation Cookbook 19.13 The lidR package 19.14 The targets R Package User Manual 19.15 The Tidyverse Cookbook", " 19 Packages 19.1 A Minimal Book Example This is a sample book written in Markdown. Link: https://benmarwick.github.io/bookdown-ort/ 19.2 A Minimal rTorch Book by Alfonso R. Reyes Practically, you can do everything you could with PyTorch within the R ecosystem. 
Link: https://f0nzie.github.io/rtorch-minimal-book/ 19.3 ComplexHeatmap Complete Reference by Zuguang Gu The ComplexHeatmap package is used to generate heatmap visualizations. It is a highly flexible tool to arrange multiple heatmaps and supports various annotation graphics for high-dimensional data. These visualizations are efficient to visualize associations between different sources of data sets and reveal potential patterns. This book contains the full documentation for using the ComplexHeatmap package effectively, with plenty of small and complex examples to help you create your own complex heatmap data visualization. Link: https://jokergoo.github.io/ComplexHeatmap-reference/book/ 19.4 Create, Publish, and Analyze Personal Websites Using R and RStudio by Danny Morris A free, digital handbook with step-by-step instructions for launching your own personal website using R, RStudio, and other freely available technologies including GitHub, Hugo, Netlify, and Google Analytics. Link: https://r4sites-book.netlify.app/ 19.5 data.table in R The Complete Beginners Guide by Selva Prabhakaran data.table is a package used for working with tabular data in R. It provides the efficient data.table object which is a much improved version of the default data.frame. It is super fast and has intuitive and terse syntax. If you know R language and haven't picked up the data.table package yet, then this tutorial guide is a great place to start. Link: https://www.machinelearningplus.com/data-manipulation/datatable-in-r-complete-guide/ 19.6 ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham ggplot2 is an R package for producing statistical, or data, graphics. Unlike most other graphics packages, ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. This makes ggplot2 powerful. 
Rather than being limited to sets of pre-defined graphics, you can create novel graphics that are tailored to your specific problem. Link: https://ggplot2-book.org/ 19.7 GT Cookbook by Thomas Mock This cookbook attempts to walk through many of the example use cases for gt, and provide useful commentary around the use of the various gt functions. The full gt documentation has other more succinct examples and full function arguments. For advanced use cases, make sure to check out the Advanced Cookbook. Link: https://themockup.blog/static/gt-cookbook.html 19.8 Highcharter Cookbook by Tom Bishop Highcharter is an R implementation of the highcharts javascript library, enabled by R's htmlwidgets package. Most of the highcharts functionality is implemented through highcharter; however, the documentation is a little light. This guide will provide examples on how to create and customise various graphs whilst providing some tips on how to think about the package that will help you build and debug your more ambitious charts. Link: https://www.tmbish.me/lab/highcharter-cookbook/ 19.9 knitr by Yihui Xie Dynamic documents with R and knitr! The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package. Link: https://yihui.org/knitr/ 19.10 mlr3 book by Michel Lang The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R. Link: https://mlr3book.mlr-org.com/ 19.11 The caret Package by Max Kuhn The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. 
Link: https://topepo.github.io/caret/index.html 19.12 The Data Validation Cookbook by Mark P.J. van der Loo The purposes of this book include demonstrating the main tools and workflows of the validate package, giving examples of common data validation tasks, and showing how to analyze data validation results. Link: https://data-cleaning.github.io/validate/ 19.13 The lidR package by Jean-Romain Roussel, Tristan R.H. Goodbody, Piotr Tompalski lidR is an R package for manipulating and visualizing airborne laser scanning (ALS) data with an emphasis on forestry applications. The package is entirely open source and is integrated within the geospatial R ecosystem (i.e. raster, sp, sf, rgdal etc.). This guide has been written to help both the ALS novice, as well as seasoned point cloud processing veterans. Link: https://jean-romain.github.io/lidRbook/ 19.14 The targets R Package User Manual by Will Landau The targets package is a Make-like pipeline toolkit for statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data. Link: https://books.ropensci.org/targets/ 19.15 The Tidyverse Cookbook edited by Garrett Grolemund This book collects code recipes for doing data science with R's tidyverse. Each recipe solves a single common task, with a minimum of discussion. 
Link: https://rstudio-education.github.io/tidyverse-cookbook/ "],["r-package-development.html", "20 R package development 20.1 HTTP testing in R 20.2 R packages 20.3 rOpenSci Packages: Development, Maintenance, and Peer Review", " 20 R package development 20.1 HTTP testing in R by Scott Chamberlain, Maëlle Salmon This book is meant to be a free, central reference for developers of R packages accessing web resources, to help them have a faster and more robust development process. Our aim is to develop useful guidance to go with the great recent tools vcr, webmockr, httptest and presser. Link: https://books.ropensci.org/http-testing/ 20.2 R packages by Hadley Wickham, Jenny Bryan Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this section you'll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first. So start with the basics and improve it over time. It doesn't matter if your first version isn't perfect as long as the next version is better. Link: https://r-pkgs.org/ 20.3 rOpenSci Packages: Development, Maintenance, and Peer Review by rOpenSci software review editorial team This book is a package development guide for authors, maintainers, reviewers and editors of rOpenSci.
Link: https://devguide.ropensci.org/index.html "],["r-programming.html", "21 R programming 21.1 A sufficient Introduction to R 21.2 Advanced Object-Oriented Programming in R 21.3 Advanced R 21.4 Advanced R Solutions 21.5 An Introduction to Data Analysis 21.6 An Introduction to R 21.7 Another Book on Data Science : Learn R and Python in Parallel 21.8 Best Coding Practices for R 21.9 Book of R: A First Course in Programming and Statistics 21.10 Cookbook for R 21.11 Data Analytics with R: A Recipe book 21.12 Domain-Specific Languages in R 21.13 Efficient R programming 21.14 Field Guide to the R Ecosystem 21.15 Functional Data Structures in R 21.16 Functional Programming 21.17 Functional Programming in R 21.18 Hands-On Programming with R 21.19 Introduction to Programming with R 21.20 Introduction to R - R spatial 21.21 Mastering Software Development in R 21.22 Metaprogramming in R 21.23 Modern R with the tidyverse 21.24 R Cookbook - 2nd edition 21.25 R Development Guide 21.26 R for Excel users 21.27 R for Graduate Students 21.28 R language for programmers 21.29 Rcpp for everyone 21.30 stats545 Data wrangling, exploration, and analysis with R 21.31 The R Inferno 21.32 The R Language 21.33 The Tidyverse Cookbook 21.34 The tidyverse style guide 21.35 Tidy evaluation 21.36 Tidyverse design guide 21.37 Tidyverse Skills for Data Science 21.38 What They Forgot to Teach You About R 21.39 YaRrr! The Pirate's Guide to R", " 21 R programming 21.1 A sufficient Introduction to R by Derek L. Sonderegger This book is intended to guide people that are completely new to programming along a path towards a useful skill level using R. I believe that while people can get by with just copying code chunks, that doesn't give them the background information to modify the code in non-trivial ways. Therefore we will spend more time on foundational details than a crash-course would.
Link: https://dereksonderegger.github.io/570L/ 21.2 Advanced Object-Oriented Programming in R by Thomas Mailund Learn how to write object-oriented programs in R and how to construct classes and class hierarchies in the three object-oriented systems available in R. This book gives an introduction to object-oriented programming in the R programming language and shows you how to use and apply R in an object-oriented manner. You will then be able to use this powerful programming style in your own statistical programming projects to write flexible and extendable software. Paid: $20 Link: https://amzn.to/2wZnBbp 21.3 Advanced R by Hadley Wickham This is the companion website for Advanced R, a book in Chapman & Hall's R Series. The book is designed primarily for R users who want to improve their programming skills and understanding of the language. It should also be useful for programmers coming to R from other languages, as it explains some of R's quirks and shows how some parts that seem horrible do have a positive side. The book is free online. (Ignore the message redirecting you to the 2nd edition, this is the latest edition) Link: http://adv-r.had.co.nz/ 21.4 Advanced R Solutions by Malte Grosser, Henning Bumann, Hadley Wickham This book offers solutions to the exercises from Hadley Wickham's book Advanced R (Edition 2). It is a work in progress and under active development. The 2nd edition of Advanced R has been published and we are currently working towards completion. Link: https://advanced-r-solutions.rbind.io/ 21.5 An Introduction to Data Analysis by Michael Franke This book provides basic reading material for an introduction to data analysis. It uses R to handle, plot and analyze data. After covering the use of R for data wrangling and plotting, the book introduces key concepts of data analysis from a Bayesian and a frequentist tradition.
This text is intended for use as a first introduction to statistics for an audience with some affinity towards programming, but no prior exposure to R. Link: https://michael-franke.github.io/intro-data-analysis/index.html 21.6 An Introduction to R by Alex Douglas, Deon Roos, Ana Couto, Francesca Mancini, David Lusseau The aim of this book is to introduce you to using R, a powerful and flexible interactive environment for statistical computing and research. R in itself is not difficult to learn, but as with learning any new language (spoken or computer) the initial learning curve can be a little steep and somewhat daunting. We have tried to simplify the content of this book as much as possible and have based it on our own personal experience of teaching (and learning) R over the last 15 years. It is not intended to cover everything there is to know about R - that would be an impossible task. Neither is it intended to be an introductory statistics course, although you will be using some simple statistics to highlight some of R's capabilities. The main aim of this book is to help you climb the initial learning curve and provide you with the basic skills and experience (and confidence!) to enable you to further your experience in using R. Link: https://intro2r.com/ 21.7 Another Book on Data Science : Learn R and Python in Parallel by Nailong Zhang There has been considerable debate over choosing R vs. Python for Data Science. Based on my limited knowledge/experience, both R and Python are great languages and are worth learning; so why not learn them together? Besides the side-by-side comparison of the two popular languages used in Data Science, this book also focuses on the translation from mathematical models to code.
In the book, readers can find applications/implementations of some important algorithms from scratch, such as maximum likelihood estimation, inversion sampling, copula simulation, simulated annealing, bootstrapping, linear regression (lasso/ridge regression), logistic regression, gradient boosting trees, etc. Link: https://www.anotherbookondatascience.com/ 21.8 Best Coding Practices for R by Vikram Singh Rawat R is a huge language and I would like to share the little knowledge I have in the subject. I don't claim to be an expert but this book will guide you along the right path wherever possible. Most books about the R programming language will tell you the possible ways to do one thing in R. This book will only tell you one way to do that thing correctly. Link: https://bookdown.org/content/d1e53ac9-28ce-472f-bc2c-f499f18264a3/ 21.9 Book of R: A First Course in Programming and Statistics by Tilman M. Davies The Book of R is a comprehensive, beginner-friendly guide to R, the world's most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you'll find everything you need to begin using R effectively for statistical analysis. You'll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. You'll even learn how to create impressive data visualizations with R's basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package. Paid: $40 Link: https://nostarch.com/bookofr 21.10 Cookbook for R by Winston Chang The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data.
Not to be confused with R Cookbook. Link: http://www.cookbook-r.com/ 21.11 Data Analytics with R: A Recipe book by Ryan Garnett The structure and design of this book is based on iterative learning, starting with the most basic concept and building by adding one new element at a time. The book has been structured in small, easily consumable chunks, similar to a recipe card. The concept for a recipe card is that they are self contained, providing all the ingredients, preparation, and instructions required to create a meal. While a cookbook may consist of many recipes, there is no expectation to read, understand, and master all the recipes in order to prepare a meal. With this as its central theme, the book has been designed as a number of data analytics recipes focusing on the R language. Link: https://ryangarnett.github.io/r-recipe-book 21.12 Domain-Specific Languages in R by Thomas Mailund Gain an accelerated introduction to domain-specific languages in R, including coverage of regular expressions. This compact, in-depth book shows you how DSLs are programming languages specialized for a particular purpose, as opposed to general purpose programming languages. Along the way, you'll learn to specify tasks you want to do in a precise way and achieve programming goals within a domain-specific context. Domain-Specific Languages in R includes examples of DSLs including large data sets or matrix multiplication; pattern matching DSLs for application in computer vision; and DSLs for continuous time Markov chains and their applications in data science. After reading and using this book, you'll understand how to write DSLs in R and have skills you can extrapolate to other programming languages. Paid: $25 Link: https://amzn.to/2CDqhAU 21.13 Efficient R programming by Colin Gillespie, Robin Lovelace This book is for anyone who wants to make their R code faster to type, faster to run and more scalable.
These considerations generally come after learning the very basics of R for data analysis. Link: https://csgillespie.github.io/efficientR/ 21.14 Field Guide to the R Ecosystem by Mark Sellors This field guide aims to introduce the reader to the main components of the R ecosystem that may be encountered in the field. Whatever the reason, whilst there is a wealth of in-depth information for people actually using the language, I could find precious little information that provided the sort of overview of the ecosystem that I know I'd have appreciated when I first came to the language. And with that thought, a field guide is born. Link: https://fg2re.sellorm.com/ 21.15 Functional Data Structures in R by Thomas Mailund Get an introduction to functional data structures using R and write more effective code and gain performance for your programs. This book teaches you workarounds because data in functional languages is not mutable: for example you'll learn how to change variable-value bindings by modifying environments, which can be exploited to emulate pointers and implement traditional data structures. You'll also see how, by abandoning traditional data structures, you can manipulate structures by building new versions rather than modifying them. You'll discover how these so-called functional data structures are different from the traditional data structures you might know, but are worth understanding to do serious algorithmic programming in a functional language such as R. Paid: $20 Link: https://amzn.to/2oUG2cP 21.16 Functional Programming by Sara Altman, Bill Behrman, Hadley Wickham This book is a practical introduction to functional programming using the tidyverse. Link: https://dcl-prog.stanford.edu/ 21.17 Functional Programming in R by Thomas Mailund Master functions and discover how to write functional programs in R.
In this concise book, you'll make your functions pure by avoiding side-effects; you'll write functions that manipulate other functions, and you'll construct complex functions using simpler functions as building blocks. Paid: $20 Link: https://amzn.to/2wY4m11 21.18 Hands-On Programming with R by Garrett Grolemund This book will teach you how to program in R, with hands-on examples. I wrote it for non-programmers to provide a friendly introduction to the R language. You'll learn how to load data, assemble and disassemble data objects, navigate R's environment system, write your own functions, and use all of R's programming tools. Throughout the book, you'll use your newfound skills to solve practical data science problems. Link: https://rstudio-education.github.io/hopr/ 21.19 Introduction to Programming with R by Reto Stauffer, Joanna Chimiak-Opoka, Thorsten Simon, Achim Zeileis A learning resource for programming novices who want to learn programming using the statistical programming language R. While one of the major strengths of R is the broad variety of packages for statistics and data science, this resource focuses on learning and understanding basic programming concepts using base R. Only a couple of additional packages are used and/or briefly discussed for special tasks. This online book is specifically written for participants of the course Introduction to Programming: Programming in R offered by the Digital Science Center at Universität Innsbruck. Link: https://eeecon.uibk.ac.at/~discdown/rprogramming/index.html 21.20 Introduction to R - R spatial by R Spatial This document provides a concise introduction to R. It emphasizes what you need to know to be able to use the language in any context. There is no fancy statistical analysis here. We just present the basics of the R language itself. We do not assume that you have done any computer programming before (but we do assume that you think it is about time you did). Experienced R users obviously need not read this.
But the material may be useful if you want to refresh your memory, if you have not used R much, or if you feel confused. Link: https://rspatial.org/intr/index.html 21.21 Mastering Software Development in R by Roger D. Peng, Sean Kross, Brooke Anderson This book covers R software development for building data science tools. This book provides rigorous training in the R language and covers modern software development practices for building tools that are highly reusable, modular, and suitable for use in a team-based environment or a community of developers. Paid: Free or pay what you want $20 Link: https://leanpub.com/msdr 21.22 Metaprogramming in R by Thomas Mailund Learn how to manipulate functions and expressions to modify how the R language interprets itself. This book is an introduction to metaprogramming in the R language, so you will write programs to manipulate other programs. Metaprogramming in R shows you how to treat code as data that you can generate, analyze, or modify. Paid: $20 Link: https://amzn.to/2x1cYUR 21.23 Modern R with the tidyverse by Bruno Rodrigues This book can be useful to different audiences. If you have never used R in your life, and want to start, start with Chapter 1 of this book. Chapter 1 to 3 are the very basics, and should be easy to follow up to Chapter 9. Starting with Chapter 9, it gets more technical, and will be harder to follow. But I suggest you keep on going, and do not hesitate to contact me for help if you struggle! Chapter 9 is also where you can start if you are already familiar with R and the {tidyverse}, but not functional programming. If you are familiar with R but not the {tidyverse} (or have no clue what the {tidyverse} is), then you can start with Chapter 4. If you are familiar with R, the {tidyverse} and functional programming, you might still be interested in this book, especially Chapter 9 and 10, which deal with package development and further advanced topics respectively. 
Link: https://b-rodrigues.github.io/modern_R/ 21.24 R Cookbook - 2nd edition by JD Long, Paul Teetor I have written software professionally in perhaps a dozen programming languages, and the hardest language for me to learn has been R. The language is actually fairly simple, but it is unconventional. These notes are intended to make the language easier to learn for someone used to more commonly used languages such as C++, Java, Perl, etc. Not to be confused with Cookbook for R. Link: https://rc2e.com/index.html 21.25 R Development Guide by R Contribution Working Group This guide is heavily influenced by the Python Developer Guide, and is a comprehensive resource for contributing to R Core for both new and experienced contributors. It is maintained by the R Contribution Working Group. We welcome your contributions to R Core! Link: https://forwards.github.io/rdevguide/ 21.26 R for Excel users by Julie Lowndes, Allison Horst This course is for Excel users who want to add or integrate R and RStudio into their existing data analysis toolkit. It is a friendly intro to becoming a modern R user, full of tidyverse, RMarkdown, GitHub, collaboration & reproducibility. Link: https://rstudio-conf-2020.github.io/r-for-excel/ 21.27 R for Graduate Students by Y. Wendy Huynh Hello! My name is Wendy Huynh and I am a current PhD student working in the behavioral neurosciences. I began my R journey at the end of my first year of graduate school, slowly and painfully piecing together code. Although programming was never really part of my program, I now see it as an integral part of my work. Many fellow graduate students expressed interest in learning R, but didn't know where to begin. Programming with R is still relatively niche among my cohort and there are very few formal classes teaching this subject. Although there are many amazing guides/textbooks for R out there, very few of them featured examples relevant for my specific needs and were user-friendly enough for a true beginner.
In the Fall of my second year, I began teaching a new graduate student in my lab everything I knew about R. However, I quickly found that teaching R even just to one person was very time consuming. I decided to write up assignments as a short guide to R. After writing a short 11-page first assignment and receiving positive feedback, I began writing up a second assignment. Then a third. Soon enough, I had written enough pages that I couldn't deny that this short guide had turned into a book. Link: https://bookdown.org/yih_huynh/Guide-to-R-Book/ 21.28 R language for programmers by John D Cook I have written software professionally in perhaps a dozen programming languages, and the hardest language for me to learn has been R. The language is actually fairly simple, but it is unconventional. These notes are intended to make the language easier to learn for someone used to more commonly used languages such as C++, Java, Perl, etc. Link: https://www.johndcook.com/blog/r_language_for_programmers/ 21.29 Rcpp for everyone by Masaki E. Tsuda Rcpp is a package that enables you to implement R functions in C++. It is easy to use even without deep knowledge of C++, because it is implemented so that you can write your C++ code in a style similar to R. Rcpp does not sacrifice execution speed for ease of use; anyone can get high-performance results. This document focuses on providing necessary information to users who are not familiar with C++. Therefore, in some cases, I explain usage of Rcpp conceptually rather than describing it accurately from the viewpoint of C++, in the hope that readers can easily understand it. Link: https://teuder.github.io/rcpp4everyone_en/ 21.30 stats545 Data wrangling, exploration, and analysis with R by Jenny Bryan Learn how to: Explore, groom, visualize, and analyze data, make all of that reproducible, reusable, and shareable, using R. This site is about everything that comes up during data analysis except for statistical modelling and inference.
Link: https://stat545.com/ 21.31 The R Inferno by Patrick Burns If R's behaviour has ever surprised you, then this book is a guide to many more surprises, written in the style of Dante. It's a concise report on a number of common errors and unexpected behaviours in R. This book will make more sense if you have been programming and are familiar with such behaviours (not all, though), as little time is spent explaining the why behind each behaviour. As mentioned, it's a concise book, only 126 pages. Link: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf 21.32 The R Language by R Core team A collection of manuals: 1. An Introduction to R 2. The R Language Definition 3. Writing R Extensions 4. R Installation and Administration 5. R Data Import/Export 6. R Internals Link: https://stat.ethz.ch/R-manual/R-patched/doc/html/ 21.33 The Tidyverse Cookbook edited by Garrett Grolemund This book collects code recipes for doing data science with R's tidyverse. Each recipe solves a single common task, with a minimum of discussion. Link: https://rstudio-education.github.io/tidyverse-cookbook/ 21.34 The tidyverse style guide by Hadley Wickham Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread. This site describes the style used throughout the tidyverse. It was derived from Google's original R Style Guide - but Google's current guide is derived from the tidyverse style guide. Link: https://style.tidyverse.org/ 21.35 Tidy evaluation by Lionel Henry, Hadley Wickham This guide is now superseded by more recent efforts at documenting tidy evaluation in a user-friendly way. We now recommend reading: The new Programming with dplyr vignette. The Using ggplot2 in packages vignette. (Oscar's note: I'm keeping this in for my own reference) Link: https://tidyeval.tidyverse.org/ 21.36 Tidyverse design guide by Tidyverse team The goal of this book is to help you write better R code.
It has four main components: Design problems which lead to suboptimal outcomes. Useful patterns that help solve common problems. Key principles that help you balance conflicting patterns. Selected case studies that help you see how all the pieces fit together with real code. It is used by the tidyverse team to promote consistency across packages in the core tidyverse. Link: https://design.tidyverse.org/ 21.37 Tidyverse Skills for Data Science by Carrie Wright, Shannon E. Ellis, Stephanie C. Hicks, Roger D. Peng Book and Course formats. This course introduces a powerful set of data science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of tidy data and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed to tidy data, the data science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a data science project. Book format: https://jhudatascience.org/tidyversecourse/ Ebook: https://leanpub.com/tidyverseskillsdatascience Course format: https://www.coursera.org/specializations/tidyverse-data-science-r Link: https://jhudatascience.org/tidyversecourse/ 21.38 What They Forgot to Teach You About R by Jenny Bryan, Jim Hester The initial impetus for creating these materials is a two-day hands-on workshop. The target learner: Has a moderate amount of R and RStudio experience. Is largely self-taught. Suspects they have drifted into some idiosyncratic habits that may slow them down or make their work products more brittle. Is interested in (re)designing their R lifestyle, to be more effective and more self-sufficient. Link: https://rstats.wtf/ 21.39 YaRrr! The Pirate's Guide to R by Nathaniel D. Phillips Learn R from the ground up. Let me make something very, very clear: I did not write this book. This whole story started in the Summer of 2015.
I was taking a late night swim on the Bodensee in Konstanz and saw a rusty object sticking out of the water. Upon digging it out, I realized it was an ancient usb-stick with the word YaRrr inscribed on the side. Intrigued, I brought it home and plugged it into my laptop. Inside the stick, I found a single pdf file written entirely in pirate-speak. After watching several pirate movies, I learned enough pirate-speak to begin translating the text to English. Sure enough, the book turned out to be an introduction to R called The Pirate's Guide to R. Link: https://bookdown.org/ndphillips/YaRrr/ "],["reports-r-markdown-and-knitr.html", "22 Reports: R Markdown and knitr 22.1 Getting used to R, RStudio, and R Markdown 22.2 Introduction to R Markdown 22.3 knitr 22.4 Pimp my RMD: a few tips for R Markdown 22.5 R Markdown Cookbook 22.6 R Markdown: The Definitive Guide 22.7 Report Writing for Data Science in R 22.8 Reproducible Research with R and RStudio 22.9 RMarkdown for Scientists", " 22 Reports: R Markdown and knitr 22.1 Getting used to R, RStudio, and R Markdown by Chester Ismay This resource is designed to provide new users to R, RStudio, and R Markdown with the introductory steps needed to begin their own reproducible research. A review of many of the common R errors encountered (and what they mean in layman's terms) will also be provided. Link: https://bookdown.org/chesterismay/rbasics/ 22.2 Introduction to R Markdown by Michael Clark The goal is for you to be able to get quickly started with your own document, and understand the possibilities available to you. You will get a feel for the basic mechanics at play, as well as have ideas on how to customize the result to your own tastes. Link: https://m-clark.github.io/Introduction-to-Rmarkdown/ 22.3 knitr by Yihui Xie Dynamic documents with R and knitr!
The knitr package was designed to be a transparent engine for dynamic report generation with R, solve some long-standing problems in Sweave, and combine features in other add-on packages into one package. Link: https://yihui.org/knitr/ 22.4 Pimp my RMD: a few tips for R Markdown by Yan Holtz R markdown creates interactive reports from R code. This post provides a few tips I use on a daily basis to improve the appearance of output documents. Link: https://holtzy.github.io/Pimp-my-rmd/ 22.5 R Markdown Cookbook by Yihui Xie, Christophe Dervieux, Emily Riederer This book showcases short, practical examples of lesser-known tips and tricks to help users get the most out of these tools. After reading this book, you will understand how R Markdown documents are transformed from plain text and how you may customize nearly every step of this processing. For example, you will learn how to dynamically create content from R code, reference code in other documents or chunks, control the formatting with custom templates, fine-tune how your code is processed, and incorporate multiple languages into your analysis. Link: https://bookdown.org/yihui/rmarkdown-cookbook/ 22.6 R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, Garrett Grolemund The first official book authored by the core R Markdown developers that provides a comprehensive and accurate reference to the R Markdown ecosystem. With R Markdown, you can easily create reproducible data analysis reports, presentations, dashboards, interactive applications, books, dissertations, websites, and journal articles, while enjoying the simplicity of Markdown and the great power of R and other languages. Link: https://bookdown.org/yihui/rmarkdown/ 22.7 Report Writing for Data Science in R by Roger D. Peng This book teaches the fundamental concepts and tools behind reporting modern data analyses in a reproducible manner.
As data analyses become increasingly complex, the need for clear and reproducible report writing is greater than ever. Paid: Free or pay what you want $10 Link: https://leanpub.com/reportwriting 22.8 Reproducible Research with R and RStudio by Christopher Gandrud This book presents all the tools for gathering and analyzing data and presenting results in reproducible research with R and RStudio through practical examples. The book can be reproduced by using the R package bookdown. You can buy a copy at: https://www.routledge.com/Reproducible-Research-with-R-and-RStudio/Gandrud/p/book/9780367143985 Link: https://github.com/christophergandrud/Rep-Res-Book 22.9 RMarkdown for Scientists by Nicholas Tierney This is a book on rmarkdown, aimed at scientists. It was initially developed as a 3-hour workshop, but is now developed into a resource that will grow and change over time as a living book. Link: https://rmd4sci.njtierney.com/ "],["shiny.html", "23 Shiny 23.1 A gRadual intRoduction to Shiny 23.2 Engineering Production-Grade Shiny Apps 23.3 JavaScript 4 Shiny - Field Notes 23.4 JavaScript for R 23.5 Mastering Shiny 23.6 Mastering Shiny Solutions 23.7 Outstanding User Interfaces with Shiny 23.8 Shiny Production with AWS Book 23.9 Supplement to Shiny in Production", " 23 Shiny 23.1 A gRadual intRoduction to Shiny by Ted Laderas, Jessica Minnier By the end of this workshop, you should be able to: Browse examples in the shiny gallery and understand how they work. Understand the components of a Shiny app and how they communicate. Learn three basic design patterns for Shiny apps. Link: https://laderast.github.io/gradual_shiny/ 23.2 Engineering Production-Grade Shiny Apps by Colin Fay, Sébastien Rochette, Vincent Guyader, Cervan Girard This book will not get you started with Shiny, nor talk about how to work with Shiny once it is sent to production. What we'll see is the process of building an application that will later be sent to production.
Link: https://engineering-shiny.org/ 23.3 JavaScript 4 Shiny - Field Notes by Colin Fay JavaScript in practice for Shiny users. Link: https://connect.thinkr.fr/js4shinyfieldnotes/ 23.4 JavaScript for R by John Coene Learn how to build your own data visualisation packages, improve shiny with JavaScript, and use JavaScript for computations. Link: https://javascript-for-r.com 23.5 Mastering Shiny by Hadley Wickham This book complements Shiny's online documentation and is intended to help app authors develop a deeper understanding of Shiny. After reading this book, you'll be able to write apps that have more customized UI, more maintainable code, and better performance and scalability. Link: https://mastering-shiny.org/ 23.6 Mastering Shiny Solutions by Maya Gans, Marly Gotti This book offers solutions to the exercises from Hadley Wickham's book Mastering Shiny. It is a work in progress and under active development. Link: https://mastering-shiny-solutions.org 23.7 Outstanding User Interfaces with Shiny by David Granjon This book will help you to: Manipulate Shiny tags from R to create custom layouts. Harness the power of CSS and JavaScript to quickly design apps standing out from the pack. Discover the steps to import and convert existing web frameworks like Bootstrap 4, framework7 and more. Learn how Shiny internally deals with inputs. Learn more about less documented Shiny mechanisms (websockets, sessions, ...). Link: https://divadnojnarg.github.io/outstanding-shiny-ui/ 23.8 Shiny Production with AWS Book by Matt Dancho A big problem exists: no one teaches Data Scientists how to deploy web applications. You spend all of this time building Shiny web applications. And then [silence]. This book alongside the Shiny Developer with AWS Course (DS4B 202A-R) solves this problem - teaching Data Scientists how to deploy, host, and maintain web applications.
Link: https://business-science.github.io/shiny-production-with-aws-book/ 23.9 Supplement to Shiny in Production This document is full of supplemental resources and content from the Shiny in Production Workshop delivered at rstudio::conf 2019. Link: https://kellobri.github.io/shiny-prod-book/ "],["social-science.html", "24 Social Science 24.1 Analyzing US Census Data: Methods, Maps, and Models in R 24.2 Composite Indicator Development and Analysis in R with COINr 24.3 Computing for the Social Sciences 24.4 Crime by the Numbers: A Criminologists Guide to R 24.5 Crime by the Numbers: A Criminologists Guide to R 24.6 Introduction to R for Social Scientists:A Tidy Programming Approach 24.7 Public Policy Analytics: Code & Context for Data Science in Government 24.8 Social Data Science with R 24.9 The Plain Persons Guide to Plain Text Social Science 24.10 Using R for Data Analysis in Social Sciences: A Research Project-Oriented Approach", " 24 Social Science 24.1 Analyzing US Census Data: Methods, Maps, and Models in R by Kyle Walker Census data are widely used in the United States across numerous research and applied fields, including education, business, journalism, and many others. Until recently, the process of working with US Census data has required the use of a wide array of web interfaces and software platforms to prepare, map, and present data products. The goal of this book is to illustrate the utility of the R programming language for handling these tasks, allowing Census data users to manage their projects in a single computing environment. Link: https://walker-data.com/census-r/ 24.2 Composite Indicator Development and Analysis in R with COINr by William Becker Composite indicators are aggregations of indicators which aim to measure (usually socio-economic) complex and multidimensional concepts which are difficult to define, and cannot be measured directly. Examples include innovation, human development, environmental performance, and so on. 
This book gives a detailed guide on building composite indicators in R, focusing on the recent COINr package, which is an end-to-end development environment for composite indicators. Although COINr is the main tool used in the book, it also gives general explanation and guidance on composite indicator construction and analysis in R, ranging from normalisation, aggregation, multivariate analysis and global sensitivity analysis. Link: https://bluefoxr.github.io/COINrDoc/ 24.3 Computing for the Social Sciences by Dr. Benjamin Soltoff The goal of this course is to teach you basic computational skills and provide you with the means to learn what you need to know for your own research. I start from the perspective that you want to analyze data, and programming is a means to that end. You will not become an expert programmer - that is a given. But you will learn the basic skills and techniques necessary to conduct computational social science, and gain the confidence necessary to learn new techniques as you encounter them in your research. We will cover many different topics in this course, including: Elementary programming techniques (e.g. loops, conditional statements, functions) Writing reusable, interpretable code Problem-solving - debugging programs for errors Obtaining, importing, and munging data from a variety of sources Performing statistical analysis Visualizing information Creating interactive reports Generating reproducible research Link: https://cfss.uchicago.edu/notes/intro-to-course/ 24.4 Crime by the Numbers: A Criminologists Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. 
While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com 24.5 Crime by the Numbers: A Criminologists Guide to R by Jacob Kaplan This book introduces the programming language R and is meant for undergrads or graduate students studying criminology. R is a programming language that is well-suited to the type of work frequently done in criminology - taking messy data and turning it into useful information. While R is a useful tool for many fields of study, this book focuses on the skills criminologists should know and uses crime data for the example data sets. Link: https://crimebythenumbers.com/ 24.6 Introduction to R for Social Scientists:A Tidy Programming Approach by Ryan Kennedy, Philip Waggoner Introduction to R for Social Scientists: A Tidy Programming Approach introduces the Tidy approach to programming in R for social science research to help quantitative researchers develop a modern technical toolbox. The Tidy approach is built around consistent syntax, common grammar, and stacked code, which contribute to clear, efficient programming. The authors include hundreds of lines of code to demonstrate a suite of techniques for developing and debugging an efficient social science research workflow. Link: https://i2rss.weebly.com/# 24.7 Public Policy Analytics: Code & Context for Data Science in Government by Ken Steif, Ph.D The goal of this book is to make data science accessible to social scientists and City Planners, in particular. I hope to convince readers that one with strong domain expertise plus intermediate data skills can have a greater impact in government than the sharpest computer scientist who has never studied economics, sociology, public health, political science, criminology etc. 
Link: https://urbanspatial.github.io/PublicPolicyAnalytics/ 24.8 Social Data Science with R by Daniel Anderson, Brendan Cullen, Ouafaa Hmaddi Heres an intro about why R is great and the cool things you can do with it and new problems you can address. Link: https://www.sds.pub/index.html 24.9 The Plain Persons Guide to Plain Text Social Science by Kieran Healy As a beginning graduate student in the social sciences, what sort of software should you use to do your work? More importantly, what principles should guide your choices? I offer some general considerations and specific answers. Link: https://plain-text.co/index.html#introduction 24.10 Using R for Data Analysis in Social Sciences: A Research Project-Oriented Approach by Quan Li This book seeks to teach undergraduate and graduate students in social sciences how to use R to manage, visualize, and analyze data in order to answer substantive questions and replicate published findings. This book distinguishes itself from other introductory R or statistics books in three ways. First, targeting an audience rarely exposed to statistical programming, it adopts a minimalist approach and covers only the most important functions and skills in R that one will need for conducting reproducible research projects. Second, it emphasizes meeting the practical needs of students using R in research projects. Specifically, it teaches students how to import, inspect, and manage data; understand the logic of statistical inference; visualize data and findings via histograms, boxplots, scatterplots, and diagnostic plots; and analyze data using one-sample t-test, difference-of-means test, covariance, correlation, ordinary least squares (OLS) regression, and model assumption diagnostics. Third, it teaches students how to replicate the findings in published journal articles and diagnose model assumption violations. 
Paid: Incl listing of library availability $40 Link: https://www.worldcat.org/title/using-r-for-data-analysis-in-social-sciences-a-research-project-oriented-approach/oclc/1048009316 "],["sport-analytics.html", "25 Sport analytics 25.1 Basketball Data Science with Applications in R 25.2 Coding for sports analytics: get started resources 25.3 Exploring Baseball Data with R 25.4 Visualising WRC Rally Stages With rayshader and R: A RallyDataJunkie Adventure 25.5 Visualising WRC Rally Timing and Results Data: A RallyDataJunkie Adventure 25.6 Wrangling F1 Data With R: A Data Junkies Guide", " 25 Sport analytics 25.1 Basketball Data Science with Applications in R by Paola Zuccolotto, Marica Manisera Using data from one season of NBA games, Basketball Data Science: With Applications in R is the perfect book for anyone interested in learning and applying data analytics in basketball. Whether assessing the spatial performance of an NBA players shots or doing an analysis of the impact of high pressure game situations on the probability of scoring, this book discusses a variety of case studies and hands-on examples using a custom R package. The codes are supplied so readers can reproduce the analyses themselves or create their own. Assuming a basic statistical knowledge, Basketball Data Science with R is suitable for students, technicians, coaches, data analysts and applied researchers. Paid: $35 Link: https://www.routledge.com/Basketball-Data-Science-With-Applications-in-R/Zuccolotto-Manisera/p/book/9781138600799 25.2 Coding for sports analytics: get started resources Given the lack of sport-focussed R books, Ive added this collection of blog posts. Link: https://brendankent.com/2020/09/15/coding-for-sports-analytics-resources-to-get-started/ 25.3 Exploring Baseball Data with R by Max Marchi, Jim Albert, Max Marchi, Benjamin S. Baumer This book introduces R to sabermetricians, baseball enthusiasts, and students interested in exploring the richness of baseball data. 
It equips you with the necessary skills and software tools to perform all the analysis steps, from importing the data to transforming them into an appropriate format to visualizing the data via graphs to performing a statistical analysis. Paid: $50 Link: https://baseballwithr.wordpress.com/about/ 25.4 Visualising WRC Rally Stages With rayshader and R: A RallyDataJunkie Adventure by Tony Hirst Taking a simple rally route dataset, what can we do with it? This book describes a wide range of techniques for working with geodata, including routes and elevation rasters. From 2D and 3D mapping, to a wide range of route analysis techniques, the techniques described are also relevant to a wide range of other route analysis contexts, including ecological trail analysis. Link: https://rallydatajunkie.com/visualising-rally-stages 25.5 Visualising WRC Rally Timing and Results Data: A RallyDataJunkie Adventure by Tony Hirst A handy guide to visualising a wide range of motorsport timing and results data, concentrating on rally data associated with the FIA World Rally Championship (WRC). Link: https://rallydatajunkie.com/visualising-wrc-rally-results/ 25.6 Wrangling F1 Data With R: A Data Junkies Guide by Tony Hirst Taking a simple rally route dataset, what can we do with it? This book describes a wide range of techniques for working with geodata, including routes and elevation rasters. From 2D and 3D mapping, to a wide range of route analysis techniques, the techniques described are also relevant to a wide range of other route analysis contexts, including ecological trail analysis. Link: https://rallydatajunkie.com/visualising-rally-stages/ "],["statistics.html", "26 Statistics 26.1 A Business Analysts Introduction to Business Analytics 26.2 An Introduction to Statistical and Data Sciences via R 26.3 An Introduction to Statistical Learning 26.4 Answering questions with data 26.5 Bayes rules! 
26.6 Common statistical tests are linear models: a work through 26.7 Doing meta-analysis with R: A hands-on guide 26.8 End-to-End Solved Problems With R: a catalog of 26 examples using statistical inference 26.9 Foundations of Statistics with R 26.10 Foundations of Statistics with R 26.11 Handbook of Regression Modeling in People Analytics 26.12 Introduction to Modern Statistics 26.13 ISLR tidymodels Labs 26.14 Learning statistics with R: A tutorial for psychology students and other beginners 26.15 Mixed Models with R : Getting started with random effects 26.16 Model Estimation by Example: Demonstrations with R 26.17 Modern Statistics with R 26.18 One Way ANOVA with R: Completely Randomized Design - Between Groups 26.19 OpenIntro Statistics 26.20 Statistical inference for data science 26.21 Statistical Rethinking 26.22 Statistical Rethinking with brms, ggplot2, and the tidyverse: Second edition 26.23 Statistical Thinking in the 21st Century 26.24 Statistics (The Easier Way) With R, 3rd. Ed. (TIDYVERSION) 26.25 Statistics and Data with R: An Applied Approach Through Examples 26.26 Teacups, Giraffes and Statistics 26.27 The Effect: An Introduction to Research Design and Causality 26.28 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling", " 26 Statistics 26.1 A Business Analysts Introduction to Business Analytics by Adam Fleischhacker This textbook goes farther than just teaching you to make computational models using software or mathematical models using statistics. It teaches you how to align computational and mathematical models with real-world scenarios; empowering you to communicate with and leverage the expertise of business stakeholders while using modern software stacks and statistical workflows. In this book, you do not learn business analytics to make models; you learn business analytics to add tangible value in the real-world. 
Link: https://www.causact.com/ 26.2 An Introduction to Statistical and Data Sciences via R by Chester Ismay, Albert Kim An incredibly beginner friendly introduction to both datascience and statistics concepts as well as R. Link: https://moderndive.com/ 26.3 An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani As the scale and scope of data collection continue to increase across virtually all fields, statistical learning has become a critical toolkit for anyone who wishes to understand data. An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. Each chapter includes an R lab. This book is appropriate for anyone who wishes to use contemporary tools for data analysis. Link: https://www.statlearning.com/ 26.4 Answering questions with data by Matthew J. Crump This is a free textbook teaching introductory statistics for undergraduates in Psychology. This textbook is part of a larger OER course package for teaching undergraduate statistics in Psychology, including this textbook, a lab manual, and a course website. (Oscars note:Looks like a comprehensive stats resource!) Link: https://crumplab.github.io/statistics/ 26.5 Bayes rules! by Alicia A. Johnson, Miles Ott, Mine Dogucu The primary goal of Bayes Rules! is to make modern Bayesian thinking, modeling, and computing accessible to a broad audience. Bayes Rules! empowers readers to weave Bayesian approaches into an everyday modern practice of statistics and data science. The overall spirit is very applied: the book utilizes modern computing resources and a reproducible pipeline; the discussion emphasizes conceptual understanding; the material is motivated by data-driven inquiry; and the delivery blends traditional content with activity. 
Link: https://www.bayesrulesbook.com/ 26.6 Common statistical tests are linear models: a work through by Steve Doogue This is a reworking of the book Common statistical tests are linear models (or: how to teach stats), written by Jonas Lindeløv. The book beautifully demonstrates how many common statistical tests (such as the t-test, ANOVA and chi-squared) are special cases of the linear model. The book also demonstrates that many non-parametric tests, which are needed when certain test assumptions do not hold, can be approximated by linear models using the rank of values. Link: https://steverxd.github.io/Stat_tests/ 26.7 Doing meta-analysis with R: A hands-on guide by Mathias Harrer, Pim Cuijpers, Toshi A. Furukawa, David D. Ebert This book serves as an accessible introduction into how meta-analyses can be conducted in R. Essential steps for meta-analysis are covered, including pooling of outcome measures, forest plots, heterogeneity diagnostics, subgroup analyses, meta-regression, methods to control for publication bias, risk of bias assessments and plotting tools. Advanced, but highly relevant topics such as network meta-analysis, multi-/three-level meta-analyses, Bayesian meta-analysis approaches, SEM meta-analysis are also covered. Link: https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/ 26.8 End-to-End Solved Problems With R: a catalog of 26 examples using statistical inference by Nicole Radziwill Lots of worked problems, analytically and in R! Useful supplement for an introductory applied stats class. https://amzn.to/2EREAn2 - used for $4-18, new $19-20 https://www.e-junkie.com/ecom/gb.php?c=single&cl=147256&i=1548704 - $10 for PDF only Paid: $15 Link: https://amzn.to/2EREAn2 26.9 Foundations of Statistics with R by Darrin Speegle, Bryan Clair This book represents a fundamental rethinking of a calculus based first course in probability and statistics. 
We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well. This book is an excellent choice for students studying data science, statistics, engineering, computer science, mathematics, science, business, or any field which requires the two semesters of calculus needed to read this book. Link: https://mathstat.slu.edu/~speegle/_book/preface.html 26.10 Foundations of Statistics with R by Darrin Speegle This book represents a fundamental rethinking of a calculus based first course in probability and statistics. We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well. Link: https://mathstat.slu.edu/~speegle/_book/preface.html 26.11 Handbook of Regression Modeling in People Analytics by Keith McNulty It is the authors firm belief that all people analytics professionals should have a strong understanding of regression models and how to implement and interpret them in practice, and the aim with this book is to provide those who need it with help in getting there. 
For accompanying solutions to some of the questions: https://keithmcnulty.github.io/peopleanalytics-regression-book/solutions/ Link: http://peopleanalytics-regression-book.org/index.html 26.12 Introduction to Modern Statistics by Mine Çetinkaya-Rundel, Johanna Hardin We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods. Statistics is an applied field with a wide range of practical applications. You dont have to be a math guru to learn from interesting, real data. Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the world. Link: https://openintro-ims.netlify.app/ 26.13 ISLR tidymodels Labs by Emil Hvitfeldt This book aims to be a complement to the 1st version An Introduction to Statistical Learning book with translations of the labs into using the tidymodels set of packages. The labs will be mirrored quite closely to stay true to the original material. Link: https://emilhvitfeldt.github.io/ISLR-tidymodels-labs/index.html 26.14 Learning statistics with R: A tutorial for psychology students and other beginners by Danielle Navarro Learning Statistics with R covers the contents of an introductory statistics class, as typically taught to undergraduate psychology students, focusing on the use of the R statistical software. The book discusses how to get started in R as well as giving an introduction to data manipulation and writing scripts. From a statistical perspective, the book discusses descriptive statistics and graphing first, followed by chapters on probability theory, sampling and estimation, and null hypothesis testing. After introducing the theory, the book covers the analysis of contingency tables, t-tests, ANOVAs and regression. Bayesian statistics are covered at the end of the book. 
Link: https://learningstatisticswithr-bookdown.netlify.app/ 26.15 Mixed Models with R : Getting started with random effects by Michael Clark Mixed models are an extremely useful modeling tool for situations in which there is some dependency among observations in the data, where the correlation typically arises from the observations being clustered in some way. Link: https://m-clark.github.io/mixed-models-with-R/ 26.16 Model Estimation by Example: Demonstrations with R by Michael Clark This document provides by-hand demonstrations of various models and algorithms. The goal is to take away some of the mystery of them by providing clean code examples that are easy to run and compare with other tools. The code was collected over several years, so is not exactly consistent in style, but now has been cleaned up to make it more so. Within each demo, you will generally find some imported/simulated data, a primary estimating function, a comparison of results with some R package, and a link to the old code that was the initial demonstration. Link: https://m-clark.github.io/models-by-example/ 26.17 Modern Statistics with R by Måns Thulin This book covers the fundamentals of data science and statistics. The first half deals with the basics of R and R coding, data wrangling, exploratory data analysis and more advanced programming. The second half deals with modern statistics (favouring permutation tests, the bootstrap and Bayesian methods over traditional asymptotic methods), regression models and predictive modelling. It also contains information about debugging and explanations of 25 commonly encountered error messages in R. In addition, there are 170 or so exercises with fully worked solutions. Link: http://www.modernstatisticswithr.com/ 26.18 One Way ANOVA with R: Completely Randomized Design - Between Groups by Bruce Dudek This document can be a standalone how-to document for R users. 
However, it is primarily intended for students in the APSY510/511 statistics sequence at the University at Albany. It is a fairly thorough treatment of graphical and inferential evaluation of one-factor designs. It presumes prior background coverage of the ANOVA logic from standard textbooks such as Howell or Maxwell, Delaney and Kelley (2017). The analyses are intended to parallel and exhaust the methods already covered with SPSS, and to extend them to additional topics. Link: https://bcdudek.net/anova/oneway_anova_basics.pdf 26.19 OpenIntro Statistics by David Diez, Mine Cetinkaya-Rundel, Christopher Barr, OpenIntro A complete foundation for Statistics, also serving as a foundation for Data Science. Leanpub revenue supports OpenIntro (US-based nonprofit) so we can provide free desk copies to teachers interested in using OpenIntro Statistics in the classroom and expand the project to support free textbooks in other subjects. More resources: openintro.org. Paid: Pay what you want for the ebook, minimum $0.00, however if you are able to, please consider the cause above. Thanks! $15 Link: https://leanpub.com/openintro-statistics 26.20 Statistical inference for data science by Brian Caffo This book gives a brief, but rigorous, treatment of statistical inference intended for practicing Data Scientists. Paid: Free or pay what you want $15 Link: https://leanpub.com/LittleInferenceBook 26.21 Statistical Rethinking A Bayesian Course with Examples in R and Stan Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds your knowledge of and confidence in making inferences from data. Reflecting the need for scripting in todays model-based statistics, the book pushes you to perform step-by-step calculations that are usually automated. This unique computational approach ensures that you understand enough of the details to make reasonable choices and interpretations in your own modeling work. 
Link: https://xcelab.net/rm/statistical-rethinking/ 26.22 Statistical Rethinking with brms, ggplot2, and the tidyverse: Second edition by A Solomon Kurz This ebook is based on the second edition of Richard McElreaths (2020) text, Statistical rethinking: A Bayesian course with examples in R and Stan. My contributions show how to fit the models he covered with Paul Bürkners brms package, which makes it easy to fit Bayesian regression models in R using Hamiltonian Monte Carlo. I also prefer plotting and data wrangling with the packages from the tidyverse. So well be using those methods, too. Link: https://bookdown.org/content/4857/ 26.23 Statistical Thinking in the 21st Century by Russell Poldrack This textbook aims to cover modern methods that take advantage of todays increased computing power, while also balancing the accessibility of the material for students not wanting to wade through a lot of story to get to the statistical knowledge while reading Andy Fields graphic novel statistics books, An Adventure in Statistics. The main site below has companion sites in R and Python: R companion https://statsthinking21.github.io/statsthinking21-R-site/ Python companion https://statsthinking21.github.io/statsthinking21-python/ Link: https://statsthinking21.github.io/statsthinking21-core-site/ 26.24 Statistics (The Easier Way) With R, 3rd. Ed. (TIDYVERSION) by Nicole Radziwill This introductory applied statistics handbook shows you how to run tests analytically, and then how to run exactly the same steps using R. No steps are skipped, making this particularly well suited for beginners or people who need a quick lookup. Used at 30+ universities around the globe. https://amzn.to/3b9ha8s - varies between $37-43 https://www.e-junkie.com/ecom/gb.php?&c=single&cl=147256&i=1614407 - $25 for PDF only Paid: $37 Link: https://amzn.to/3b9ha8s 26.25 Statistics and Data with R: An Applied Approach Through Examples by Yosef Cohen, Jeremiah Y. 
Cohen R, an Open Source software, has become the de facto statistical computing environment. It has an excellent collection of data manipulation and graphics capabilities. It is extensible and comes with a large number of packages that allow statistical analysis at all levels from simple to advanced and in numerous fields including Medicine, Genetics, Biology, Environmental Sciences, Geology, Social Sciences and much more. The software is maintained and developed by academicians and professionals and as such, is continuously evolving and up to date. Statistics and Data with R presents an accessible guide to data manipulations, statistical analysis and graphics using R. Paid: The E-Book costs $97.00 while the print version costs $121.75 $97 Link: https://www.wiley.com/en-us/Statistics+and+Data+with+R%3A+An+Applied+Approach+Through+Examples-p-9780470758052 26.26 Teacups, Giraffes and Statistics by Hasse Walum, Desirée De Leon A delightful series of beautifully illustrated modules to learn statistics and R coding for students, scientists, and stats-enthusiasts. Link: https://tinystats.github.io/teacups-giraffes-and-statistics/index.html 26.27 The Effect: An Introduction to Research Design and Causality by Nick Huntington-Klein The Effect is a book intended to introduce students (and non-students) to the concepts of research design and causality in the context of observational data. The book is written in an intuitive and approachable way and doesnt overload on technical detail. Why teach regression and research design at the same time when they are fundamentally different things? First learn why you want to structure a design in a certain way, and what it is you want to do to the data, and then afterwards learn the technical details of how to run the appropriate model. Link: https://theeffectbook.net/ 26.28 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling by Andrew B. 
Lawson Progressively more and more attention has been paid to how location affects health outcomes. The area of disease mapping focusses on these problems, and the Bayesian paradigm has a major role to play in the understanding of the complex interplay of context and individual predisposition in such studies of disease. Using R for Bayesian Spatial and Spatio-Temporal Health Modeling provides a major resource for those interested in applying Bayesian methodology in small area health data studies. Paid: $100 Link: https://www.routledge.com/Using-R-for-Bayesian-Spatial-and-Spatio-Temporal-Health-Modeling/Lawson/p/book/9780367490126 "],["teaching.html", "27 Teaching 27.1 Data Science in a Box 27.2 rstudio4edu 27.3 Teaching Tech Together 27.4 What they forgot to teach you about teaching R", " 27 Teaching 27.1 Data Science in a Box by Mine Çetinkaya-Rundel This book focuses on how to efficiently teach data science to students with little to no background in computing and statistical thinking. The core content of the course focuses on data acquisition and wrangling, exploratory data analysis, data visualization, inference, modelling, and effective communication of results. Link: https://datasciencebox.org/ 27.2 rstudio4edu by Desirée De Leon, Alison Hill A book for educators in the data science space who wish to create educational materials that are engaging for students and inspiring to other educators. This book is a cookbook for generating materials for R Markdown lessons R packages R Markdown websites Distill sites Bookdown books Blogdown sites Link: https://rstudio4edu.github.io/rstudio4edu-book/ 27.3 Teaching Tech Together by Greg Wilson (Oscars note: Not an R book per se, but comes highly recommended about how to teach programming.) Grassroots groups have sprung up around the world to teach programming, web design, robotics, and other skills to free-range learners. 
These groups exist so that people dont have to learn these things on their own, but ironically, their founders and teachers are often teaching themselves how to teach. Theres a better way. Just as knowing a few basic facts about germs and nutrition can help you stay healthy, knowing a few things about cognitive psychology, instructional design, inclusivity, and community organization can help you be a more effective teacher. This book presents key ideas you can use right now, explains why we believe they are true, and points you at other resources that will help you go further Link: http://teachtogether.tech/en/index.html 27.4 What they forgot to teach you about teaching R by Desiree de Leon This book is offered at rstudio::global(2021), as part of the Diversity Scholars program. In this workshop, you will learn about using the RStudio IDE to its full potential for teaching R. Whether youre an educator by profession, or you do education as part of collaborations or outreach, or you want to improve your workflow for giving talks, demos, and workshops, there is something for you in this workshop. During the workshop we will cover live coding best practices, tips for using RStudio Cloud for teaching and building learnr tutorials, and R Markdown based tools for developing instructor and student facing teaching materials. Link: https://wtf-teach.netlify.app/ "],["text-analysis.html", "28 Text analysis 28.1 Supervised Machine Learning for Text Analysis in R 28.2 Text Mining with R 28.3 Text Mining With Tidy Data Principles", " 28 Text analysis 28.1 Supervised Machine Learning for Text Analysis in R by Emil Hvitfeldt, Julia Silge Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us. 
We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice. Link: https://smltar.com/ 28.2 Text Mining with R by Julia Silge, David Robinson This book serves as an introduction of text mining using the tidytext package and other tidy tools in R. The functions provided by the tidytext package are relatively simple; what is important are the possible applications. Thus, this book provides compelling examples of real text mining problems. Link: https://www.tidytextmining.com/ 28.3 Text Mining With Tidy Data Principles by Julia Silge Text data sets are diverse and ubiquitous, and tidy data principles provide an approach to make text mining easier, more effective, and consistent with tools already in wide use. In this tutorial, you will develop your text mining skills using the tidytext package in R, along with other tidyverse tools. Link: https://juliasilge.shinyapps.io/learntidytext/ "],["time-series-analysis-and-forecasting.html", "29 Time Series Analysis and Forecasting 29.1 Applied Time Series Analysis for Fisheries and Environmental Sciences 29.2 Fisheries Catch Forecasting 29.3 Forecasting: Principles and Practice 29.4 Hands-On Time Series Analysis with R 29.5 Practical Time Series Forecasting with R: A Hands-On Guide 29.6 Time Series - A Data Analysis Approach Using R 29.7 Time Series Analysis and Its Applications", " 29 Time Series Analysis and Forecasting 29.1 Applied Time Series Analysis for Fisheries and Environmental Sciences by E. E. Holmes, M. D. Scheuerell, E. J. Ward This is material that was developed as part of a course we teach at the University of Washington on applied time series analysis for fisheries and environmental data. Link: https://atsa-es.github.io/atsa-labs/ 29.2 Fisheries Catch Forecasting by Elizabeth Holmes The focus of this book is on analysis of univariate time series. 
However, multivariate regression with autocorrelated errors and multivariate autoregressive models (MAR) will be covered more briefly. For an in-depth discussion of multivariate autoregressive models and multivariate autoregressive state-space models, see Holmes, Ward and Scheuerell (2018). Link: https://fish-forecast.github.io/Fish-Forecast-Bookdown/index.html 29.3 Forecasting: Principles and Practice by Rob J Hyndman, George Athanasopoulos This textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly. The book is written for three audiences: (1) people finding themselves doing forecasting in business when they may not have had any formal training in the area; (2) undergraduate students studying business; (3) MBA students doing a forecasting elective. Second edition supporting the forecast package: https://otexts.com/fpp2/ Third edition supporting the fable package: https://otexts.com/fpp3/ Link: https://otexts.com/fpp3/ 29.4 Hands-On Time Series Analysis with R by Rami Krispin The book provides an introduction for time series analysis with R. It covers the general workflow of time series analysis - working and handling time series data, descriptive analysis, predictive analysis, modeling strategies, etc. This book is designed for data scientists who wish to learn time series analysis and forecasting or data analysts who use Excel-based forecasting methods and wish to use more robust methods. Paid: $30 Link: https://www.packtpub.com/product/hands-on-time-series-analysis-with-r/9781788629157 29.5 Practical Time Series Forecasting with R: A Hands-On Guide by Galit Shmueli, Kenneth C. Lichtendahl, Jr Practical Time Series Forecasting with R provides an applied approach to time-series forecasting. Forecasting is an essential component of predictive analytics.
Balancing theory and practice, the books introduce popular forecasting methods and approaches used in a variety of business applications, and are ideal for Business Analytics, MBA, Executive MBA, and Data Analytics programs in business schools. Paid: $30 Link: http://www.forecastingbook.com/ 29.6 Time Series - A Data Analysis Approach Using R by Robert H. Shumway, David S. Stoffer The goals of this text are to develop the skills and an appreciation for the richness and versatility of modern time series analysis as a tool for analyzing dependent data. A useful feature of the presentation is the inclusion of nontrivial data sets illustrating the richness of potential applications to problems in the biological, physical, and social sciences as well as medicine. The text presents a balanced and comprehensive treatment of both time and frequency domain methods with an emphasis on data analysis. Paid: $40 Link: https://www.routledge.com/Time-Series-A-Data-Analysis-Approach-Using-R/Shumway-Stoffer/p/book/9780367221096 29.7 Time Series Analysis and Its Applications by Robert H. Shumway, David S. Stoffer The book is designed as a textbook for graduate level students in the physical, biological, and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. In addition to coverage of classical methods of time series regression, ARIMA models, spectral analysis and state-space models, the text includes modern developments including categorical time series analysis, multivariate spectral methods, long memory series, nonlinear models, resampling techniques, GARCH models, ARMAX models, stochastic volatility, wavelets, and Markov chain Monte Carlo integration methods. 
Link: https://www.stat.pitt.edu/stoffer/tsa4/index.html "],["version-control.html", "30 Version control 30.1 Git and Github for Advanced Ecological Data Analysis 30.2 Github actions with R 30.3 Github learning lab 30.4 Happy Git and GitHub for the useR 30.5 The Beginners Guide to Git and GitHub", " 30 Version control 30.1 Git and Github for Advanced Ecological Data Analysis by Alexa Fredston This material was prepared for a three-hour virtual session to teach Git and Github to a graduate-level course on Advanced Ecological Data Analysis taught at Rutgers University by Malin Pinsky and Rachael Winfree. (However, the only course-specific material is Section 4; the rest should be applicable to any reader.) Link: https://afredston.github.io/learn-git/learn-git.htm 30.2 Github actions with R by Chris Brown, Murray Cadzow, Paula A Martinez, Rhydwyn McGuire, David Neuzerling, David Wilkinson, Saras Windecker GitHub actions allow us to trigger automated steps after we launch GitHub interactions such as when we push, pull, submit a pull request, or write an issue. Link: https://ropenscilabs.github.io/actions_sandbox/ 30.3 Github learning lab Not R specific or even a book, but looks like a good resource to learn git. Link: https://lab.github.com/ 30.4 Happy Git and GitHub for the useR by Jenny Bryan, Jim Hester, the STAT 545 TAs Happy Git provides opinionated instructions on how to: Install Git and get it working smoothly with GitHub, in the shell and in the RStudio IDE. Develop a few key workflows that cover your most common tasks. Integrate Git and GitHub into your daily work with R and R Markdown. The target reader is someone who uses R for data analysis or who works on R packages, although some of the content may be useful to those working in adjacent areas. 
Link: https://happygitwithr.com/ 30.5 The Beginners Guide to Git and GitHub by Thomas Mailund A quick beginner's guide to using Git and GitHub. You have heard about git and GitHub and want to know what the buzz is about. That is what I am here to tell you. Or, at least, I am here to give you a quick overview of what you can do with git and GitHub. I won't be able, in the space here, to give you an exhaustive list of features - in all honesty, I don't know enough myself to be able to claim expertise with these tools. I am only a frequent user, but I can get you started and give you some pointers for where to learn more. That is what this booklet is for. Paid: $5 Link: https://amzn.to/2Nt0rDY "],["workflow.html", "31 Workflow 31.1 Agile Data Science with R 31.2 Github actions with R 31.3 How I Use R 31.4 The Data Validation Cookbook 31.5 The targets R Package User Manual", " 31 Workflow 31.1 Agile Data Science with R by Edwin Thoen I joined a Scrum team (frontend, backend, ux designer, product owner, second data scientist) to create a machine learning model that we brought to production using the Agile principles. It was an inspiring experience from which I learned a great deal. My colleagues patiently explained the principles of Agile software development and together we applied them to the data science context. All these experiences culminated in the workflow that we now adhere to at work and I think it is worthwhile to share it. It is heavily based on the principles of Agile software production, hence the title. We have explored which of the concepts from Agile did and did not work for data science and we got hands-on experience in working from these principles in an R project that actually got to production.
Link: https://edwinth.github.io/ADSwR/ 31.2 Github actions with R by Chris Brown, Murray Cadzow, Paula A Martinez, Rhydwyn McGuire, David Neuzerling, David Wilkinson, Saras Windecker GitHub actions allow us to trigger automated steps after we launch GitHub interactions such as when we push, pull, submit a pull request, or write an issue. Link: https://ropenscilabs.github.io/actions_sandbox/ 31.3 How I Use R by David Keyes There are many great learning resources at the beginner stage and some incredible tutorials to master complex tasks in R. But, drawing from a concept in urban planning, there are far fewer resources in the middle. Stretching the metaphor perhaps to its breaking point, new R users at the detached single-family home stage can't get to the advanced mid-rise level without going through the middle stage. The missing middle in the R neighborhood is the lack of resources that answer the types of nuts and bolts questions that new R users often have. Things like: How should I organize my file structure when creating a new project? Should I do data cleaning in an RMarkdown file or an R script file? How do I find packages? How do I know if the packages I find are high quality? This book is my attempt to provide answers to these types of questions. Link: https://howiuser.com/ 31.4 The Data Validation Cookbook by Mark P.J. van der Loo The purposes of this book include demonstrating the main tools and workflows of the validate package, giving examples of common data validation tasks, and showing how to analyze data validation results. Link: https://data-cleaning.github.io/validate/ 31.5 The targets R Package User Manual by Will Landau The targets package is a Make-like pipeline toolkit for Statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself.
targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data. Link: https://books.ropensci.org/targets/ "],["other-compendiums.html", "32 Other compendiums 32.1 Awesome network analysis 32.2 Bookdown archive 32.3 CRAN doc collections 32.4 Data Science with R: A Resource Compendium 32.5 R on the web 32.6 R project book compendium 32.7 Use R! Springer series", " 32 Other compendiums 32.1 Awesome network analysis Not a book, but a compendium of resources that look really valuable. Link: https://github.com/briatte/awesome-network-analysis 32.2 Bookdown archive An archive of all books published via bookdown.org. It's a very, very big repo. Link: https://bookdown.org/home/archive/ 32.3 CRAN doc collections Note these projects are frozen, but they do contain a lot of resources in multiple languages. Many of these are quite old publications, but it doesn't mean they're outdated or not useful. If you're really digging for a specific resource that you can't find anywhere else, it may be here. Good luck! https://cran.r-project.org/other-docs.html Link: https://www.r-project.org/doc/bib/R-books.html 32.4 Data Science with R: A Resource Compendium by Martin Monkman This book grew out of my ever-growing collection of reference materials that was saved as an expanding array of markdown files in a github repo. By assembling it as a book, I hope that it will be more accessible and useful to other R users. Link: https://bookdown.org/martin_monkman/DataScienceResources_book/ 32.5 R on the web by Guillaume Coquere Useful links for people interested in R. Link: https://github.com/shokru/rstats/blob/master/material/R_links.md 32.6 R project book compendium A searchable archive of 180+ books. Link: https://www.r-project.org/doc/bib/R-jabref.html 32.7 Use R!
Springer series This is a collection of some 70+ books. This series of inexpensive and focused books on R will publish shorter books aimed at practitioners. Books can discuss the use of R in a particular subject area (e.g., epidemiology, econometrics, psychometrics) or as it relates to statistical topics (e.g., missing data, longitudinal data). Paid: All are paid products Link: https://www.springer.com/series/6991?detailsPage=titles "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] diff --git a/_remodel_branch b/_remodel_branch deleted file mode 100644 index 56690980..00000000 --- a/_remodel_branch +++ /dev/null @@ -1 +0,0 @@ -Just a file named after the branch so I can tell in my local folder which branch is active