This book started out as the class notes used in the HarvardX Data Science Series.

The link for the online version of the book is

The R Markdown code used to generate the book is available on GitHub. Note that the individual files are not self-contained, since we run the code included in this file before each one while creating the book. In particular, most chapters require these packages to be loaded:


The graphical theme used for plots throughout the book can be recreated using the ds_theme_set() function from the dslabs package.
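As a minimal sketch of how the theme might be applied, the following assumes that the ggplot2 and dslabs packages are installed. The choice of the murders dataset (which ships with dslabs) and the variable name p are only for illustration and are not taken from the book's build files.

```r
# Assumes ggplot2 and dslabs are installed; this is an illustrative sketch,
# not the setup code the book itself runs.
library(ggplot2)
library(dslabs)

# Set the book's graphical theme as the default for subsequent ggplot2 plots.
ds_theme_set()

# Example plot using the murders dataset included in dslabs
# (illustrative choice of data and variables).
p <- ggplot(murders, aes(population, total)) +
  geom_point()
print(p)
```

Because ds_theme_set() changes the default ggplot2 theme, it only needs to be called once per session, before the plots are created.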

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

We make announcements related to the book on Twitter. For updates, follow @rafalab.