RStartHere

A guide to some of the most useful R packages that we know about, organized by their role in data science.

Click here to suggest packages.

Data Science Workflow

Each data science project is different, but each follows the same general steps. You:

[Figure: The data science workflow]

  1. Import your data into R

  2. Tidy it

  3. Understand your data by iteratively

    1. visualizing
    2. transforming and
    3. modeling your data
  4. Infer how your understanding applies to other data sets (including future data, i.e. predictions)

  5. Communicate your results to an audience, or

  6. Automate your analysis for easy reuse

  7. Program the whole way through, since you do each of these things on a computer
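
The steps above can be sketched end to end in a few lines. This is a minimal illustration that uses only base R, not the packages listed on this page; the plant-growth data, the column names, and the linear model are all made up for the example.

```r
# 1. Import: write a small CSV to a temporary file, then read it back.
#    (Hypothetical data, invented for this sketch.)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "plant,week1,week2,week3",
  "a,1.0,2.1,2.9",
  "b,1.5,2.4,3.6",
  "c,0.8,1.9,3.1"
), csv_path)
wide <- read.csv(csv_path)

# 2. Tidy: reshape from one column per week to one row per observation.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("week1", "week2", "week3"),
  v.names   = "height",
  timevar   = "week",
  times     = 1:3,
  idvar     = "plant"
)

# 3.1 Visualize: look at height against week.
plot(long$week, long$height, xlab = "Week", ylab = "Height")

# 3.2 Transform: derive a new variable from an existing one.
long$log_height <- log(long$height)

# 3.3 Model: fit a simple linear trend (an assumption made for illustration).
fit <- lm(height ~ week, data = long)

# 4. Infer: apply the model to data it has not seen (a future week).
prediction <- predict(fit, newdata = data.frame(week = 4))

# 5. Communicate: print a summary that a reader can inspect.
print(summary(fit))
print(prediction)
```

Wrapping steps 1 through 4 in a function that takes a file path would be one simple way to cover the "automate" step for new data files.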

Below we list the most useful R packages that we know of for each step.

Import

These packages help you import data into R and save data.

Tidy

These packages help you wrangle your data into a form that is easy to analyze in R.

Visualize

These packages help you visualize your data.

Transform

These packages help you transform your data into new types of data.

Model/Infer

These packages help you build models and make inferences. The same package often covers both tasks.

Communicate

These packages help you communicate the results of data science to your audiences.

Automate

These packages help you create data science products that automate your analyses.

Program

These packages make it easier to program with the R language.

Data

These packages contain data sets to use as training data or toy examples.
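
As a small illustration of using a packaged data set as a toy example, the sketch below loads `mtcars` from the `datasets` package that ships with R. It is not one of the packages listed here, and the split into training and held-out rows is an arbitrary choice made for the example.

```r
# mtcars comes from the datasets package, which is attached by default
# in a standard R session.
data("mtcars")

# Inspect the structure before using it.
str(mtcars)

# Use the first 24 rows as training data and hold out the rest
# (an arbitrary split, chosen only for illustration).
train_rows <- seq_len(24)
toy_fit <- lm(mpg ~ wt, data = mtcars[train_rows, ])

# Predict fuel economy for the held-out cars.
toy_pred <- predict(toy_fit, newdata = mtcars[-train_rows, ])
print(toy_pred)
```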

Criteria

What makes an R package useful? It should perform a useful task, and it should do it well. Here are some criteria that we used to make the list.

For other useful choices, please check out our list of popular packages that did not quite meet these criteria.

You can learn more about packages in R with the CRAN task views.