Ready-to-hand

Dean Eckles on people, technology & inference

Exploratory data analysis: Our free online course

Moira Burke, Solomon Messing, Chris Saden, and I have created a new online course on exploratory data analysis (EDA) as part of Udacity’s “Data Science” track. It is designed to teach students how to explore data sets. Students learn how to do EDA using R and the visualization package ggplot.

We emphasize the value of EDA for building and testing intuitions about a data set, identifying problems or surprises in data, summarizing variables and relationships, and supporting other data analysis tasks. The course materials are all free, and you can also sign up for tutoring, grading (especially useful for the final project), and certification.

Between providing general advice on data analysis and visualization, stepping students through exactly how to produce particular plots, and reasoning about how the data can answer questions of interest, the course includes interviews with four of our amazing colleagues on the Facebook Data Science team:

One unique feature of this course is that one of the data sets we use is a “pseudo-Facebook” data set that Moira and I created to share many features with real Facebook data, but to not describe any particular real Facebook users or reveal certain kinds of information about aggregate behavior. Other data sets used in the course include two different data sets giving sale prices for diamonds and panel “scanner” data describing yogurt purchases.

It was an fascinating and novel process putting together this course. We scripted almost everything in detail in advance — before any filming started — using first outlines, then drafts using Markdown in R with knitr, and then more detailed scripts with Udacity-specific notation for all the different shots and interspersed quizzes. I think this is part of what leads Kaiser Fung to write:

The course is designed from the ground up for online instruction, and it shows. If you have tried other online courses, you will immediately notice the difference in quality.

Check out the course and let me know what you think — we’re still incorporating feedback.

3 thoughts on “Exploratory data analysis: Our free online course

  1. The course looks great. Are you involved only in course design or do you work directly with participants?

  2. I’ve looked at some of the threads in the online forums and chimed in a couple times (as have my FB colleagues). I also look forward to checking out some of the final projects. But the tutoring and grading is provided primarily by Udacity folks.

Leave a Reply

Your email address will not be published. Required fields are marked *

Scroll to top