Workset 10: Data Visualization

The point of this week is to get experience plotting, but also to dip your toe into the process of reproducible research through coding. So don’t turn in your responses as images this time, but as the commands you run to make the plots.
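
As a rough sketch of what "turning in commands" might look like, here is a tiny self-contained script. It assumes ggplot2 is installed. The data frame toy_ships and its values are made up for illustration and are not the course's ships data.

```r
library(ggplot2)

# Hypothetical stand-in for the course's `ships` data frame.
# These six rows are invented for illustration only.
toy_ships <- data.frame(
  year = c(1850, 1850, 1851, 1852, 1852, 1852),
  Rig  = c("Ship", "Bark", "Ship", "Ship", "Bark", "Brig")
)

# Store the plot in a variable so the same commands can be rerun or shared.
p <- ggplot(toy_ships) + aes(x = year) + geom_histogram(binwidth = 1)

# Draw the plot (in a non-interactive session this goes to the default device).
print(p)
```

Anyone who runs these lines gets the same picture, which is the whole idea of submitting commands rather than images.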

  1. In the handout, we made a histogram of “year”. Make one of month. What month (they are numbered in the data set) do the most whaling voyages leave in? The least?

  2. Here’s a list of all the variables in the shipping dataset. Make a histogram of something else.

 [1] "LastName"      "FirstName"     "Rig"           "Age"           "Skin"          "Hair"          "Eye"
 [8] "Residence"     "Rank"          "Voyage.number" "VesselNumber"  "height"        "date"          "year"
[15] "month"         "decade"
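
A listing like the one above looks like what names() prints for a data frame. Here is a minimal sketch on a made-up three-column data frame (toy_ships is hypothetical, not the real data), in case you want to check the columns yourself before picking one to plot.

```r
# Hypothetical stand-in for `ships`; the rows are invented for illustration.
toy_ships <- data.frame(
  LastName = c("Doe", "Roe"),
  Rig      = c("Ship", "Bark"),
  year     = c(1850, 1851)
)

# Column names, in order
names(toy_ships)

# Compact overview of each column's type and first few values
str(toy_ships)
```

The str() output is handy for deciding whether a column is numeric or text, since that affects which kind of chart makes sense for it.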
  3. Recall this chart, which makes a barplot with colorized bars. After “fill=Rig”, add another parameter for “color”. What happens?
ggplot(ships) + aes(x=year,fill=Rig) + geom_bar()
  4. Make a facet_wrap or facet_grid small-multiple plot that works better than the ones described.

  5. Here’s just a little optional brainteaser. Run each of these charts in turn, and see if you can figure out what’s going on in the progression. What is different about each one in turn? What kind of barchart is the third one? (You can type ?coord_polar to get a description of the last case.)

ggplot(ships) + aes(x=factor(1),fill=Rig) + geom_bar(width=1)
ggplot(ships) + aes(x=factor(1),fill=Rig) + geom_bar(width=1) + coord_polar()
ggplot(ships) + aes(x=factor(1),fill=Rig) + geom_bar(width=1) + coord_polar(theta="y")
  6. Regardless of whether you care to look into what “theta” means there, make a plot of the third type on a different element of the data.
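
For the facet_wrap or facet_grid question, here is a syntax reminder on invented data, assuming ggplot2 is installed. The data frame toy_ships is hypothetical; swap in ships and your own choice of variables.

```r
library(ggplot2)

# Hypothetical stand-in for `ships`; the rows are invented for illustration.
toy_ships <- data.frame(
  year = c(1850, 1850, 1851, 1852, 1852, 1852),
  Rig  = c("Ship", "Bark", "Ship", "Ship", "Bark", "Brig")
)

# One panel per value of Rig, each with its own bar chart of year.
p <- ggplot(toy_ships) + aes(x = year) + geom_bar() + facet_wrap(~Rig)

print(p)
```

Replacing facet_wrap(~Rig) with facet_grid(Rig ~ .) stacks the panels in rows instead, which can make it easier to compare values along a shared x axis.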