Saturday, March 12, 2011

Exploration of a Scatter Plot Visualization


Learn how to explore and interpret a visualization organized around scatter plots.


At first glance, this visualization shows a lot of data in three separate frames. Use the following prompts to guide you through the basics.

  • What three variables appear to contribute to obesity?
  • Note the labels on the axes. What units of measurement were used to construct these graphs?
  • What information becomes available to you when you land on a bubble of data in any frame?
  • Not all of the x-axes start at zero. How does this influence your interpretation of the data when comparing the effects of the three variables?
Isolate the data by region of the US (see color code). Examine each of the four regions, focusing this time on the effect of eating limited vegetables and fruit.

  • What differences, if any, can you find between the different regions with regard to the effect of eating limited vegetables and fruit?
  • What differences in eating habits might you expect in the four regions of the US?
Continue to examine each of the remaining variables by region.
  • Are there any regions in which one of the variables is not as highly correlated with a high level of obesity?
  • How can you tell?
  • Make a general summarizing statement regarding the contributions of each of the three factors for each region.
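One way to put a number on "how highly correlated" is the Pearson correlation coefficient, which runs from -1 to 1. The sketch below is only an illustration: the visualization may or may not report this statistic, the function name pearson_r is made up for this example, and the county percentages are hypothetical values, not data from the viz.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical county percentages, NOT taken from the visualization.
smokers = [18.0, 22.0, 25.0, 30.0]
obesity = [24.0, 27.0, 29.0, 33.0]

r = pearson_r(smokers, obesity)
print(f"r = {r:.3f}")  # values near 1 mean a strong positive relationship
```

A region whose points hug the trendline will give a value of r near 1 or -1, while a loose cloud of points will give a value closer to 0.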
Trendlines have been superimposed on the scatter plots. Examine the trendlines in each of the three frames for any given region.
  • Which variables have slopes that appear to differ from the others?
  • Which variable shows the greatest variability among the four regions in its contribution to obesity? (You may want to view all four regions at once to compare the trendlines.)
  • How can you find the "slope" of the trendline?
  • Which trendline has the lowest slope? What is it?
  • What explanations do you have for this lower slope?
  • What does the "y-intercept" of each trendline tell you?
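If you want to check a slope or y-intercept by hand, the sketch below fits a straight line to a few points. It assumes the trendlines are ordinary least-squares fits, which the visualization does not state; the function name fit_line and the data points are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of products of deviations / sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 0.5x + 15 (not real county data).
xs = [10.0, 20.0, 30.0, 40.0]  # e.g. percent of adults who smoke
ys = [20.0, 25.0, 30.0, 35.0]  # e.g. percent of adults who are obese

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Here the slope says that each additional percentage point on the x-axis goes with half a percentage point more obesity, and the y-intercept is the obesity rate the line predicts where the x variable is zero. Because not all of the x-axes start at zero, that point may lie outside the plotted frame.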
Examine your own state by selecting it from the drop-down menu. Now the visualization gets more personal! Find your county by hovering over the points with your mouse. This will take some hunting and pecking!

Once you find your county, click on it to highlight it in all three frames.
  • How does your county's location compare to the trendline for all the counties in your state for each of the three variables?
  • Which counties show the highest rate of obesity, the highest rate of smokers, and the highest rate of NOT eating veggies and fruits?
  • Using the context of what you know about your state, what reasons can you give for the life-habits of the populations of these less healthy counties?
Create a table of data comparing your county with any other county of interest by Shift-clicking on the counties you want to examine in more detail.
  • How does your county compare to the county with the highest rate of smokers?
  • How does your county compare to the county with the lowest rate of smokers?
Expand your table by Shift-clicking on up to 10 total counties within your state. Highlight the top of the obesity column. The icons at the bottom of the viz now become accessible.
  • Sort your counties by obesity. Which variables have the highest percentages?
  • Sort your counties by smokers. What happens to the obesity rankings?
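The same two sorts can be tried on an exported table. This is a minimal sketch with made-up county names and percentages; the column names are assumptions and may not match the labels in the actual export.

```python
# Hypothetical rows, NOT real county data; column names are assumed.
counties = [
    {"county": "County A", "obesity": 31.0, "smokers": 19.0},
    {"county": "County B", "obesity": 27.5, "smokers": 24.0},
    {"county": "County C", "obesity": 29.0, "smokers": 21.5},
]

# Sort from highest to lowest on each column.
by_obesity = sorted(counties, key=lambda row: row["obesity"], reverse=True)
by_smokers = sorted(counties, key=lambda row: row["smokers"], reverse=True)

obesity_order = [row["county"] for row in by_obesity]
smokers_order = [row["county"] for row in by_smokers]

print("By obesity:", obesity_order)
print("By smokers:", smokers_order)
```

In this made-up table the two orderings are reversed, which is the kind of change in rankings the second prompt asks you to look for.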
Go back to the top, leaving all of your counties still selected, and select "all" states.
  • Examine your selected counties compared to the national display.
  • How do your counties compare to the "big picture" in the cluster of points? To the trendlines?
  • What are your conclusions about the level of obesity in different regions of the nation?
  • Export your data and print the table and images (see bottom icons) of your selected counties as evidence of your exploration, and attach them to your responses to these questions.