class: center, middle, inverse, title-slide

.title[
# PADP 7120 Data Applications in PA
]
.subtitle[
## Data Visualization
]
.author[
### Alex Combs
]
.institute[
### UGA | SPIA | PADP
]
.date[
### Last updated: February 03, 2023
]

---

# Outline

- Choosing the appropriate graphs
- Guidelines of good visualization

---

# Some Data

|major                                                             | grad_median|
|:-----------------------------------------------------------------|-----------:|
|Electrical Engineering Technology                                 |       85000|
|Anthropology And Archeology                                       |       65000|
|Materials Engineering And Materials Science                       |       92000|
|Electrical, Mechanical, And Precision Technologies And Production |       62000|
|Library Science                                                   |       52000|

- Includes 173 graduate degree majors

---

# Choosing appropriate graph

- Which graph might we use if we want to visualize the distribution of median pay?

---

# Histogram

.pull-left[
data:image/s3,"s3://crabby-images/23d8a/23d8a07e54002e4247523288012897864907d878" alt=""<!-- -->
]

.pull-right[

```r
ggplot(college_grad_students, aes(x = grad_median)) +
  geom_histogram(bins = 30)
```

]

---

# Exploratory vs. explanatory

.center[
data:image/s3,"s3://crabby-images/a0769/a0769e84b5adf5b3897d14bd5f1481983d96f028" alt=""
]

---

# An exploratory graph

.pull-left[
data:image/s3,"s3://crabby-images/a20af/a20af1af718968be46f9cfa501bd7a3bdffb9473" alt=""<!-- -->
]

.pull-right[

```r
ggplot(gm_americas, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
```

]

---

# An explanatory graph

.pull-left[
data:image/s3,"s3://crabby-images/d63a1/d63a1d192490e8144c87d1a9b2007cc6e7d97adf" alt=""<!-- -->
]

.pull-right[

```r
ggplot(gm_americas, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth(se = FALSE, linetype = "dashed", color = "gray", span = 1) +
  scale_x_continuous(labels = scales::dollar) +
  labs(title = "Life Expectancy and Wealth in the Americas",
       subtitle = "2007",
       y = "Life Expectancy",
       x = "GDP Per Capita",
       caption = "Source: Gapminder Dataset") +
  theme_classic() +
  theme(text = element_text(size = 14))
```

]

---

# Another graph for an audience

.center[
data:image/s3,"s3://crabby-images/ec53e/ec53e6776faa2dac5dafc80cbdb5f6dea0470a7e" alt=""
]

---

# Qualities of good viz

- Proper graph given the data
- Display an accessible complexity of detail
- Don't distort perception
- Avoid content-free decoration--chart junk
- Maximize data-to-ink ratio

---

# Chart junk?
.center[
data:image/s3,"s3://crabby-images/0e2bf/0e2bf194d877a287e59e4811661538abef11c8f0" alt=":scale 80%"
]

---

# Distortion

- Reproduction of a graph in the *New York Times* about a crisis in democracy

.center[
data:image/s3,"s3://crabby-images/e9872/e98720e0fa8ef4c698517319b1d171d2bafbf6f8" alt=""
]

---

# Distortion

.center[
data:image/s3,"s3://crabby-images/3b92d/3b92dbdc321ab23b1174989f41ca829999965c23" alt=":scale 85%"
]

- The previous graph showed only the percentage of people who gave a rating of 10.

---

# Distortion

.center[
data:image/s3,"s3://crabby-images/0a691/0a69167aa3f4ca11a619430c6a2059d915c1ebff" alt=""
]

---

# Distortion

- Law school enrollment trend

.center[
data:image/s3,"s3://crabby-images/5e6ef/5e6eff33757a1de6de4b671121fc06762e32cbcd" alt=""
]

- Consider how axes change the perception of scale.

---

# Possible distortion

.center[
data:image/s3,"s3://crabby-images/d2407/d2407053fa19ab40558e9e4fe65026f55bb9165f" alt=""
]

- Depending on context, bar charts can lead people to think values inside the bars are more likely than values above them.
- More examples of bad graphs can be found [here](https://badvisualisations.tumblr.com)

---

class: inverse, middle, center

# Human perception has strengths and weaknesses

---

# Decoding numerical visualization

.center[
data:image/s3,"s3://crabby-images/a4eaf/a4eaf496308933a0962316479654785ea6d36217" alt=""
]

- This is why simple bar graphs or scatterplots (i.e., one or two variables) are typically better.

---

# Example

.center[
data:image/s3,"s3://crabby-images/ec188/ec188c1b7294efcc9c0c6f366841f7919e80fc0a" alt=":scale 70%"
]

- Totals and the bottom category are easy to compare, but the other three categories are difficult to compare across bars.
- Categories are also difficult to compare within bars.
- A dodged bar chart would be better.

---

# Example

<img src="Visualizations_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />

---

# Example

<img src="Visualizations_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---

# Decoding categorical visualization

.center[
data:image/s3,"s3://crabby-images/3eaca/3eacaf9fac7fa669b10d568e0ba5d9567023e94c" alt=":scale 70%"
]

---

# Find the blue circle

.center[
data:image/s3,"s3://crabby-images/543cf/543cfdac5794d971fe53cc879be70efeb18156c0" alt=":scale 40%"
]

---

# Find the blue circle

.center[
data:image/s3,"s3://crabby-images/924d7/924d7f679c93b4d47f94a66ce6496f6ecdee7803" alt=":scale 40%"
]

---

# Find the blue circle

.center[
data:image/s3,"s3://crabby-images/4eb77/4eb77c0e846310ede67154280095835d8f7e451f" alt=":scale 40%"
]

---

# Find the blue circle

.center[
data:image/s3,"s3://crabby-images/b494c/b494c2b12c5d99f5caf9329e522c3a72fdadb723" alt=":scale 40%"
]

---

# Find the blue circle

.center[
data:image/s3,"s3://crabby-images/01820/018208f111bdd697613b3b954d3f33375a912028" alt=":scale 40%"
]

---

class: inverse, middle, center

# Let's consider a few graphs you all made in R Chapter 5

---

# Number of counties by state

data:image/s3,"s3://crabby-images/81679/81679193e05d8fb67d71437278caea1c72854f11" alt=""<!-- -->

---

# Number of counties by state

data:image/s3,"s3://crabby-images/c5e02/c5e02e50f985d79169302dec7ead3ea033759f00" alt=""<!-- -->

---

# Number of counties by state

data:image/s3,"s3://crabby-images/47325/47325b6411af0d3d7eaa508a380929af4778351a" alt=""<!-- -->

---

# Scatterplot

data:image/s3,"s3://crabby-images/32f18/32f18934988164f1c218f1edf682f020b4c0824a" alt=""<!-- -->

---

# Scatterplot

data:image/s3,"s3://crabby-images/fcf08/fcf08380b7907b2819f822bb120a29ddaa5a67a4" alt=""<!-- -->

---

# Scatterplot
data:image/s3,"s3://crabby-images/6dcef/6dceffd2ec7d3b79278570274de92e62ec549766" alt=""<!-- -->

---

# Scatterplot

data:image/s3,"s3://crabby-images/d57b8/d57b8699a246adc4b330e1e6de1dec6073e508a5" alt=""<!-- -->

---

# Scatterplot

data:image/s3,"s3://crabby-images/6366e/6366ed481e048540a425ef02ef7181a4cdbeb7a9" alt=""<!-- -->

---

# Scatterplot

data:image/s3,"s3://crabby-images/a39d7/a39d7608b946c57b900bfd4690128deeaa82e784" alt=""<!-- -->
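---

# Sketch: stacked vs. dodged bars

- The stacked bar example earlier in the deck was hard to read because only the totals and the bottom category share a baseline.
- What follows is a minimal sketch of the dodged alternative, not the code behind that slide.
- The `toy` data frame, its column names, and its values are hypothetical, made up purely for illustration.

```r
library(ggplot2)

# Hypothetical toy data (not from the slides): counts by group and category
toy <- data.frame(
  group    = rep(c("A", "B", "C"), each = 4),
  category = rep(c("w", "x", "y", "z"), times = 3),
  count    = c(12, 7, 9, 4, 10, 11, 5, 6, 8, 9, 12, 3)
)

# Stacked bars (the geom_col() default): only the totals and the
# bottom category sit on a common baseline
p_stacked <- ggplot(toy, aes(x = group, y = count, fill = category)) +
  geom_col()

# Dodged bars: every category starts at zero, so bar heights can be
# compared directly within and across groups
p_dodged <- ggplot(toy, aes(x = group, y = count, fill = category)) +
  geom_col(position = "dodge")

p_dodged
```

- The trade-off: dodged bars make individual categories easy to compare, but the group totals are no longer visible.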