Javascript required
Skip to content Skip to sidebar Skip to footer

How to Read Wall Street Journal Articles

We know that the NY Times data visualizations are pretty awesome, but the Wall Street Periodical's data visualizations are nothing to express joy at. In fact, one of my favorite books on information visualization is by Dona Wong, the former graphics editor of Wall Street Journal and a educatee of Ed Tufte at Yale (and yes, she was at the NY Times too).

The Wall Street Periodical Information Visualization

One of my favorite data visualizations from the WSJ is the graphic on Chick-fil-A and the public opinion on same-sexual practice marriage. The WSJ published this graphic in an article after the Chick-fil-A president commented against same-sex activity marriage. The author of this article suggested that Chick-fil-A could afford to express such sentiments because their stores, hence customers, were in regions with bulk of the public'due south opinion was confronting same-sex spousal relationship.

WSJ Data Visualization on Chick-fil-A

What I liked about the data visualization:

  • The graphic itself was very neutral in its color palette and in its conclusions. It left information technology up to the readers to draw whatsoever conclusions.
  • The store counts in each state were easy to meet and compare. There was no beautiful icon for each store.
  • It removed all the distractions from the data visualization. No extraneous colors, no 3-d effects, no unnecessary information.
  • The colors of diverse regions were matched past the colors of the legends for the pie charts.
  • Yes, pie charts:
    • They were suitable to explicate the three opinion categories.
    • They all started with "Favor" at 12 o'clock.
    • They were not exploding or were in iii-D, nor did they have l tiny slices.
    • The legends were printed only one time.
    • A reader could quickly distinguish the public opinion slices using the simple colors.

WSJ Data Visualization in R

Like my previous mail service on NYT's information visualization, I wanted to recreate this WSJ data visualization using R. Here's how I tried to replicate that graphic:

Load favorite libraries

                  library                  (rvest)                  library                  (stringr)                  library                  (dplyr)                  library                  (ggplot2)                  library                  (scales)                  library                  (ggmap)                  library                  (readr)                  library                  (tidyr)                  library                  (zipcode)                  library                  (maps)                  library                  (jsonlite)                  library                  (                  grid                  )                  library                  (gridExtra)                  data                  (zipcode)                  ## load zipcode data in your workspace                

library(rvest) library(stringr) library(dplyr) library(ggplot2) library(scales) library(ggmap) library(readr) library(tidyr) library(zipcode) library(maps) library(jsonlite) library(grid) library(gridExtra) data(zipcode) ## load zipcode information in your workspace

  • rvest to scrape information from the spider web
  • stringr for cord manipulation
  • dplyr for data manipulation
  • ggplot2 for plotting, duh!
  • scales to generate beautiful axis labels
  • ggmap to get and plot the United States map
  • readr to read data from CSVs
  • tidyr to transform data from long to wide form and vice versa
  • zipcode to clean upwardly zipcodes and become the coordinates for each zipcode
  • jsonlite to get data from a json object
  • grid to line up charts
  • gridExtra considering stackoverflow said so 🙂

Get Chick-fil-A locations

I plant out that Chick-fil-A listed all its store locations on this page past each country. Every state has a separate page with store locations from that state.

And so, showtime, I got all the states with a shop in an XML document:

states                  <-                  read_html(                  "https://world wide web.Chick-fil-A.com/Locations/Scan"                  )                

states <- read_html("https://www.Chick-fil-A.com/Locations/Scan")

Then, using the CSS selector values, I parsed the XML to extract all the hyperlink text with the URL to each state:

locations                  <-                  states                  %>%                  html_nodes(                  "commodity ul li a"                  )                  %>%                  html_attr(                  "href"                  )                

locations <- states %>% html_nodes("article ul li a") %>% html_attr("href")

You can utilize Selector Gadget to find CSS selectors of objects on a given web-page. Delight do read rvest'southward documentation for more information on using rvest for web-scraping.

You lot will become a character vector of all the relative urls of each state. Like this:

              ## [1] "/Locations/Browse/AL" "/Locations/Browse/AZ" "/Locations/Browse/AR"            

And then, nosotros demand to bring together this relative URL to the acme domain:

locations                  <-                  paste0(                  "https://www.Chick-fil-A.com", locations)                

locations <- paste0("https://world wide web.Chick-fil-A.com", locations)

At present, nosotros need to scrape every state page and extract the zipcode of all the stores on that page. I wrote a function to help usa with that:

extract_location                  <-                  function                  (                  url                  )                  {                  read_html(                  url                  )                  %>%                  html_nodes(                  ".location p"                  )                  %>%                  html_text(                  )                  %>%                  str_extract(                  "\\d{5}(?=\northward)"                  )                  }                

extract_location <- function(url){ read_html(url) %>% html_nodes(".location p") %>% html_text() %>% str_extract("\\d{5}(?=\n)") }

The function will download each URL, find the div with the location class, convert the lucifer to text, and lastly extract the five digit number (zipcode) using regular expression.

We need to laissez passer the URL of each state to this function. We can achieve that by using sapply role, but be patient as this will accept a minute or two:

locationzips                  <-                  sapply                  (locations, extract_location, Use.NAMES                  =                  FALSE)                

locationzips <- sapply(locations, extract_location, Apply.NAMES = False)

Clean upwardly the store zip

To make the to a higher place data usable for this data visualization, nosotros need to put the zipcode list in a data frame.

locationzips_df                  <-                  data.frame                                    (zips                  =                  unlist                  (locationzips), stringsAsFactors                  =                  FALSE)                

locationzips_df <- information.frame(zips = unlist(locationzips), stringsAsFactors = FALSE)

If this postal service goes viral, I wouldn't want millions of people trying to download this information from Chick-fil-A's site. I saved you these steps and stored the zips in a csv on my dropbox (find the raw=i parameter at the end to get the file direct and not the preview from dropbox):

locationzips_df                  <-                  read_csv(                  "https://www.dropbox.com/s/x8xwdx61go07e4e/chickfilalocationzips.csv?raw=ane", col_types                  =                  "c"                  )                

locationzips_df <- read_csv("https://www.dropbox.com/s/x8xwdx61go07e4e/chickfilalocationzips.csv?raw=i", col_types = "c")

In example nosotros got some bad data, clean up the zipcodes using clean.zipcodes function from the zipcode package.

locationzips_df$zips                  <-                  make clean.zipcodes                  (locationzips_df$zips)                

locationzips_df$zips <- make clean.zipcodes(locationzips_df$zips)

Merge coordinate data

Next, we demand to get the coordinate data on each zipcode. Again, the dataset from the zipcode package provides us with that data.

locationzips_df                  <-                  merge                  (locationzips_df, select(zipcode, zip, latitude, longitude, state), by.x                  =                  'zips', past.y                  =                  'aught', all.ten                  =                  True)                

locationzips_df <- merge(locationzips_df, select(zipcode, zip, latitude, longitude, state), by.x = 'zips', by.y = 'zip', all.x = TRUE)

Calculate the total number of stores by country

This is really easy with some dplyr magic:

rest_cnt_by_state                  <-                  count(locationzips_df, country)                

rest_cnt_by_state <- count(locationzips_df, state)

This information frame will look something like this:

              ## # A tibble: 6 × 2 ##   country     n ##                                                      ## 1    AL    77 ## 2    AR    32 ## three    AZ    35 ## 4    CA    90 ## five    CO    47 ## half dozen    CT     seven                                          

Gather the public stance data

The PRRI portal shows various survey data on American values. The latest year with data is 2015. I dug into the HTML to detect the path to save the JSON data:

region_opinion_favor                  <-                  fromJSON(                  "http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=ii&year=2015&topic=lgbt"                  )$regions region_opinion_oppose                  <-                  fromJSON(                  "http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=3&yr=2015&topic=lgbt"                  )$regions

region_opinion_favor <- fromJSON("http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=ii&twelvemonth=2015&topic=lgbt")$regions region_opinion_oppose <- fromJSON("http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=3&year=2015&topic=lgbt")$regions

Adjacent, I added a field to note the opinion:

region_opinion_favor$opi                  <-                  'favor'                  region_opinion_oppose$opi                  <-                  'oppose'                

region_opinion_favor$opi <- 'favor' region_opinion_oppose$opi <- 'oppose'

And then, I manipulated this data to make information technology usable for the pie charts:

region_opinion                  <-                  bind_rows(region_opinion_favor, region_opinion_oppose)                  %>%                  filter                  (region                  !=                  'national'                  )                  %>%                  mutate(region                  =                  recode(region,                  "i"                  =                  "Northeast",                  "2"                  =                  "Midwest",                  "3"                  =                  "Due south",                  "four"                  =                  "W"                  )                  )                  %>%                  spread(key                  =                  opi, value                  =                  percent)                  %>%                  mutate(other                  =                  100                  -                  favor                  -                  oppose)                  %>%                  assemble(key                  =                  opi, value                  =                  percent,                  -region,                  -                  sort                  )                  %>%                  select(                  -                  sort                  )                  %>%                  mutate(opi                  =                  factor                  (opi,                  levels                  =                  c                  (                  'oppose',                  'other',                  'favor'                  ),                  ordered                  =                  True)                  )                

region_opinion <- bind_rows(region_opinion_favor, region_opinion_oppose) %>% filter(region != 'national') %>% mutate(region = recode(region, "1" = "Northeast", "ii" = "Midwest", "3" = "Southward", "iv" = "West")) %>% spread(key = opi, value = per centum) %>% mutate(other = 100 - favor - oppose) %>% gather(cardinal = opi, value = percent, -region, -sort) %>% select(-sort) %>% mutate(opi = cistron(opi, levels = c('oppose', 'other', 'favor'), ordered = TRUE))

There's a lot of stuff going on in this code. Let me explicate:

  1. Nosotros bind the two data frames we created
  2. We remove the data row for the national average
  3. We recode the numerical regions to text regions
  4. Nosotros spread the information from long format to wide format. Read tidyr documentation for farther explanation.
  5. Now that the opinions are two columns, we create some other cavalcade for the remaining/unknown opinions.
  6. Nosotros bring everything dorsum to long form using gather.
  7. We remove the sort column.
  8. We create an ordered factor for the opinion. This is so that the oppose stance shows up at the top on the charts.

Afterward all that, nosotros get a data frame that looks like this:

              ##      region    opi percent ## 1 Northeast  favor      63 ## 2   Midwest  favor      54 ## three     South  favor      46 ## 4      Due west  favor      59 ## v Northeast oppose      29 ## 6   Midwest oppose      38            

The WSJ data visualization did one more than useful thing: it ordered the pie charts with the regions with most opposition to aforementioned-sex activity marriage at the tiptop. The way to handle this kind of stuff in ggplot is to order the underlying factors.

regions_oppose_sorted                  <-                  arrange(                  filter                  (region_opinion, opi                  ==                  'oppose'                  ), desc(percent)                  )$region region_opinion                  <-                  mutate(region_opinion, region                  =                  factor                  (region,                  levels                  =                  regions_oppose_sorted,                  ordered                  =                  TRUE)                  )                

regions_oppose_sorted <- arrange(filter(region_opinion, opi == 'oppose'), desc(per centum))$region region_opinion <- mutate(region_opinion, region = factor(region, levels = regions_oppose_sorted, ordered = True))

Now that our dataframe is fix, we tin create the pie charts using ggplot.

Create the pie charts

To create the pie charts in ggplot, nosotros actually create stacked bar graphs first and so change the polar coordinates. We could also utilise the base R plotting functions, but I wanted to test the limits of ggplot.

opin_pie_charts                  <-                  ggplot(region_opinion, aes(10                  =                  one, y                  =                  percent, make full                  =                  opi)                  )                  +                  geom_bar(width                  =                  one,stat                  =                  "identity", size                  =                  0.iii, color                  =                  "white", alpha                  =                  0.85, evidence.legend                  =                  Faux)                

opin_pie_charts <- ggplot(region_opinion, aes(10 = one, y = percent, fill = opi)) + geom_bar(width = 1,stat = "identity", size = 0.iii, color = "white", blastoff = 0.85, show.fable = False)

At that place are few things worth explaining here:

  • We set up the aes 10 value to 1 equally we really don't accept an x-axis.
  • We set the aes fill value to the variable opi that has values of oppose, favor and other.
  • We set the bar width to i. I chose this value after some iterations.
  • Nosotros set the stat to identity considering nosotros don't ggplot to summate the bar proportions.
  • Nosotros set the color to white – this is the color of the separator between each stack of the bars.
  • We set the size to 0.3 that gives us reasonable spacing between the stacks. I did so to match the WSJ visualization.

This is the chart we get:

And, your reaction totally may be like this:

first time you see the default ggplot stacked bar colors

first fourth dimension y'all meet the default ggplot stacked bar colors

Well, relax, Kevin. This will get better. I hope.

Adjacent, nosotros add facets to create a separate chart for each region as well as adding the polar coordinates to create the pie chart. I as well added theme_void to remove all the gridlines, centrality lines, and labels.

opin_pie_charts                  <-                  opin_pie_charts                  +                  facet_grid(region ~ .,                  switch                  =                  "y"                  )                  +                  coord_polar(                  "y"                  )                  +                  theme_void(                  )                

opin_pie_charts <- opin_pie_charts + facet_grid(region ~ ., switch = "y") + coord_polar("y") + theme_void()

Resulting in this:

Feeling better, Kevin?


Let's change the colors, which I found using prototype color picker, of the slices to match with the WSJ's data visualization:

opin_pie_charts                  <-                  opin_pie_charts                  +                  scale_fill_manual(values                  =                  c                  (                  "favor"                  =                  "black",                  "oppose"                  =                  "#908E89",                  "other"                  =                  "#F7F3E8"                  )                  )                

opin_pie_charts <- opin_pie_charts + scale_fill_manual(values = c("favor" = "black", "oppose" = "#908E89", "other" = "#F7F3E8"))

Giving us this:

Next, add together information tables. I simply picked some good values where it would make sense to evidence the labels:

                  #add together labels                  opin_pie_charts                  <-                  opin_pie_charts                  +                  geom_text(color                  =                  "white", size                  =                  rel(                  3.5                  ), aes(x                  =                  1, y                  =                  ifelse                  (opi                  ==                  'favor',                  15,                  85                  ), label                  =                  ifelse                  (opi                  ==                  'other', NA, paste0(str_to_title(opi),                  "\due north", percent(per centum/                  100                  )                  )                  )                  )                  )                

#add together labels opin_pie_charts <- opin_pie_charts + geom_text(colour = "white", size = rel(3.five), aes(x = 1, y = ifelse(opi == 'favor', 15, 85), label = ifelse(opi == 'other', NA, paste0(str_to_title(opi), "\due north", percentage(percent/100)))))

Resulting in this:

Side by side, calculation the plot championship as given in the WSJ data visualization:

opin_pie_charts                  <-                  opin_pie_charts                  +                  ggtitle(label                  =                  "Survey of views of gay\northmarriage, by region"                  )                  +                  theme(plot.championship                  =                  element_text(size                  =                  10, confront                  =                  "bold", hjust                  =                  one                  )                  )                

opin_pie_charts <- opin_pie_charts + ggtitle(characterization = "Survey of views of gay\nmarriage, past region") + theme(plot.championship = element_text(size = 10, face = "bold", hjust = 1))

And, the last step:

  • change the background color of the region labels,
  • emphasize the region labels
  • change the console color as well as the plot background color
opin_pie_charts                  <-                  opin_pie_charts                  +                  theme(strip.background                  =                  element_rect(fill                  =                  "#E6D9B7"                  ), strip.text                  =                  element_text(face up                  =                  "assuming"                  ), plot.background                  =                  element_rect(fill                  =                  "#F6F4E8", color                  =                  NA), panel.background                  =                  element_rect(fill                  =                  "#F6F4E8", colour                  =                  NA)                  )                

opin_pie_charts <- opin_pie_charts + theme(strip.groundwork = element_rect(fill = "#E6D9B7"), strip.text = element_text(face = "bold"), plot.background = element_rect(fill = "#F6F4E8", colour = NA), panel.background = element_rect(fill up = "#F6F4E8", colour = NA))

Getting us actually close to the WSJ pie charts:

I should say that a pie nautical chart in ggplot is difficult to customize considering you lose one axis. Next fourth dimension, I would try the base R pie chart.

Creating the map

Phew! That was some work to make the pie charts look similar to the WSJ information visualization. At present, more fun: the maps.

First, permit'south become some data ready.

R comes with various data points on each of u.s.. Then, nosotros will become the centers of u.s., abbreviations, regions, and state names.

state_centers_abb                  <-                  data.frame                                    (state                  =                  tolower                  (                  state.name                                    ),                                  stateabr                  =                  state.abb                  ,                                  center_long                  =                  state.center                  $x, center_lat                  =                  country.center                  $y,                                 region                  =                  state.region                                    )                

state_centers_abb <- information.frame(state = tolower(state.proper name), stateabr = state.abb, center_long = state.center$ten, center_lat = state.center$y, region = land.region)

Which gives united states of america this:

              ##        country stateabr center_long center_lat region ## 1    alabama       AL    -86.7509    32.5901  South ## 2     alaska       AK   -127.2500    49.2500   Due west ## three    arizona       AZ   -111.6250    34.2192   West ## 4   arkansas       AR    -92.2992    34.7336  South ## 5 california       CA   -119.7730    36.5341   Due west ## half dozen   colorado       CO   -105.5130    38.6777   West            

The regions in this data prepare have a value of Northward Fundamental; allow's change that to Midwest.

state_centers_abb                  <-                  mutate(state_centers_abb, region                  =                  recode(region,                  "North Fundamental"                  =                  "Midwest"                  )                  )                

state_centers_abb <- mutate(state_centers_abb, region = recode(region, "Northward Cardinal" = "Midwest"))

Next, allow's go the polygon data on each country and merge it with the state centers data frame:

us_map                  <-                  map_data(                  "state"                  )                  us_map                  <-                  rename(us_map, state                  =                  region)                  us_map                  <-                  inner_join(us_map, state_centers_abb,                  past                  =                  c                  (                  "state"                  =                  "state"                  )                  )                  us_map                  <-                  us_map[                  social club                  (us_map$order),                  ]                

us_map <- map_data("state") us_map <- rename(us_map, state = region) us_map <- inner_join(us_map, state_centers_abb, by = c("state" = "state")) us_map <- us_map[order(us_map$order), ]

While plotting the polygon data using ggplot, you take to make sure that the order column of the polygon data frame is ordered, otherwise, you volition get some wacky looking shapes.

With some prep work done, let the fun begin. Let's create the base map:

base_plot                  <-                  ggplot(                  data                  =                  left_join(us_map, rest_cnt_by_state,                  by                  =                  c                  (                  "stateabr"                  =                  "state"                  )                  ), aes(x                  =                  long, y                  =                  lat, group                  =                  group,                  order                  =                  society, fill up                  =                  region)                  )                  +                  geom_polygon(color                  =                  "white"                  )                

base_plot <- ggplot(information = left_join(us_map, rest_cnt_by_state, by = c("stateabr" = "land")), aes(ten = long, y = lat, grouping = grouping, order = order, fill = region)) + geom_polygon(colour = "white")

You will note that I've joined the polygon data frame with the counts of restaurants in each land. Also, similar to the stacked bar graphs, I've separated each country with the white color, giving us this:

I already see Kevin going crazy!

I again used the epitome color picker to select the colors from the WSJ data visualization and assigned them manually to each region. I likewise removed the legend:

base_plot                  <-                  base_plot                  +                  scale_fill_manual(values                  =                  c                  (                  "West"                  =                  "#E6D9B7",                  "South"                  =                  "#F2DBD5",                  "Northeast"                  =                  "#D8E7D7",                  "Midwest"                  =                  "#D6DBE3"                  ), guide                  =                  Simulated)                

base_plot <- base_plot + scale_fill_manual(values = c("West" = "#E6D9B7", "South" = "#F2DBD5", "Northeast" = "#D8E7D7", "Midwest" = "#D6DBE3"), guide = FALSE)

Generating this:

Next, remove all the distractions and modify the background color:

base_plot                  <-                  base_plot                  +                  theme(axis.championship                  =                  element_blank(                  ),                                   axis.ticks                  =                  element_blank(                  ),                                   axis.line                  =                  element_blank(                  ),                                   axis.text                  =                  element_blank(                  ),                                   panel.grid                  =                  element_blank(                  ),                                   console.border                  =                  element_blank(                  ),                                   plot.background                  =                  element_rect(fill                  =                  "#F6F4E8"                  ),                                   panel.background                  =                  element_rect(fill up                  =                  "#F6F4E8"                  )                  )                

base_plot <- base_plot + theme(axis.title = element_blank(), axis.ticks = element_blank(), axis.line = element_blank(), axis.text = element_blank(), panel.grid = element_blank(), panel.border = element_blank(), plot.background = element_rect(fill up = "#F6F4E8"), panel.groundwork = element_rect(fill = "#F6F4E8"))

Giving us this:

Side by side upwards is adding the circles for restaurant counts.

base_plot                  <-                  base_plot                  +                  geom_point(aes(size                  =                  n, x                  =                  center_long, y                  =                  center_lat), shape                  =                  ane, stroke                  =                  0.5, colour                  =                  "grey60", inherit.aes                  =                  FALSE)                  +                  scale_size_area(max_size                  =                  18, breaks                  =                  c                  (                  0,                  10,                  100,                  300                  ), guide                  =                  FALSE)                  +                  scale_shape(solid                  =                  False)                

base_plot <- base_plot + geom_point(aes(size = northward, x = center_long, y = center_lat), shape = 1, stroke = 0.5, color = "grey60", inherit.aes = Faux) + scale_size_area(max_size = 18, breaks = c(0, x, 100, 300), guide = False) + scale_shape(solid = FALSE)

OK. In that location is a lot of stuff going on hither. First, we employ geom_point to create the circles at the center of each state. The size of the circle is dictated past the number of restaurants in each state. The stroke parameters controls the thickness of the circle. Nosotros besides are using inherit.aes = FALSE to create new aesthetics for this geom.

The scale_size_area is very of import considering as the documentation says:

scale_size scales area, scale_radius scales radius. The size aesthetic is nigh unremarkably used for points and text, and humans perceive the surface area of points (non their radius), and so this provides for optimal perception. scale_size_area ensures that a value of 0 is mapped to a size of 0.

Don't allow this happen to you! Employ the area and non the radius.

I as well increased the size of the circles and gave breaks manually to the circle sizes.

Generating this:

Since we removed the legend for the circumvolve size and the WSJ graphic had one, allow's endeavour to add together that back in. This was challenging and hardly accurate. I played with some sizes and eyeballed to match the radii of different circles. I don't recommend this at all. This is better handled in postal service-processing using Inkscape or Illustrator.

Please add the circles every bit a legend in post-processing. Adding circumvolve grobs is not authentic and doesn't produce the desired results.

base_plot                  <-                  base_plot                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit of measurement(                  i.4,                  "lines"                  ), gp                  =                  gpar(fill                  =                  "gray87", alpha                  =                  0.5                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  48                  , ymax                  =                  49                  )                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit(                  0.7,                  "lines"                  ), gp                  =                  gpar(make full                  =                  "gray85", alpha                  =                  0.87                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  47.4, ymax                  =                  48                  )                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit(                  0.3,                  "lines"                  ), gp                  =                  gpar(fill                  =                  "gray80", alpha                  =                  0.95                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  46.5, ymax                  =                  48                  )                

base_plot <- base_plot + annotation_custom(grob = circleGrob(r = unit of measurement(1.4, "lines"), gp = gpar(fill = "gray87", blastoff = 0.5)), xmin = -79, xmax = -77, ymin = 48 , ymax = 49) + annotation_custom(grob = circleGrob(r = unit of measurement(0.7, "lines"), gp = gpar(fill = "gray85", blastoff = 0.87)), xmin = -79, xmax = -77, ymin = 47.4, ymax = 48) + annotation_custom(grob = circleGrob(r = unit(0.3, "lines"), gp = gpar(fill = "gray80", alpha = 0.95)), xmin = -79, xmax = -77, ymin = 46.five, ymax = 48)

Giving united states of america:

Let'south add the title:

base_plot                  <-                  base_plot                  +                  ggtitle(label                  =                  "Number of Chick-fil-A resturants by state and region"                  )                  +                  theme(plot.title                  =                  element_text(size                  =                  10, face                  =                  "bold", hjust                  =                  0.fourteen, vjust                  =                  -                  5                  )                  )                

base_plot <- base_plot + ggtitle(label = "Number of Chick-fil-A resturants by state and region") + theme(plot.title = element_text(size = 10, confront = "bold", hjust = 0.14, vjust = -5))

Merging united states and pie charts

This is the easiest step of all. We use the grid.adapt function to merge the two plots to create our own WSJ information visualization in R.

                  png                  (                  "wsj-Chick-fil-A-data-visualization-r-plot.png", width                  =                  800, top                  =                  500,                  units                  =                  "px"                  )                  grid.suit                  (base_plot, opin_pie_charts, widths=                  c                  (                  4,                  1                  ),                  nrow                  =                  1                  )                  dev.off                                    (                  )                

png("wsj-Chick-fil-A-data-visualization-r-plot.png", width = 800, height = 500, units = "px") grid.adapt(base_plot, opin_pie_charts, widths= c(4, 1), nrow = 1) dev.off()

Here it is:
WSJ infographics in R

What do yous call back?

Some things I couldn't brand work:

  • Alter the colour of the facet of the pie charts. I toyed with strip.text settings, but I couldn't change all the colors. Possibly, it is easy to do so in the base R pie charts.
  • The circumvolve-in-circle legend. I got the circles, but not the numbers.
  • The 'other' category label in the pie chart.
  • Brand the circles on the maps look less jagged-y.

Does the data visualization still work?

Of course, the public opinion over same-sex matrimony has changed across usa and Chick-fil-A has opened up more stores across the country.

It however does wait similar that bigger circles are in the regions where people oppose the same-sex marriage.

I wanted to question that. According to the Pew research center information, only 33% of Republicans favor same-sex matrimony. And then, if we were to plot the 2016 U.S. Presidential elections past county and plot each zipcode of Chick-fil-A stores, nosotros tin can see whether there are more stores in counties voting for Donald Trump.

Permit's become started so.

Go the county level information. Tony McGovern already did all the hard work:

cnty_results                  <-                  read_csv(                  "https://raw.githubusercontent.com/tonmcg/County_Level_Election_Results_12-16/master/2016_US_County_Level_Presidential_Results.csv"                  )                  cnty_results                  <-                  mutate(cnty_results, per_point_diff                  =                  per_gop                  -                  per_dem,                        county_name                  =                  tolower                  (str_replace(county_name,                  " County",                  ""                  )                  ),                        county_name                  =                  tolower                  (str_replace(county_name,                  " parish",                  ""                  )                  )                  )                

cnty_results <- read_csv("https://raw.githubusercontent.com/tonmcg/County_Level_Election_Results_12-16/master/2016_US_County_Level_Presidential_Results.csv") cnty_results <- mutate(cnty_results, per_point_diff = per_gop - per_dem, county_name = tolower(str_replace(county_name, " County", "")), county_name = tolower(str_replace(county_name, " parish", "")))

Become the canton map polygon data:

county_map                  <-                  map_data(                  "canton"                  )                  county_map                  <-                  rename(county_map, state                  =                  region, county                  =                  subregion)                  county_map$stateabr                  <-                  land.abb                                    [                  match                  (county_map$state,                  tolower                  (                  state.proper name                                    )                  )                  ]                

county_map <- map_data("canton") county_map <- rename(county_map, state = region, county = subregion) county_map$stateabr <- state.abb[match(county_map$state, tolower(state.name))]

Join the polygon with county results:

cnty_map_results                  <-                  left_join(county_map, cnty_results,                  by                  =                  c                  (                  "stateabr"                  =                  "state_abbr",                  "county"                  =                  "county_name"                  )                  )                  %>%                  accommodate(                  guild, group)                

cnty_map_results <- left_join(county_map, cnty_results, by = c("stateabr" = "state_abbr", "county" = "county_name")) %>% adapt(order, group)

Plot the results and counties:

cnty_plot                  <-                  ggplot(                  data                  =                  cnty_map_results, aes(x                  =                  long, y                  =                  lat, grouping                  =                  group,                  order                  =                  society                  )                  )                  +                  geom_polygon(aes(fill up                  =                  per_point_diff), colour                  =                  "gray90", size                  =                  0.1                  )                

cnty_plot <- ggplot(data = cnty_map_results, aes(x = long, y = lat, group = grouping, order = order)) + geom_polygon(aes(fill = per_point_diff), color = "gray90", size = 0.1)

This what we become:

Now, we fill up the map with red and blue colors for the republican and democratic voting counties:

cnty_plot                  <-                  cnty_plot                  +                  scale_fill_gradient2(midpoint                  =                  0, low                  =                  "#2062A1", mid                  =                  "white", high                  =                  "ruddy", na.value                  =                  "white", breaks                  =                  seq                  (from                  =                  0, to                  =                  ane, length.out                  =                  vii                  ), guide                  =                  FALSE)                

cnty_plot <- cnty_plot + scale_fill_gradient2(midpoint = 0, low = "#2062A1", mid = "white", high = "red", na.value = "white", breaks = seq(from = 0, to = ane, length.out = seven), guide = FALSE)

Generating this:

Permit'south remove all the distractions and add the store locations past zipcode:

cnty_plot                  <-                  cnty_plot                  +                  theme_void(                  )                  +                  geom_point(                  data                  =                  locationzips_df, aes(x                  =                  longitude, y                  =                  latitude), size                  =                  0.3, alpha                  =                  0.2, inherit.aes                  =                  FALSE)                

cnty_plot <- cnty_plot + theme_void() + geom_point(information = locationzips_df, aes(x = longitude, y = latitude), size = 0.iii, alpha = 0.2, inherit.aes = Imitation)

Giving u.s.a. this:

wsj-chick-fil-a-locations-presidential-elections-county-data-visualization

Very dissimilar picture, wouldn't you say? It actually looks similar that the store locations are present in more than autonomous leaning counties or at least the counties that are as divided betwixt the republican and democratic votes.

Of course, it is possible that I messed something upwardly. Simply, I tin can conclude two things based on this chart:

  • Aggregation can mislead us to come across the non-existent patterns
  • A person's political identification or views has nothing to do with the nutrient he or she likes. And, the Chick-fil-A leaders know that.

What exercise you lot think?

Consummate Script

                  library                  (rvest)                  library                  (stringr)                  library                  (dplyr)                  library                  (ggplot2)                  library                  (scales)                  library                  (ggmap)                  library                  (readr)                  library                  (tidyr)                  library                  (zipcode)                  library                  (maps)                  library                  (jsonlite)                  library                  (                  grid                  )                  library                  (gridExtra)                  data                  (zipcode)                  ## load zipcode data in your workspace                  states                  <-                  read_html(                  "https://world wide web.Chick-fil-A.com/Locations/Browse"                  )                  locations                  <-                  states                  %>%                  html_nodes(                  "article ul li a"                  )                  %>%                  html_attr(                  "href"                  )                  locations                  <-                  paste0(                  "https://www.Chick-fil-A.com", locations)                  extract_location                  <-                  function                  (                  url                  )                  {                  read_html(                  url                  )                  %>%                  html_nodes(                  ".location p"                  )                  %>%                  html_text(                  )                  %>%                  str_extract(                  "\\d{v}(?=\n)"                  )                  }                  locationzips                  <-                  sapply                  (locations, extract_location, Use.NAMES                  =                  Simulated)                  locationzips_df                  <-                  data.frame                                    (zips                  =                  unlist                  (locationzips), stringsAsFactors                  =                  Fake)                  #locationzips_df <- read_csv("https://www.dropbox.com/s/x8xwdx61go07e4e/chickfilalocationzips.csv?raw=1", col_types = "c")                  locationzips_df$zips                  <-                  clean.zipcodes                  (locationzips_df$zips)                  locationzips_df                  <-                  merge                  (locationzips_df, select(zipcode, zip, latitude, longitude, state), by.x                  =                  'zips', by.y                  =                  'zip', all.ten                  =                  TRUE)                  est_cnt_by_state                  <-                  count(locationzips_df, state)                  head                  (rest_cnt_by_state)                  region_opinion_favor                  <-                  fromJSON(                  "http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=2&year=2015&topic=lgbt"                  )$regions region_opinion_oppose                  <-                  fromJSON(                  "http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=3&year=2015&topic=lgbt"                  )$regions   region_opinion_favor$opi                  <-                  'favor'                  region_opinion_oppose$opi                  <-                  'oppose'                  region_opinion                  <-                  bind_rows(region_opinion_favor, region_opinion_oppose)                  %>%                  filter                  (region                  !=                  'national'                  )                  %>%                  mutate(region                  =                  recode(region,                  "one"                  =                  "Northeast",                  "ii"                  =                  "Midwest",                  "3"                  =                  "Due south",                  "4"                  =                  "West"                  )                  )                  %>%                  spread(key                  =                  opi, value                  =                  percent)                  %>%                  mutate(other                  =                  100                  -                  favor                  -                  oppose)                  %>%                  gather(key                  =                  opi, value                  =                  percentage,                  -region,                  -                  sort                  )                  %>%                  select(                  -                  sort                  )                  %>%                  mutate(opi                  =                  factor                  (opi,                  levels                  =                  c                  (                  'oppose',                  'other',                  'favor'                  ),                  ordered                  =                  TRUE)                  )                  head                  (region_opinion)                  regions_oppose_sorted                  <-                  suit(                  filter                  (region_opinion, opi                  ==                  'oppose'                  ), desc(percent)                  )$region region_opinion                  <-                  mutate(region_opinion, region                  =                  cistron                  (region,                  levels                  =                  regions_oppose_sorted,                  ordered                  =                  TRUE)                  )                  opin_pie_charts                  <-                  ggplot(region_opinion, aes(x                  =                  1, y                  =                  percent, fill up                  =                  opi)                  )                  +                  geom_bar(width                  =                  i,stat                  =                  "identity", size                  =                  0.iii, color                  =                  "white", alpha                  =                  0.85, show.legend                  =                  FALSE)                  plot                  (opin_pie_charts)                  opin_pie_charts                  <-                  opin_pie_charts                  +                  facet_grid(region ~ .,                  switch                  =                  "y"                  )                  +                  coord_polar(                  "y"                  )                  +                  theme_void(                  )                  plot                  (opin_pie_charts)                  opin_pie_charts                  <-                  opin_pie_charts                  +                  scale_fill_manual(values                  =                  c                  (                  "favor"                  =                  "blackness",                  "oppose"                  =                  "#908E89",                  "other"                  =                  "#F7F3E8"                  )                  )                  plot                  (opin_pie_charts)                  #add labels                  opin_pie_charts                  <-                  opin_pie_charts                  +                  geom_text(colour                  =                  "white", size                  =                  rel(                  3.v                  ), aes(ten                  =                  1, y                  =                  ifelse                  (opi                  ==                  'favor',                  15,                  85                  ), label                  =                  ifelse                  (opi                  ==                  'other', NA, paste0(str_to_title(opi),                  "\n", percent(percentage/                  100                  )                  )                  )                  )                  )                  plot                  (opin_pie_charts)                  opin_pie_charts                  <-                  opin_pie_charts                  +                  ggtitle(label                  =                  "Survey of views of gay\nmarriage, past region"                  )                  +                  theme(plot.title                  =                  element_text(size                  =                  10, face                  =                  "bold", hjust                  =                  i                  )                  )                  plot                  (opin_pie_charts)                  opin_pie_charts                  <-                  opin_pie_charts                  +                  theme(strip.background                  =                  element_rect(fill                  =                  "#E6D9B7"                  ), strip.text                  =                  element_text(face up                  =                  "bold"                  ), plot.groundwork                  =                  element_rect(fill                  =                  "#F6F4E8", color                  =                  NA), panel.background                  =                  element_rect(fill up                  =                  "#F6F4E8", colour                  =                  NA)                  )                  plot                  (opin_pie_charts)                  state_centers_abb                  <-                  data.frame                                    (state                  =                  tolower                  (                  state.proper name                                    ),                                  stateabr                  =                  state.abb                  ,                                  center_long                  =                  country.center                  $x, center_lat                  =                  state.center                  $y,                                 region                  =                  state.region                                    )                  head                  (state_centers_abb)                  state_centers_abb                  <-                  mutate(state_centers_abb, region                  =                  recode(region,                  "N Central"                  =                  "Midwest"                  )                  )                  us_map                  <-                  map_data(                  "state"                  )                  us_map                  <-                  rename(us_map, land                  =                  region)                  us_map                  <-                  inner_join(us_map, state_centers_abb,                  by                  =                  c                  (                  "state"                  =                  "land"                  )                  )                  us_map                  <-                  us_map[                  lodge                  (us_map$lodge),                  ]                  base_plot                  <-                  ggplot(                  data                  =                  left_join(us_map, rest_cnt_by_state,                  by                  =                  c                  (                  "stateabr"                  =                  "country"                  )                  ), aes(x                  =                  long, y                  =                  lat, grouping                  =                  group,                  order                  =                  order, fill up                  =                  region)                  )                  +                  geom_polygon(color                  =                  "white"                  )                  plot                  (base_plot)                  base_plot                  <-                  base_plot                  +                  scale_fill_manual(values                  =                  c                  (                  "West"                  =                  "#E6D9B7",                  "South"                  =                  "#F2DBD5",                  "Northeast"                  =                  "#D8E7D7",                  "Midwest"                  =                  "#D6DBE3"                  ), guide                  =                  FALSE)                  plot                  (base_plot)                  base_plot                  <-                  base_plot                  +                  theme(axis.championship                  =                  element_blank(                  ),                                   axis.ticks                  =                  element_blank(                  ),                                   axis.line                  =                  element_blank(                  ),                                   axis.text                  =                  element_blank(                  ),                                   console.grid                  =                  element_blank(                  ),                                   panel.border                  =                  element_blank(                  ),                                   plot.background                  =                  element_rect(fill up                  =                  "#F6F4E8"                  ),                                   panel.background                  =                  element_rect(fill                  =                  "#F6F4E8"                  )                  )                  plot                  (base_plot)                  base_plot                  <-                  base_plot                  +                  geom_point(aes(size                  =                  northward, x                  =                  center_long, y                  =                  center_lat), shape                  =                  1, stroke                  =                  0.5, color                  =                  "grey60", inherit.aes                  =                  FALSE)                  +                  scale_size_area(max_size                  =                  xviii, breaks                  =                  c                  (                  0,                  10,                  100,                  300                  ), guide                  =                  Simulated)                  +                  scale_shape(solid                  =                  Imitation)                  plot                  (base_plot)                  base_plot                  <-                  base_plot                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit of measurement(                  1.4,                  "lines"                  ), gp                  =                  gpar(fill                  =                  "gray87", alpha                  =                  0.5                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  48                  , ymax                  =                  49                  )                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit of measurement(                  0.7,                  "lines"                  ), gp                  =                  gpar(fill                  =                  "gray85", alpha                  =                  0.87                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  47.four, ymax                  =                  48                  )                  +                  annotation_custom(grob                  =                  circleGrob(r                  =                  unit(                  0.iii,                  "lines"                  ), gp                  =                  gpar(fill                  =                  "gray80", alpha                  =                  0.95                  )                  ), xmin                  =                  -                  79, xmax                  =                  -                  77, ymin                  =                  46.5, ymax                  =                  48                  )                  plot                  (base_plot)                  base_plot                  <-                  base_plot                  +                  ggtitle(label                  =                  "Number of Chick-fil-A resturants by state and region"                  )                  +                  theme(plot.championship                  =                  element_text(size                  =                  10, face                  =                  "bold", hjust                  =                  0.xiv, vjust                  =                  -                  5                  )                  )                  png                  (                  "wsj-Chick-fil-A-data-visualization-r-plot.png", width                  =                  800, summit                  =                  500,                  units                  =                  "px"                  )                  filigree.conform                  (base_plot, opin_pie_charts, widths=                  c                  (                  4,                  1                  ),                  nrow                  =                  1                  )                  dev.off                                    (                  )                  cnty_results                  <-                  read_csv(                  "https://raw.githubusercontent.com/tonmcg/County_Level_Election_Results_12-16/master/2016_US_County_Level_Presidential_Results.csv"                  )                  cnty_results                  <-                  mutate(cnty_results, per_point_diff                  =                  per_gop                  -                  per_dem,                        county_name                  =                  tolower                  (str_replace(county_name,                  " County",                  ""                  )                  ),                        county_name                  =                  tolower                  (str_replace(county_name,                  " parish",                  ""                  )                  )                  )                  county_map                  <-                  map_data(                  "canton"                  )                  county_map                  <-                  rename(county_map, state                  =                  region, county                  =                  subregion)                  county_map$stateabr                  <-                  state.abb                                    [                  lucifer                  (county_map$state,                  tolower                  (                  country.name                                    )                  )                  ]                  cnty_map_results                  <-                  left_join(county_map, cnty_results,                  by                  =                  c                  (                  "stateabr"                  =                  "state_abbr",                  "county"                  =                  "county_name"                  )                  )                  %>%                  arrange(                  order, group)                  cnty_plot                  <-                  ggplot(                  data                  =                  cnty_map_results, aes(x                  =                  long, y                  =                  lat, grouping                  =                  group,                  order                  =                  order                  )                  )                  +                  geom_polygon(aes(fill                  =                  per_point_diff), color                  =                  "gray90", size                  =                  0.1                  )                  plot                  (cnty_plot)                  cnty_plot                  <-                  cnty_plot                  +                  scale_fill_gradient2(midpoint                  =                  0, low                  =                  "#2062A1", mid                  =                  "white", high                  =                  "red", na.value                  =                  "white", breaks                  =                  seq                  (from                  =                  0, to                  =                  i, length.out                  =                  7                  ), guide                  =                  FALSE)                  plot                  (cnty_plot)                  cnty_plot                  <-                  cnty_plot                  +                  theme_void(                  )                  +                  geom_point(                  data                  =                  locationzips_df, aes(10                  =                  longitude, y                  =                  breadth), size                  =                  0.3, alpha                  =                  0.2, inherit.aes                  =                  FALSE)                

library(rvest) library(stringr) library(dplyr) library(ggplot2) library(scales) library(ggmap) library(readr) library(tidyr) library(zipcode) library(maps) library(jsonlite) library(grid) library(gridExtra) information(zipcode) ## load zipcode information in your workspace states <- read_html("https://www.Chick-fil-A.com/Locations/Browse") locations <- states %>% html_nodes("article ul li a") %>% html_attr("href") locations <- paste0("https://www.Chick-fil-A.com", locations) extract_location <- function(url){ read_html(url) %>% html_nodes(".location p") %>% html_text() %>% str_extract("\\d{five}(?=\n)") } locationzips <- sapply(locations, extract_location, Utilise.NAMES = FALSE) locationzips_df <- information.frame(zips = unlist(locationzips), stringsAsFactors = Faux) #locationzips_df <- read_csv("https://world wide web.dropbox.com/due south/x8xwdx61go07e4e/chickfilalocationzips.csv?raw=i", col_types = "c") locationzips_df$zips <- clean.zipcodes(locationzips_df$zips) locationzips_df <- merge(locationzips_df, select(zipcode, zip, latitude, longitude, state), past.x = 'zips', by.y = 'nothing', all.x = Truthful) est_cnt_by_state <- count(locationzips_df, state) head(rest_cnt_by_state) region_opinion_favor <- fromJSON("http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=2&year=2015&topic=lgbt")$regions region_opinion_oppose <- fromJSON("http://ava.prri.org/ajx_map.regionsdata?category=lgbt_ssm&sc=3&year=2015&topic=lgbt")$regions region_opinion_favor$opi <- 'favor' region_opinion_oppose$opi <- 'oppose' region_opinion <- bind_rows(region_opinion_favor, region_opinion_oppose) %>% filter(region != 'national') %>% mutate(region = recode(region, "1" = "Northeast", "2" = "Midwest", "three" = "South", "iv" = "West")) %>% spread(key = opi, value = percent) %>% mutate(other = 100 - favor - oppose) %>% gather(central = opi, value = percent, -region, -sort) %>% select(-sort) %>% mutate(opi = factor(opi, levels = c('oppose', 'other', 'favor'), ordered = TRUE)) head(region_opinion) regions_oppose_sorted <- accommodate(filter(region_opinion, opi == 'oppose'), desc(percent))$region region_opinion <- mutate(region_opinion, region = factor(region, levels = regions_oppose_sorted, ordered = Truthful)) opin_pie_charts <- ggplot(region_opinion, aes(10 = 1, y = pct, make full = opi)) + geom_bar(width = i,stat = "identity", size = 0.iii, color = "white", alpha = 0.85, bear witness.legend = Faux) plot(opin_pie_charts) opin_pie_charts <- opin_pie_charts + facet_grid(region ~ ., switch = "y") + coord_polar("y") + theme_void() plot(opin_pie_charts) opin_pie_charts <- opin_pie_charts + scale_fill_manual(values = c("favor" = "black", "oppose" = "#908E89", "other" = "#F7F3E8")) plot(opin_pie_charts) #add labels opin_pie_charts <- opin_pie_charts + geom_text(color = "white", size = rel(iii.5), aes(ten = 1, y = ifelse(opi == 'favor', 15, 85), characterization = ifelse(opi == 'other', NA, paste0(str_to_title(opi), "\northward", percent(pct/100))))) plot(opin_pie_charts) opin_pie_charts <- opin_pie_charts + ggtitle(label = "Survey of views of gay\nmarriage, by region") + theme(plot.title = element_text(size = 10, face = "assuming", hjust = i)) plot(opin_pie_charts) opin_pie_charts <- opin_pie_charts + theme(strip.background = element_rect(fill = "#E6D9B7"), strip.text = element_text(face = "assuming"), plot.background = element_rect(make full = "#F6F4E8", color = NA), panel.background = element_rect(make full = "#F6F4E8", color = NA)) plot(opin_pie_charts) state_centers_abb <- information.frame(country = tolower(state.name), stateabr = country.abb, center_long = country.center$10, center_lat = state.center$y, region = country.region) caput(state_centers_abb) state_centers_abb <- mutate(state_centers_abb, region = recode(region, "North Central" = "Midwest")) us_map <- map_data("country") us_map <- rename(us_map, state = region) us_map <- inner_join(us_map, state_centers_abb, by = c("state" = "land")) us_map <- us_map[lodge(us_map$club), ] base_plot <- ggplot(data = left_join(us_map, rest_cnt_by_state, past = c("stateabr" = "state")), aes(x = long, y = lat, group = group, society = order, fill = region)) + geom_polygon(color = "white") plot(base_plot) base_plot <- base_plot + scale_fill_manual(values = c("Due west" = "#E6D9B7", "S" = "#F2DBD5", "Northeast" = "#D8E7D7", "Midwest" = "#D6DBE3"), guide = FALSE) plot(base_plot) base_plot <- base_plot + theme(axis.title = element_blank(), axis.ticks = element_blank(), axis.line = element_blank(), centrality.text = element_blank(), panel.grid = element_blank(), panel.border = element_blank(), plot.background = element_rect(fill = "#F6F4E8"), panel.background = element_rect(fill = "#F6F4E8")) plot(base_plot) base_plot <- base_plot + geom_point(aes(size = north, x = center_long, y = center_lat), shape = 1, stroke = 0.5, color = "grey60", inherit.aes = FALSE) + scale_size_area(max_size = 18, breaks = c(0, ten, 100, 300), guide = FALSE) + scale_shape(solid = Simulated) plot(base_plot) base_plot <- base_plot + annotation_custom(grob = circleGrob(r = unit of measurement(1.4, "lines"), gp = gpar(fill = "gray87", blastoff = 0.5)), xmin = -79, xmax = -77, ymin = 48 , ymax = 49) + annotation_custom(grob = circleGrob(r = unit(0.7, "lines"), gp = gpar(fill = "gray85", alpha = 0.87)), xmin = -79, xmax = -77, ymin = 47.4, ymax = 48) + annotation_custom(grob = circleGrob(r = unit(0.3, "lines"), gp = gpar(make full = "gray80", blastoff = 0.95)), xmin = -79, xmax = -77, ymin = 46.5, ymax = 48) plot(base_plot) base_plot <- base_plot + ggtitle(label = "Number of Chick-fil-A resturants by state and region") + theme(plot.title = element_text(size = 10, face up = "bold", hjust = 0.fourteen, vjust = -5)) png("wsj-Chick-fil-A-data-visualization-r-plot.png", width = 800, superlative = 500, units = "px") grid.arrange(base_plot, opin_pie_charts, widths= c(4, 1), nrow = 1) dev.off() cnty_results <- read_csv("https://raw.githubusercontent.com/tonmcg/County_Level_Election_Results_12-xvi/principal/2016_US_County_Level_Presidential_Results.csv") cnty_results <- mutate(cnty_results, per_point_diff = per_gop - per_dem, county_name = tolower(str_replace(county_name, " County", "")), county_name = tolower(str_replace(county_name, " parish", ""))) county_map <- map_data("county") county_map <- rename(county_map, state = region, county = subregion) county_map$stateabr <- state.abb[match(county_map$state, tolower(state.name))] cnty_map_results <- left_join(county_map, cnty_results, past = c("stateabr" = "state_abbr", "county" = "county_name")) %>% arrange(order, group) cnty_plot <- ggplot(data = cnty_map_results, aes(x = long, y = lat, group = group, order = lodge)) + geom_polygon(aes(fill = per_point_diff), color = "gray90", size = 0.one) plot(cnty_plot) cnty_plot <- cnty_plot + scale_fill_gradient2(midpoint = 0, low = "#2062A1", mid = "white", high = "red", na.value = "white", breaks = seq(from = 0, to = 1, length.out = 7), guide = FALSE) plot(cnty_plot) cnty_plot <- cnty_plot + theme_void() + geom_point(data = locationzips_df, aes(x = longitude, y = latitude), size = 0.iii, alpha = 0.2, inherit.aes = FALSE)

About the Author

A co-author of Data Scientific discipline for Fundraising, an accolade winning keynote speaker, Ashutosh R. Nandeshwar is one of the few analytics professionals in the higher education manufacture who has adult analytical solutions for all stages of the student life cycle (from recruitment to giving). He enjoys speaking about the ability of information, besides as ranting about data professionals who hunt after "interesting" things. He earned his PhD/MS from West Virginia University and his BEng from Nagpur University, all in industrial engineering. Currently, he is leading the data science, reporting, and prospect development efforts at the Academy of Southern California.

mcmahonhapped.blogspot.com

Source: https://nandeshwar.info/data-visualization/wall-street-journal-data-visualization-r/