7 Leaflet

Main functions and concepts covered in this BP chapter:

TidyCensus
- get_decennial()
- get_acs()
- Searching for Variables
- Loading Multiple Variables
- Renaming Variables
Creating the Dataset
- Non-GIS dataset
- GIS dataset
- Merge the two datasets
- Popup Labels
Creating the leaflet map
- leaflet() arguments:
- Creating Custom Bins
- Continuous with 2 Colors
- Continuous with 3 Colors
- 4 Categories
- 4 Categories Opacity by Log
- Total Votes Map

Packages used in this chapter:

## Load all packages used in this chapter
library(tidyverse) #includes dplyr, ggplot2, and other common packages

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(leaflet)
library(rmapshaper)

Datasets used in this chapter:

## Load datasets used in this chapter
library(tidycensus)
options(tigris_use_cache = TRUE)
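## Run once per machine: install = TRUE saves the key to .Renviron so later sessions pick it up automatically
census_api_key("YOUR_CENSUS_API_KEY", install = TRUE)

ERROR:: A CENSUS_API_KEY already exists. You can overwrite it with the argument overwrite=TRUE

This error just means a key is already installed on this machine. Skip the census_api_key() call in later sessions, or add overwrite = TRUE to replace the stored key.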

For two of the projects, you will be required to make maps (optional for the RP). I will give you a separate GitHub repo to help you learn how to make interactive maps using leaflet. For this BP chapter, you will write your own short tutorial on how to make a leaflet map. You can copy/paste from what I give you and what we discuss as part of the course. You are also free to use content from the internet or the DataCamp course on leaflet (see below).

Some of you might find it helpful to check out DataCamp’s Leaflet course, but it is not assigned. They also have other courses on spatial data (e.g., Visualizing Geospatial Data in R, Analyzing US Census Data in R). If you do any (or parts of…you don’t need to do entire courses, other than the ones I assigned) of these DC courses, you are free to add notes to this chapter, but I won’t be looking for that when checking off this chapter.

So what should this chapter include? Suppose in a year you want to make a leaflet map, but you don’t remember anything about how to do so. You should be able to open up this chapter and follow along to create an interactive map using leaflet. Include enough details so you understand what’s going on, but not more than you think will be helpful. Since no two people find the same explanations/examples equally clear, I don’t expect multiple people in the class to have the same chapter. You can talk to each other, but write your own.

This chapter might be graded somewhat differently from the others.

7.1 TidyCensus

7.1.1 `get_decennial()`

get_decennial() grants access to the 2000, 2010, and 2020 decennial US Census APIs.

# displays the median age by state in 2020, with data drawn from the Demographic and Housing Characteristics summary file

get_decennial(geography = "state", 
                       variables = "P13_001N", 
                       year = 2020,
                       sumfile = "dhc")

## Getting data from the 2020 decennial Census

## Using the Demographic and Housing Characteristics File

## Note: 2020 decennial Census data use differential privacy, a technique that
## introduces errors into data to preserve respondent confidentiality.
## ℹ Small counts should be interpreted with caution.
## ℹ See https://www.census.gov/library/fact-sheets/2021/protecting-the-confidentiality-of-the-2020-census-redistricting-data.html for additional guidance.
## This message is displayed once per session.

## # A tibble: 52 × 4
##    GEOID NAME                 variable value
##    <chr> <chr>                <chr>    <dbl>
##  1 09    Connecticut          P13_001N  41.1
##  2 10    Delaware             P13_001N  41.1
##  3 11    District of Columbia P13_001N  33.9
##  4 12    Florida              P13_001N  43  
##  5 13    Georgia              P13_001N  37.5
##  6 15    Hawaii               P13_001N  40.8
##  7 16    Idaho                P13_001N  36.8
##  8 17    Illinois             P13_001N  38.8
##  9 18    Indiana              P13_001N  38.2
## 10 19    Iowa                 P13_001N  38.6
## # ℹ 42 more rows

7.1.2 get_acs()

get_acs() grants access to the 1-year and 5-year American Community Survey APIs. For this class, this is the one we will use.

Below are the arguments for get_acs():

Arguments:

- geography: The geography of your data.
- variables: Character string or vector of character strings of variable IDs. tidycensus automatically returns the estimate and the margin of error associated with the variable.
- table: The ACS table for which you would like to request all variables. Uses lookup tables to identify the variables; performs faster when variable table already exists through load_variables(cache = TRUE). Only one table may be requested per call.
- cache_table: Whether or not to cache table names for faster future access. Defaults to FALSE; if TRUE, only needs to be called once per dataset. If variables dataset is already cached via the load_variables function, this can be bypassed.
- year: The year, or endyear, of the ACS sample. 5-year ACS data is available from 2009 through 2021; 1-year ACS data is available from 2005 through 2021, with the exception of 2020. Defaults to 2021.
- output: One of “tidy” (the default), in which each row represents an enumeration unit-variable combination, or “wide”, in which each row represents an enumeration unit and the variables are in the columns.
- state: An optional vector of states for which you are requesting data. State names, postal codes, and FIPS codes are accepted. Defaults to NULL.
- county: The county for which you are requesting data. County names and FIPS codes are accepted. Must be combined with a value supplied to state. Defaults to NULL.
- zcta: The zip code tabulation area(s) for which you are requesting data. Specify a single value or a vector of values to get data for more than one ZCTA. Numeric or character ZCTA GEOIDs are accepted. When specifying ZCTAs, geography must be set to "zcta" and state must be specified with county left as NULL. Defaults to NULL.
- geometry: If FALSE (the default), return a regular tibble of ACS data. If TRUE, uses the tigris package to return an sf tibble with simple feature geometry in the geometry column.
- keep_geo_vars: If TRUE, keeps all the variables from the Census shapefile obtained by tigris. Defaults to FALSE.
- shift_geo: (deprecated) If TRUE, returns geometry with Alaska and Hawaii shifted for thematic mapping of the entire US. Geometry was originally obtained from the albersusa R package. As of May 2021, we recommend using tigris::shift_geometry() instead.
- summary_var: Character string of a “summary variable” from the ACS to be included in your output. Usually a variable (e.g. total population) that you’ll want to use as a denominator or comparison.
- key: Your Census API key. Obtain one at https://api.census.gov/data/key_signup.html
- moe_level: The confidence level of the returned margin of error. One of 90 (the default), 95, or 99.
- survey: The ACS contains one-year, three-year, and five-year surveys expressed as “acs1”, “acs3”, and “acs5”. The default selection is “acs5”.
- show_call: If TRUE, display call made to Census API. This can be very useful in debugging and determining if error messages returned are due to tidycensus or the Census API. Copy the API call into a browser and see what is returned by the API directly. Defaults to FALSE.

# Examples: 
tarr <- get_acs(geography = "tract", variables = "B19013_001",
                state = "TX", county = "Tarrant", geometry = TRUE, year = 2020)

## Getting data from the 2016-2020 5-year ACS

vt <- get_acs(geography = "county", variables = "B19013_001", state = "vt", year = 2019)

## Getting data from the 2015-2019 5-year ACS

vt %>%
  mutate(NAME = gsub(" County, Vermont", "", NAME)) %>%
  ggplot(aes(x = estimate, y = reorder(NAME, estimate))) +
  # linewidth replaces the size argument for lines, which was deprecated in ggplot2 3.4.0
  geom_errorbar(aes(xmin = estimate - moe, xmax = estimate + moe), width = 0.3, linewidth = 0.5) +
  geom_point(color = "red", size = 3) +
  labs(title = "Household income by county in Vermont",
       subtitle = "2015-2019 American Community Survey",
       y = "",
       x = "ACS estimate (bars represent margin of error)")

7.1.3 Searching for Variables

Getting variables from the Census or ACS requires knowing the variable ID - and there are thousands of these IDs across the different Census files. To rapidly search for variables, use the load_variables() function. The function takes two required arguments: the year of the Census or endyear of the ACS sample, and the dataset name, which varies in availability by year. For the decennial Census, possible dataset choices include “pl” for the redistricting files; “dhc” for the Demographic and Housing Characteristics file and “dp” for the Demographic Profile (2020 only), and “sf1” or “sf2” (2000 and 2010) and “sf3” or “sf4” (2000 only) for the various summary files. Special island area summary files are available with “as”, “mp”, “gu”, or “vi”.

For the ACS, use either “acs1” or “acs5” for the ACS detailed tables, and append /profile for the Data Profile and /subject for the Subject Tables. To browse these variables, assign the result of this function to a variable and use the View function in RStudio. An optional argument cache = TRUE will cache the dataset on your computer for future use.

v17 <- load_variables(year = 2017, dataset = "acs5", cache = TRUE)

#View(v17)
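
Once the variable table is loaded, it can be searched like any other data frame instead of scrolling through View(). The sketch below uses a small hand-typed stand-in for v17 (three rows with the name, label, and concept columns that load_variables() returns; the labels are abbreviated and the object name vars is made up) so that it runs without a Census API key. With the real table, replace vars with v17.

```r
# Hand-typed stand-in for the table returned by load_variables()
# (real tables have thousands of rows; labels here are abbreviated)
vars <- data.frame(
  name    = c("B01001_001", "B01002_001", "B19013_001"),
  label   = c("Estimate!!Total",
              "Estimate!!Median age!!Total",
              "Estimate!!Median household income in the past 12 months"),
  concept = c("SEX BY AGE",
              "MEDIAN AGE BY SEX",
              "MEDIAN HOUSEHOLD INCOME IN THE PAST 12 MONTHS"),
  stringsAsFactors = FALSE
)

# Keep rows whose concept mentions "income", ignoring case
income_vars <- vars[grepl("income", vars$concept, ignore.case = TRUE), ]
income_vars$name
```

Searching the label column the same way is often useful too, since the concept describes the whole table while the label describes the individual variable.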

7.1.4 Loading Multiple Variables

To get multiple variables, pass a vector of variable IDs to the variables argument.

wi <- get_acs(geography = "county", variables = c("B07404B_001", "B07404B_002", "B07404B_003", "B07404B_004", "B07404B_005"), state = "wi")

## Getting data from the 2017-2021 5-year ACS

7.1.5 Renaming Variables

To rename variables, pass a named vector to the variables argument: c(new_name = "VariableID").

wi_renamed <- get_acs(geography = "county", variables = c(number_of_Africian_Americians_in_County = "B07404B_001"), state = "wi", output = "wide")

## Getting data from the 2017-2021 5-year ACS

# output = "wide" returns one row per county, with the estimate (E) and margin of error (M) in separate columns, which is easier to work with

head(wi_renamed)

## # A tibble: 6 × 4
##   GEOID NAME                       number_of_Africian_A…¹ number_of_Africian_A…²
##   <chr> <chr>                                       <dbl>                  <dbl>
## 1 55001 Adams County, Wisconsin                       601                    169
## 2 55003 Ashland County, Wisconsin                      NA                     NA
## 3 55005 Barron County, Wisconsin                      910                    208
## 4 55007 Bayfield County, Wisconsin                     NA                     NA
## 5 55009 Brown County, Wisconsin                      7066                    666
## 6 55011 Buffalo County, Wisconsin                      NA                     NA
## # ℹ abbreviated names: ¹number_of_Africian_Americians_in_CountyE,
## #   ²number_of_Africian_Americians_in_CountyM
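
To see what output = "wide" changes, here is a rough sketch that rebuilds the wide shape by hand from the default tidy shape. It assumes the tidy output has variable, estimate, and moe columns (as tidycensus returns); the two rows reuse the numbers shown in the wi_renamed output above, and the short variable name pop is made up for the example.

```r
library(dplyr)
library(tidyr)

# Two rows in the default "tidy" shape; the numbers are copied from the
# wi_renamed output above (Adams and Barron counties)
tidy <- tibble(
  GEOID    = c("55001", "55005"),
  NAME     = c("Adams County, Wisconsin", "Barron County, Wisconsin"),
  variable = "pop",            # hypothetical short name
  estimate = c(601, 910),
  moe      = c(169, 208)
)

# Reshape by hand to mimic output = "wide": one row per county, with
# E (estimate) and M (margin of error) appended to the variable name
wide <- tidy %>%
  rename(E = estimate, M = moe) %>%
  pivot_wider(names_from = variable,
              values_from = c(E, M),
              names_glue = "{variable}{.value}")

names(wide)
```

In practice there is no need to reshape by hand; the point is only that the E and M suffixes in the wide output correspond to the estimate and moe columns of the tidy output.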

7.2 Creating the Dataset

There are two main datasets you need to create:

Non-GIS dataset: This is the dataset you use for regression models, and for the summary statistics and graphs you often display before displaying regression results. This dataset does not contain the GIS information needed to create maps (for example, it does not include the thousands of points that trace out the border of each county). If you aren’t creating maps, this is the only dataset you need. Create this dataset first. Include the unique identifier for each unit of observation (e.g., 5-character FIPS codes that identify each US county), but do NOT include the GIS data used to create the map itself.
GIS dataset: This is only needed for creating maps. It includes a unique identifier that identifies each geographic unit (e.g., 5-character FIPS codes that identify each US county). It also includes GIS data, such as thousands of points that trace out the borders of each county (these take up a lot of space, and are why you don’t want them included with the non-GIS dataset you’ll use for things like regressions…and sometimes you won’t have as many observations in the GIS dataset, such as if you leave Hawaii out of the map but have data on it for regressions). It does not include all your variables, just those needed to create the map. You’ll need to merge the non-GIS dataset with this GIS dataset to create maps of the variables of interest from the non-GIS dataset.

7.2.1 Non-GIS dataset

First, we’re going to create the dataset that does not include the GIS information used to create maps. For this example, we’re going to download 2020 county-level election results from a GitHub repo. This dataset includes all the data necessary to create graphs, run regressions, etc. As with a Census dataset, the crucial part is that you keep the 5-character FIPS code that uniquely identifies each county. This is the key that will allow you to merge this data with the GIS dataset used to create the maps.

## 2020 Election data
dta2020 <- read_csv("https://raw.githubusercontent.com/tonmcg/US_County_Level_Election_Results_08-20/master/2020_US_County_Level_Presidential_Results.csv")

## Calculate percentages based on total votes for Trump and Biden (GOP and Dem) only
##   In some years there have been ties, so we're allowing for that
##   stdVotes and stdVotesLog will be used to scale color opacity from 0 to 1 based on total votes

dta2020 <- dta2020 %>% 
            mutate(pctGOP = votes_gop/(votes_gop + votes_dem) * 100, #percent for GOP
                   totalVotes = votes_gop + votes_dem, #total votes for the two major-party candidates
                   winner = ifelse(votes_gop > votes_dem,"Trump",
                                   ifelse(votes_gop < votes_dem,"Biden", 
                                          "Tie")), #variable to tell the winner
                   pctWinner = ifelse(votes_gop > votes_dem,pctGOP,100-pctGOP),  #vote share of the winner
                   FontColorWinner = ifelse(votes_gop > votes_dem,"red", 
                                      ifelse(votes_gop < votes_dem,"blue",
                                             "purple")), #if gop wins, display red, dem, blue, tie = purple
                   pctGOPcategories = ifelse(pctGOP<48,"0-48%", 
                                             ifelse(pctGOP<50,"48-50%", 
                                                    ifelse(pctGOP<52, "50-52%",
                                                           "52-100%"))),
                   stdVotes = (totalVotes-min(totalVotes))/(max(totalVotes)-min(totalVotes)),
                   stdVotesLog = (log(totalVotes)-min(log(totalVotes)))/(max(log(totalVotes))-min(log(totalVotes)))
                   )



dta2020 <- dta2020 %>% 
            dplyr::select(FIPS =  county_fips, pctGOP, totalVotes, winner, pctWinner, pctGOPcategories, FontColorWinner, stdVotes, stdVotesLog)

7.2.2 GIS dataset

This is the data set that we will need to make the map itself.

Note that tidycensus/the Census API makes you download at least one variable of data, so we are downloading the total population of each county. If we wanted to use Census data and download more than one variable, that would be part of the non-GIS dataset created first (i.e., the section above; in this example, our only non-GIS data is the election data). That dataset should NOT include the GIS data (i.e., use tidycensus options geometry = FALSE, keep_geo_vars = FALSE).

## turn off scientific notation
options(scipen = 999)


## Download GIS data for maps
##   geometry = TRUE --> include GIS shapefile data to create maps
##   B01001_001: total population (we have to download at least one variable, so we're using this one)
##   NOTE: If you want to download multiple variables for things like regressions, 
##     that's a non-GIS dataset and should use options: geometry = FALSE, keep_geo_vars = FALSE
##     then merge that non-GIS dataset with a GIS dataset just like this one

# County data (total population, plus the geometry used to draw the map)
countyGIS <- get_acs(geography = "county",
              variables = "B01001_001",
              geometry = TRUE,
              keep_geo_vars = TRUE)

# State data (for displaying state borders on map)
stateGIS <- get_acs(geography = "state",
              variables = "B01001_001",
              geometry = TRUE,
              keep_geo_vars = FALSE)


## Simplify GIS data to make file sizes smaller. keep = 0.01 retains about 1% of the points, which removes some detail along coastlines and irregular borders.
stateGIS <- ms_simplify(stateGIS, keep = 0.01)
countyGIS <- ms_simplify(countyGIS, keep = 0.01)


countyGIS <- countyGIS %>% 
                dplyr::select(FIPS = GEOID, 
                       stFIPS = STATEFP, 
                       coFIPS = COUNTYFP, 
                       coNAME = NAME.x, 
                       pop = estimate, 
                       geometry)


## For mapping, let's drop the following: 
##   Puerto Rico (ST FIPS 72) (no election data)
##   Alaska (ST FIPS 02) (voting data isn't reported by county...we could also map the legislative districts, but we're not going to since we'd rather have smaller maps without those extra details)
##   Hawaii (ST FIPS 15) (so our map can zoom in on continental 48 states)
countyGIS <- countyGIS %>% filter(stFIPS != "72" & stFIPS != "02" & stFIPS != "15")
stateGIS <- stateGIS %>% filter(GEOID != "72" & GEOID != "02" & GEOID != "15")


## join 2-character state abbreviation and create name = "county, St" for labeling maps (e.g., Outagamie, WI) 
fipsToSTcode <- fips_codes %>% dplyr::select(stFIPS = state_code, stNAME = state) %>% unique()

countyGIS <- inner_join(countyGIS,fipsToSTcode,by="stFIPS")

countyGIS <- countyGIS %>% mutate(name = paste0(coNAME,", ", stNAME))



## NOTE: If you don't use keep_geo_vars = TRUE, you don't get separate STATEFP and COUNTYFP, but you can use mutate() and create stFIPS = substr(GEOID,1,2) and coFIPS = substr(GEOID,3,5)
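
The substr() approach mentioned in the note can be sketched as follows. The two GEOID values are the Wisconsin county codes shown earlier in this chapter. The codes must be stored as character strings, since storing them as numbers would drop leading zeros.

```r
# Example 5-character county FIPS codes (Adams and Brown counties, Wisconsin)
GEOID <- c("55001", "55009")

stFIPS <- substr(GEOID, 1, 2)  # first two characters: state
coFIPS <- substr(GEOID, 3, 5)  # last three characters: county

data.frame(GEOID, stFIPS, coFIPS)
```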

Note to self: another package masked select() from dplyr, so I needed to call it explicitly as dplyr::select().

7.2.3 Merge the two datasets

## merge GIS data with voting data using FIPS code
countyGIS <- left_join(countyGIS,dta2020,by="FIPS")

countyGIS is what we’ll use to make the maps, along with stateGIS to draw state borders.
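
After a merge like this, it can be worth checking that every county found a match. The sketch below uses two tiny made-up tables (gis and votes are stand-ins for countyGIS and dta2020, and the pctGOP values are invented) to show that left_join() keeps every GIS row and that anti_join() lists the rows with no match.

```r
library(dplyr)

# Toy stand-ins for the GIS and non-GIS datasets (values are made up)
gis   <- tibble(FIPS = c("55001", "55009", "55087"))
votes <- tibble(FIPS = c("55001", "55009"), pctGOP = c(60, 52))

# left_join() keeps every row of the GIS data, matched or not;
# unmatched rows get NA in the columns coming from the vote data
merged <- left_join(gis, votes, by = "FIPS")
nrow(merged) == nrow(gis)

# anti_join() lists the GIS rows that found no match in the vote data
unmatched <- anti_join(gis, votes, by = "FIPS")
unmatched$FIPS
```

Running the same anti_join() on the real datasets shows which counties would appear on the map without election data.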

7.2.4 Popup labels

We also need to make the text (with HTML for formatting) to use in popup labels for each county. First we create the text that will make up the labels, along with HTML formatting (e.g., <b> to make the county name bold, font color to make it red when Trump wins and blue when Biden wins). Then we pipe that to the HTML function from the htmltools package that turns our text into HTML code that leaflet can use to make popups in the map.

Note that you do this with the GIS dataset. This is crucial! If you use different datasets to make the map and create the popup labels, the popup labels might not match with the correct counties.

popupLabels <- paste0("<b>",countyGIS$name," (",countyGIS$FIPS,")</b>",
                    "<br><font color='",countyGIS$FontColorWinner,"'>",countyGIS$winner, 
                    ": ",
                    format(countyGIS$pctWinner,digits=4, trim=TRUE),
                    "%</font>",
                    "<br>Total votes: ", format(countyGIS$totalVotes,big.mark=",", trim=TRUE)
                    ) %>% 
              lapply(htmltools::HTML)

It is important to use the same dataframe for each part. A common mistake people make is to do something like countyGIS$name for the name and countyGIS$totalVotes for total votes, but dta2020$pctWinner for the other value. Mixing dataframes like this is very dangerous. Sometimes it works correctly, but other times you end up with data from the wrong county. By using the same dataframe (i.e., countyGIS$pctWinner instead of dta2020$pctWinner), you guarantee that the same county will be used for everything.

Use one dataframe for everything to avoid problems!
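
A small sketch of why this works: paste0() is vectorized, so row i of every input ends up in label i. The data frame toy below is a made-up stand-in for countyGIS (fictional counties and numbers), and the label is a shortened version of the one above without the htmltools step.

```r
# Toy data frame standing in for countyGIS (all values are made up)
toy <- data.frame(
  name       = c("County A, ST", "County B, ST"),
  winner     = c("Trump", "Biden"),
  pctWinner  = c(54.321, 51.5),
  totalVotes = c(108000, 141000)
)

# Every piece comes from the same data frame, so the rows stay aligned
labels <- paste0("<b>", toy$name, "</b><br>",
                 toy$winner, ": ",
                 format(toy$pctWinner, digits = 4, trim = TRUE), "%",
                 "<br>Total votes: ",
                 format(toy$totalVotes, big.mark = ",", trim = TRUE))
labels[1]
```

If one of the columns came from a different data frame sorted in a different order, paste0() would still run without complaint, which is what makes the mistake hard to spot.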

7.3 Creating the leaflet map

## create color palette used by map
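## colorBin() has no n argument (n = 9 would be partially matched to na.color); bins sets the number of bins
pal <- colorBin("RdBu", countyGIS$pctGOP, bins = 9, reverse=TRUE)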

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "grey", opacity = 0.7,
    fillColor = ~pal(pctGOP), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = pal,values = ~countyGIS$pctGOP, opacity = 0.7, title = "% Trump",position = "bottomright")

7.3.1 `leaflet()` arguments:

countyGIS is the dataset being used.

options = leafletOptions() sets options for map creation. Here, crsClass names the coordinate reference system (the projection used to flatten the globe onto the screen). L.CRS.EPSG3857 (Web Mercator) is the leaflet default, which would explain why I couldn’t see any difference when I tried EPSG:3857, another way of writing the same projection that I found online.

addPolygons(): The first addPolygons() draws the counties. weight, color, and opacity style the county border lines; fillColor and fillOpacity set the fill, with the color chosen by passing pctGOP through the palette; smoothFactor controls how much the shapes are simplified when drawn; label and labelOptions attach the popup labels. The second addPolygons() adds the state outlines (fill = FALSE draws the borders only).

addLegend(): adds the legend. pal and values determine the colors and range it shows, title sets the heading, and position sets where it appears.

colorBin(): creates the color scale used on the map. The first argument sets the palette, and the second gives the data the colors are mapped to. The number of bins is set with the bins argument, which takes either a single number (the default is 7, and leaflet may adjust it to get round break points) or a vector of exact cut points. Note that colorBin() has no argument named n. reverse = TRUE flips the palette so that low GOP shares are blue and high GOP shares are red.
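
It helps to remember that the object colorBin() returns is itself a function. The sketch below builds a demonstration palette over a made-up 0-100 domain (pal_demo and the three test values are not part of the map code) and calls it directly to see the colors it assigns.

```r
library(leaflet)

# Five evenly spaced bins over 0-100, same palette and reverse = TRUE as the map
pal_demo <- colorBin("RdBu", domain = c(0, 100), bins = seq(0, 100, by = 20),
                     reverse = TRUE)

# The palette is a function: pass values in, get one hex color per value back
cols <- pal_demo(c(10, 50, 90))
cols
```

This is what ~pal(pctGOP) does inside addPolygons(): each county's pctGOP goes in, and a fill color comes out.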

7.3.2 Creating Custom Bins

If we want the colors on the map to change at uneven intervals, we need custom bins. In the example above, the color changed every 10 percentage points of Trump’s vote share (bins of 0-10, 10-20, etc.). To make the bins vary in width, pass a vector of cut points to the bins argument of colorBin(), e.g., bins = c(0, 20, 30). The bins start and end at the first and last numbers you give, so if you leave out 0 or 100, counties outside that range will not be colored by the scale.

## You can also use custom bins
pal <- colorBin("RdBu", countyGIS$pctGOP,bins = c(0,20,30,40,45,49,51,55,60,70,80, 100), reverse=TRUE)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~pal(pctGOP), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = pal,values = ~countyGIS$pctGOP, opacity = 0.7, title = "% Trump",position = "bottomright")
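
To see what happens to values that fall outside the bins, here is a small sketch with deliberately incomplete bins. The palette pal_gap and the test values are made up for the demonstration; as far as I can tell from the leaflet documentation, out-of-range values are treated as NA and get the palette's na.color (gray by default), with a warning.

```r
library(leaflet)

# Bins that start at 20 and stop at 80, so 0-20 and 80-100 are not covered
pal_gap <- colorBin("RdBu", domain = c(0, 100), bins = c(20, 40, 60, 80))

# 50 falls inside a bin; 10 and 95 do not (leaflet also emits a warning,
# silenced here so the example prints cleanly)
cols <- suppressWarnings(pal_gap(c(10, 50, 95)))
cols
```

This is why the custom bins above run all the way from 0 to 100.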

7.3.3 Continuous with 2 Colors

In the prior examples, the maps use 2 distinct colors, red and blue. The shades get lighter as the vote share gets closer to 50/50, but there are only 2 colors being used. If we instead want the colors to blend together as the vote share gets closer to 50/50, we can do that too.

Instead of using colorBin(), which assigns a color to each bin, we can use colorNumeric() to create a continuous scale from red, through purple, to blue.

pal <- colorNumeric(
  palette = colorRampPalette(c('red', 'blue'))(length(countyGIS$pctGOP)), 
  domain = countyGIS$pctGOP, reverse=TRUE)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~pal(pctGOP), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = pal,values = ~countyGIS$pctGOP, opacity = 0.7, title = "% Trump",position = "bottomright")

7.3.3.1 `colorNumeric()`

colorNumeric() requires a palette and a domain.

The palette argument sets the colors used in the scale and how R should blend them. In the example above, we use red and blue, and the color changes with the % of the vote for the GOP. colorRampPalette(c('red', 'blue')) builds the blend between the two colors, and (length(countyGIS$pctGOP)) sets how many intermediate shades it generates (one per county here).

The domain argument gives the range of values the colors are mapped onto. The code above tells R to use the % of the vote for Trump, and reverse = TRUE orders the colors so that the most Trump-leaning counties are the most red and the most Biden-leaning counties are the most blue.
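
The effect of reverse = TRUE can be checked on a toy palette. In this sketch, the 0-100 domain, the choice of 10 intermediate colors, and the names pal_fwd and pal_rev are made up for the demonstration.

```r
library(leaflet)

# Ten intermediate colors between red and blue
ramp <- colorRampPalette(c("red", "blue"))(10)

pal_fwd <- colorNumeric(palette = ramp, domain = c(0, 100))
pal_rev <- colorNumeric(palette = ramp, domain = c(0, 100), reverse = TRUE)

# Without reverse, the low end of the domain gets the first color (red);
# with reverse = TRUE the ends swap, so 100 (all GOP) gets red instead
c(forward = pal_fwd(100), reversed = pal_rev(100))
```

An alternative to reverse = TRUE would be to list the colors in the other order, c('blue', 'red'), which should give the same map.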

7.3.4 Continuous with 3 Colors

In the map we created above, we used a continuous scale to display the information using only two colors, red and blue, and the colors mixed to create purple when the vote was 50/50. If we instead want the scale to pass through a third color rather than a mix of the two, we can add a third color to colorRampPalette(), placed between the two colors that determine the extremes. So instead of using colorRampPalette(c('red', 'blue')), we would use colorRampPalette(c('red', 'white', 'blue')). Note that any color could be used, and if we used purple instead of white, the map would look nearly the same as the map above. Also note that nothing else changes from the map created above.

pal <- colorNumeric(
  palette = colorRampPalette(c('red', 'white', 'blue'))(length(countyGIS$pctGOP)), 
  domain = countyGIS$pctGOP, reverse=TRUE)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~pal(pctGOP), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = pal,values = ~countyGIS$pctGOP, opacity = 0.7, title = "% Trump",position = "bottomright")
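
To see what the three-color ramp produces on its own, colorRampPalette() can be called directly. This sketch asks for only 5 colors (an arbitrary number chosen to keep the output short) rather than one per county.

```r
# colorRampPalette() returns a function; calling it with n gives n colors
three_color <- colorRampPalette(c("red", "white", "blue"))

shades <- three_color(5)
shades  # starts at red, passes through white in the middle, ends at blue
```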

7.3.5 4 Categories

This map uses 4 custom categories, each with its own color (blue, cyan, magenta, and red). To do this, we use colorFactor(c("color1", "color2", ...), dataset$variable).

The resulting map looks like this:

factorPal <- colorFactor(c("blue", "cyan", "magenta", "red"), countyGIS$pctGOPcategories)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~factorPal(pctGOPcategories), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = factorPal,values = ~countyGIS$pctGOPcategories, opacity = 0.7, title = "% Trump",position = "bottomright")
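
As a quick check of which color goes with which category, the factor palette can be called on the category labels directly. This sketch assumes, as I understand colorFactor(), that colors are matched to the categories in sorted order; pal_cat is a made-up name, and the labels are the four created for pctGOPcategories.

```r
library(leaflet)

# The four category labels created in the non-GIS dataset
cats <- c("0-48%", "48-50%", "50-52%", "52-100%")

pal_cat <- colorFactor(c("blue", "cyan", "magenta", "red"), domain = cats)

# One color per category, assigned in the sorted order of the labels
cols <- pal_cat(cats)
setNames(cols, cats)
```

Because the order depends on how the labels sort as text, it is worth running a check like this whenever the category labels change.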

The challenge with maps like these is that we can’t see how many votes each county has. If you just looked at how much of each color was on the map, you would guess Trump won the 2020 election, but he didn’t. The blue counties have more people than the red ones, so Biden actually won. Los Angeles County is the largest county, with over 4 million total votes, of which Biden received 72%.

7.3.6 4 Categories Opacity by Log

To fix the issues with the map above, we can adjust the opacity based on the total number of votes in each county. Before, in the first addPolygons() call, we set fillOpacity = 1. In the map below, it is set equal to countyGIS$stdVotesLog. This adjusts the transparency of each county based on its number of votes: counties with more votes are less transparent, and counties with fewer votes are more transparent.

## Now use 4 custom-labeled factors, with opacity set based on total votes (re-scaled after log)
factorPal <- colorFactor(c("blue", "cyan", "magenta", "red"), countyGIS$pctGOPcategories)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~factorPal(pctGOPcategories), fillOpacity = countyGIS$stdVotesLog, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = factorPal,values = ~countyGIS$pctGOPcategories, opacity = 0.7, title = "% Trump",position = "bottomright")

We use countyGIS$stdVotesLog instead of countyGIS$stdVotes because, without the log, a handful of very large counties dominate the scale and only a few counties would be visible. That would look like this:

## Now use 4 custom-labeled factors, with opacity set based on total votes (re-scaled without taking log)
factorPal <- colorFactor(c("blue", "cyan", "magenta", "red"), countyGIS$pctGOPcategories)

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~factorPal(pctGOPcategories), fillOpacity = countyGIS$stdVotes, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = factorPal,values = ~countyGIS$pctGOPcategories, opacity = 0.7, title = "% Trump",position = "bottomright")
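
The difference between the two opacity variables is easier to see with a few numbers. This sketch applies the same min-max formula used for stdVotes and stdVotesLog to four hypothetical vote totals (the helper name rescale01 and the totals are made up).

```r
# Min-max rescaling to the 0-1 range, as used for stdVotes and stdVotesLog
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Hypothetical vote totals: three smaller counties and one very large one
votes <- c(1000, 10000, 100000, 4000000)

raw    <- rescale01(votes)       # raw totals
logged <- rescale01(log(votes))  # logged totals

round(raw, 3)     # all but the largest county sit very close to 0
round(logged, 3)  # values are spread much more evenly between 0 and 1
```

With the raw totals, a county with 100,000 votes gets almost no opacity next to one with 4 million votes, which is why most of the map disappears without the log.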

7.3.7 Total Votes Map

The last map shows how many votes were cast in each county. To create this map, we use totalVotes in place of pctGOP, which changes the data that is shown. We also need to create a new color scale with colorBin(), using the same method as before. Notice that in this example the bins are set so that the lower bound is the lowest number of total votes and the upper bound is the highest number of total votes.

## Custom bins to show total votes
pal <- colorBin(c("white","yellow","cyan","green","purple","magenta"), countyGIS$totalVotes,bins = c(min(countyGIS$totalVotes),
                                                       25000,
                                                       50000,
                                                       100000,
                                                       500000,
                                                       1000000,
                                                       max(countyGIS$totalVotes)))

leaflet(countyGIS, options = leafletOptions(crsClass = "L.CRS.EPSG3857"), width="100%") %>%
  addPolygons(weight = 0.5, color = "gray", opacity = 0.7,
    fillColor = ~pal(totalVotes), fillOpacity = 1, smoothFactor = 0.5,
    label = popupLabels,
    labelOptions = labelOptions(direction = "auto")) %>%
    addPolygons(data = stateGIS,fill = FALSE,color="black",weight = 1) %>%
    addLegend(pal = pal,values = ~countyGIS$totalVotes, opacity = 0.7, title = "Total Votes",position = "bottomright")
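
Before settling on break points, it can help to count how many counties land in each bin. The sketch below does this with cut() on seven hypothetical vote totals. The right = FALSE and include.lowest = TRUE settings are meant to mirror how I understand colorBin() assigns values to bins by default, so treat that detail as an assumption.

```r
# Hypothetical county vote totals
votes <- c(800, 12000, 30000, 75000, 250000, 900000, 4200000)

# Same break points as the map; min() and max() close the outer bins
breaks <- c(min(votes), 25000, 50000, 100000, 500000, 1000000, max(votes))

# right = FALSE makes each bin include its lower bound;
# include.lowest = TRUE keeps the largest county in the last bin
bins <- cut(votes, breaks = breaks, right = FALSE, include.lowest = TRUE,
            dig.lab = 10)
table(bins)
```

If most counties pile up in one or two bins, the map will be mostly one color, and the break points are worth adjusting.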