DAY 2: Data visualization (Python)

The COVID-19 cases data we have are inherently temporal and spatial. Let’s explore the space and time dimensions of the case data through visualization.

Non-spatial graphs

Let’s load the daily cases data we cleaned yesterday:

We can easily create a wide range of non-spatial graphs using the seaborn module. We can start with a very simple line graph of COVID-19 case rates over time:

This gives us an overall sense that case rates have increased over time and were particularly high in the fall of 2020. But because the lines for individual states are not discernible, we can’t tell whether some states follow a different trajectory than others. A better solution is to use faceting to produce a mini-plot for each state.

Let’s create a new line graph of COVID-19 case rates over time, this time with a separate mini-plot for each state:

We can try the same strategy for cumulative COVID-19 case rates over time. First, in a graph that jumbles together all the states:

Again, we get a sense of the overall trend here, but we can get a much better picture of state-level differences by faceting. So, let’s create a new line graph of cumulative COVID-19 case rates over time, this time with a separate mini-plot for each state:

Static Maps

A great way to visualize spatial relationships in data is to superimpose variables onto a map. For some datasets, this could involve superimposing points or lines. For our state-level data, this will involve coloring state polygons in proportion to a variable of interest that represents an aggregate summary of a geographic characteristic within each state. Such a graph is often referred to as a choropleth map. To create a choropleth map we first need to acquire shapefiles that contain spatial data about U.S. state-level geographies.

We can use the Census Bureau’s TIGER/Line shapefiles for census geographies. We want state-level geographies, which we can download by putting the following URL into our browser: https://www2.census.gov/geo/tiger/TIGER2019/STATE/tl_2019_us_state.zip. We then need to unzip the downloaded archive to gain access to the shapefile named tl_2019_us_state.shp.

Alternatively, we can do this programmatically using the UNIX bash shell, either using wget or, in this example, curl:
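For readers who prefer to stay in Python, the same download-and-unzip step can be sketched with only the standard library instead of curl. The function name download_and_unzip and the target directory shapefiles are hypothetical; the URL is the one given above.

```python
import urllib.request
import zipfile
from pathlib import Path

TIGER_URL = ("https://www2.census.gov/geo/tiger/TIGER2019/STATE/"
             "tl_2019_us_state.zip")

def download_and_unzip(url, dest_dir):
    """Download a zip archive into dest_dir, extract it, and list the files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / url.rsplit("/", 1)[-1]    # keep the archive's own name
    urllib.request.urlretrieve(url, zip_path)   # fetch the archive
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)                # unpack .shp, .dbf, .shx, ...
    return sorted(p.name for p in dest.iterdir())

# Example call (needs network access, so it is left commented out):
# download_and_unzip(TIGER_URL, "shapefiles")
```

A shapefile is really a set of sibling files (.shp, .shx, .dbf, .prj and others), so all extracted files should stay together in the same directory.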

Let’s read the shapefile and clean it:
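The exact cleaning steps are not shown here, so the following is only a guess at what light cleaning might look like: renaming the TIGER NAME column to state so it can serve as a merge key, and converting the REGION and DIVISION codes to integers. The function name clean_states and the tiny in-memory data are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

def clean_states(states):
    """Illustrative cleaning of a TIGER state GeoDataFrame (assumed steps)."""
    cleaned = states.rename(columns={"NAME": "state"})  # assumed merge key
    for col in ("REGION", "DIVISION"):
        cleaned[col] = cleaned[col].astype(int)         # text codes -> integers
    return cleaned

# In the tutorial's setting the input would come from the unzipped file:
#   states = gpd.read_file("tl_2019_us_state.shp")
# A tiny made-up stand-in keeps this sketch runnable without the download.
demo = gpd.GeoDataFrame(
    {"NAME": ["Ohio", "Florida"], "REGION": ["2", "3"], "DIVISION": ["3", "5"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4269",
)
print(clean_states(demo))
```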

##    REGION  ...                                           geometry
## 0       3  ...  POLYGON ((-81.74725 39.09538, -81.74635 39.096...
## 1       3  ...  MULTIPOLYGON (((-86.38865 30.99418, -86.38385 ...
## 2       2  ...  POLYGON ((-91.18529 40.63780, -91.17510 40.643...
## 3       2  ...  POLYGON ((-96.78438 46.63050, -96.78434 46.630...
## 4       3  ...  POLYGON ((-77.45881 39.22027, -77.45866 39.220...
## 5       1  ...  MULTIPOLYGON (((-71.78970 41.72520, -71.78971 ...
## 6       4  ...  POLYGON ((-116.89971 44.84061, -116.89967 44.8...
## 7       1  ...  POLYGON ((-72.32990 43.60021, -72.32984 43.600...
## 8       3  ...  POLYGON ((-82.41674 36.07283, -82.41660 36.073...
## 9       1  ...  POLYGON ((-73.31328 44.26413, -73.31274 44.265...
## 10      1  ...  POLYGON ((-73.51808 41.66672, -73.51807 41.666...
## 11      3  ...  POLYGON ((-75.76007 39.29682, -75.76010 39.297...
## 12      4  ...  POLYGON ((-106.00632 36.99527, -106.00531 36.9...
## 13      4  ...  MULTIPOLYGON (((-124.13656 41.46445, -124.1378...
## 14      1  ...  POLYGON ((-75.18960 40.59178, -75.18977 40.592...
## 15      2  ...  POLYGON ((-92.88707 45.64415, -92.88671 45.644...
## 16      4  ...  POLYGON ((-124.06545 45.78305, -124.06206 45.7...
## 17      2  ...  POLYGON ((-104.05264 42.00172, -104.05263 42.0...
## 18      1  ...  POLYGON ((-80.51935 41.84956, -80.51938 41.850...
## 19      4  ...  POLYGON ((-123.24792 48.28456, -123.24751 48.2...
## 
## [20 rows x 15 columns]

Now let’s read in the weekly cases data we cleaned yesterday:

We can now merge the spatial data with our weekly COVID-19 cases data, keeping only the contiguous 48 states (plus D.C.):
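A sketch of the merge-and-filter logic, using plain pandas data frames with made-up values so that it runs on its own. The key column state, the week_of_year column, and the exclusion list are assumptions; with a geopandas GeoDataFrame on the left-hand side, the same merge call keeps the geometry column.

```python
import pandas as pd

# Hypothetical stand-ins for the cleaned spatial data and the weekly cases data
states = pd.DataFrame({
    "state": ["Ohio", "Iowa", "Alaska", "Hawaii", "Puerto Rico"],
    "REGION": [2, 2, 4, 4, 9],
})
weekly = pd.DataFrame({
    "state": ["Ohio", "Ohio", "Iowa", "Iowa", "Alaska", "Hawaii"],
    "week_of_year": [44, 45, 44, 45, 45, 45],
    "cases_rate_100K": [250.1, 288.5, 610.7, 807.2, 390.0, 50.3],
})

# Everything outside the contiguous 48 states plus D.C.
NOT_CONTIGUOUS = {
    "Alaska", "Hawaii", "Puerto Rico", "Guam", "American Samoa",
    "United States Virgin Islands",
    "Commonwealth of the Northern Mariana Islands",
}

# Inner join keeps only states present in both tables
merged = states.merge(weekly, on="state", how="inner")
contiguous = merged[~merged["state"].isin(NOT_CONTIGUOUS)].reset_index(drop=True)
print(contiguous)
```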

##     REGION DIVISION  ... cases_count_pos cases_rate_100K
## 44       3        5  ...          6519.0      351.809018
## 89       3        5  ...         54246.0      288.522449
## 134      2        3  ...         64085.0      499.468771
## 179      2        4  ...         42815.0      807.232380
## 224      3        5  ...         14245.0      246.728530
## 269      1        1  ...          5953.0      565.569698
## 314      4        8  ...          8823.0      562.841370
## 359      1        1  ...          2882.0      218.918775
## 404      3        5  ...         25141.0      263.657331
## 449      1        1  ...           471.0       75.270759
## 494      1        1  ...         11112.0      310.903705
## 539      3        5  ...          3443.0      383.435754
## 584      4        8  ...         13521.0      656.620915
## 629      4        9  ...        100844.0      270.693400
## 674      1        2  ...         28124.0      319.885567
## 719      2        3  ...         32816.0      577.036764
## 764      4        9  ...          8950.0      233.615952
## 809      2        4  ...         12405.0      679.226935
## 854      1        2  ...         47599.0      374.725081
## 899      4        9  ...         18630.0      277.044973
## 
## [20 rows x 22 columns]

Let’s create a choropleth map for the latest week’s COVID-19 cases using the Matplotlib module:
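As a rough sketch of a Matplotlib choropleth, the example below colors three made-up square "states" by cases_rate_100K using the plot method of a geopandas GeoDataFrame, then adds a colorbar by hand. The geometries, values, color map, and figure size are all illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the merged data, filtered to the latest week
latest = gpd.GeoDataFrame(
    {"state": ["A", "B", "C"], "cases_rate_100K": [75.3, 351.8, 807.2]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4269",
)
vmin = latest["cases_rate_100K"].min()
vmax = latest["cases_rate_100K"].max()

fig, ax = plt.subplots(figsize=(8, 4))
# Fill each polygon according to its case rate
latest.plot(column="cases_rate_100K", cmap="Reds", vmin=vmin, vmax=vmax,
            edgecolor="grey", linewidth=0.5, ax=ax)
ax.axis("off")  # hide the longitude/latitude axes
ax.set_title("COVID-19 cases per 100,000 (latest week)")

# Colorbar built from the same color map and value range as the polygons
sm = plt.cm.ScalarMappable(norm=plt.Normalize(vmin=vmin, vmax=vmax), cmap="Reds")
sm.set_array([])
cbar = fig.colorbar(sm, ax=ax, shrink=0.6)
cbar.set_label("Cases per 100,000")
```

Passing the same vmin and vmax to both the polygons and the colorbar keeps the legend consistent with the fill colors.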

## (0.0, 1.0, 0.0, 1.0)
## <matplotlib.colorbar.Colorbar object at 0x17a6bd850>

Interactive Maps

Static maps are great for publications. Interactive maps, which can be viewed in a browser, can potentially provide a much richer source of information.

First, we need to create a function to get the polygon x and y coordinates:
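One possible shape for such a function, sketched with shapely: it returns the exterior-ring coordinates of a polygon as two lists. For a MultiPolygon it keeps only the largest part, which is an assumption made here for simplicity and means small islands are dropped. The name polygon_xy is hypothetical.

```python
from shapely.geometry import MultiPolygon, Polygon

def polygon_xy(geom):
    """Return (xs, ys) lists for the exterior ring of a (Multi)Polygon."""
    if isinstance(geom, MultiPolygon):
        # Assumption: represent a multi-part state by its largest part
        geom = max(geom.geoms, key=lambda part: part.area)
    xs, ys = geom.exterior.coords.xy
    return list(xs), list(ys)

# Small check on made-up shapes
small = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
big = Polygon([(10, 10), (12, 10), (12, 12), (10, 12)])
print(polygon_xy(small))
print(polygon_xy(MultiPolygon([small, big])))
```

Applied row by row to a geometry column, a function like this would produce one list of x values and one list of y values per state, matching the layout of the x and y columns in the output below.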

Now let’s use our function to get the polygon x and y coordinates:

##                                                     x                                                  y
## 44  [-81.747254, -81.746354, -81.746254, -81.74605...  [39.095379, 39.096578, 39.096878, 39.096978, 3...
## 89  [-86.388646, -86.383854, -86.37907, -86.378965...  [30.994180999999998, 30.994235999999997, 30.99...

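With the coordinates extracted, we can build a simple interactive map. The geometries come from the geopandas module, while the GlyphRenderer object in the output below is a Bokeh class, so the map itself is drawn with the Bokeh module and saved as an HTML file that can be opened in a browser.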
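The GlyphRenderer output and the saved HTML file suggest the map is drawn with Bokeh, so the sketch below assumes that library. It draws two made-up squares as patches, colors them by cases_rate_100K, and adds a hover tooltip. The data, palette, and tool list are illustrative assumptions; only the output file name is taken from the path shown below.

```python
from bokeh.models import ColumnDataSource, HoverTool, LinearColorMapper
from bokeh.palettes import Reds9
from bokeh.plotting import figure, save

# Hypothetical stand-in: one list of x and y coordinates per state
source = ColumnDataSource(data=dict(
    x=[[0, 1, 1, 0], [1, 2, 2, 1]],
    y=[[0, 0, 1, 1], [0, 0, 1, 1]],
    state=["A", "B"],
    cases_rate_100K=[75.3, 351.8],
))

# Map low rates to light red and high rates to dark red
mapper = LinearColorMapper(palette=Reds9[::-1], low=75.3, high=351.8)

p = figure(title="COVID-19 cases per 100,000 (latest week)",
           tools="pan,wheel_zoom,reset")
p.axis.visible = False
p.grid.grid_line_color = None

# patches draws one filled polygon per row of the data source
renderer = p.patches("x", "y", source=source,
                     fill_color={"field": "cases_rate_100K", "transform": mapper},
                     line_color="grey", line_width=0.5)

# Show the state name and rate when the pointer is over a polygon
p.add_tools(HoverTool(tooltips=[("State", "@state"),
                                ("Cases per 100K", "@cases_rate_100K{0.0}")]))

saved_path = save(p, filename="covid-19_map_hover.html",
                  title="COVID-19 map")
print(saved_path)
```

Opening the saved HTML file in a browser shows the map with pan, zoom, and hover enabled.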

## GlyphRenderer(id='1039', ...)
## '/Users/sworthin/Documents/IQSS/datafest-project/data_py/covid-19_map_hover.html'