Lesson Goals

  • Familiarity with some typical topological operations (spatial subsetting, proximity, aggregation, spatial joins, dissolve) using sf
  • Familiarity with some typical raster operations using the raster package and combining with vector data

Spatial Subsetting

Let’s return to the Durham Open Data we grabbed in the previous section - the parks and trails layers in particular. A typical spatial question we might ask of the data is ‘what trails go through parks in town?’ You should already have these layers loaded, but the code below loads them again and shows the simplest of all methods for performing this spatial subset with sf:
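A minimal sketch of that subset, assuming the layers were read into sf objects named `parks` and `trails` (the file paths here are placeholders for wherever you saved the Open Data Durham downloads):

```r
library(sf)

# Read the layers (placeholder paths - use your own download locations)
parks  <- st_read("data/Parks.shp")
trails <- st_read("data/Trails.shp")

# Bracket subsetting with another sf object keeps only the features that
# intersect it - here, the trails that pass through any park:
park_trails <- trails[parks, ]

plot(st_geometry(park_trails))
```

The `x[y, ]` shorthand is equivalent to `st_filter(x, y)` with the default `st_intersects` predicate.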

Notice the warning we got about planar coordinates - should we be concerned about that?

Using sf and dplyr together

Now say we’re concerned it might rain: we want to know where the covered bike parking is that is near parks and along trails. Given what we know so far, see if you can figure out how to:

  • subset the bike parking near parks to records within 400 meters of trails
  • subset to just those bike parking locations that are covered
  • plot the results with ggplot

If you’re comfortable with dplyr, try doing this with dplyr. Don’t look at the code below unless or until you get stuck. Once you’ve figured it out, compare your results to what is below - and look at how coord_sf is used to control the x and y axes based on sf coordinates. How could we zoom in more?
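One possible solution, sketched under some assumptions: the sf objects are named `bike_parking`, `parks`, and `trails`; the bike parking layer has a `COVERED` attribute coded "Yes"/"No"; and all layers are in a projected CRS with units of meters so the 400 m distance makes sense:

```r
library(sf)
library(dplyr)
library(ggplot2)

# Flag bike parking records within 400 m of any trail
near_trails <- lengths(st_is_within_distance(bike_parking, trails,
                                             dist = 400)) > 0

# Keep only covered bike parking that is near a trail
covered_near <- bike_parking %>%
  filter(near_trails, COVERED == "Yes")

# Plot, using coord_sf to set the axis limits from the parks bounding box
bb <- st_bbox(parks)
ggplot() +
  geom_sf(data = parks, fill = "darkgreen") +
  geom_sf(data = trails, color = "brown") +
  geom_sf(data = covered_near, color = "red") +
  coord_sf(xlim = bb[c("xmin", "xmax")], ylim = bb[c("ymin", "ymax")])
```

To zoom in further, pass a smaller `xlim`/`ylim` to `coord_sf` - for example, the bounding box of `covered_near` rather than of `parks`.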

Spatial Joining and Aggregating

Let’s grab a couple more sample data sets from Open Data Durham. We’ll use tidycensus census tracts as we did in the previous module, and we’ll join these to survey data by block group from Open Data Durham to evaluate survey point data at the block group level by census tract.
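A sketch of the join-and-aggregate pattern, assuming `survey` is a point layer read from Open Data Durham and that the ACS variable and the survey column name (`response` here) are placeholders:

```r
library(sf)
library(dplyr)
library(tidycensus)

# Pull Durham County tracts with geometry (variable is a placeholder -
# B19013_001 is median household income)
tracts <- get_acs(geography = "tract", variables = "B19013_001",
                  state = "NC", county = "Durham", geometry = TRUE)

# Put the survey points in the same CRS as the tracts before joining
survey <- st_transform(survey, st_crs(tracts))

# st_join attaches tract attributes to each point (st_intersects is the
# default predicate); then aggregate the point values by tract
survey_by_tract <- st_join(survey, tracts) %>%
  group_by(GEOID) %>%
  summarize(mean_response = mean(response, na.rm = TRUE))
```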

Dissolve

We can take a quick look at using a spatial dissolve and then re-aggregate our result to see if there is a difference at another spatial level by dissolving to the tract level. We do this dissolve simply by using the dplyr group_by and summarize functions with an sf object! Note that we could pull tidycensus data down at the tract level directly, but instead we want to demonstrate running a dissolve to get from block group to tract level.
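A sketch of the dissolve, assuming the block groups are in an sf object `bg` with a standard Census GEOID (the first 11 characters of a block group GEOID identify its tract); the `value` column is a placeholder for whatever attribute you are aggregating:

```r
library(sf)
library(dplyr)

# group_by + summarize on an sf object unions the geometries within each
# group - that's the dissolve - while aggregating the attributes
tracts_dissolved <- bg %>%
  mutate(tract_id = substr(GEOID, 1, 11)) %>%
  group_by(tract_id) %>%
  summarize(value = mean(value, na.rm = TRUE))
```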

Exercise Try exploring another data set from Open Data Durham, or use existing data, to put together another spatial subset, join, or aggregation - try aggregating with different data, use a different aggregation method, or try other types of joins (st_intersects is the default, but check ?st_join for other options).

Raster - Raster-Vector Processing

Now let’s look at an example of some raster processing. First, we’ll load the Third Fork watershed shapefile I have in our course GitHub repository:
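Assuming the repository has been cloned and your working directory is its root, the read that produces the output below looks something like:

```r
library(sf)

# Read the Third Fork watershed boundary from the course data folder
third_fork <- st_read("data/Third_Fork.shp")
```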

## Reading layer `Third_Fork' from data source `F:\GitProjects\R-User-Group-Spatial-Workshop-2018\data\Third_Fork.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 21 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -78.96945 ymin: 35.90316 xmax: -78.86583 ymax: 35.99798
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs

FedData still uses sp rather than sf objects, so while we could have loaded our shapefile template with readOGR from rgdal, we’ll just convert from sf to sp to demonstrate:
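The conversion is a one-liner via the sp coercion method, assuming `third_fork` is the sf object read above:

```r
library(sf)

# Coerce the sf object to an sp object for use with FedData
third_fork_sp <- as(third_fork, "Spatial")
class(third_fork_sp)  # a SpatialPolygonsDataFrame
```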

The get_ned function from FedData grabs a corresponding online tile of NED (National Elevation Dataset) data from Amazon - you can simply type get_ned at the console to examine the function and see how it works.

We saw the getData function in the raster package in the SpatialObjects section. Is get_ned from FedData any different? We can mask to the watershed as we did earlier - notice we don’t need to use crop then mask…why?
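A sketch of the download and mask, assuming `third_fork_sp` is the sp watershed from above (the `label` is just a name for FedData’s cache directory):

```r
library(FedData)
library(raster)

# Download the NED elevation tile(s) covering the template's extent
ned <- get_ned(template = third_fork_sp, label = "Third_Fork")

# get_ned already returns data cropped to the template's extent, so we can
# skip crop() and go straight to mask(), which sets cells outside the
# watershed boundary to NA
ned_wat <- mask(ned, third_fork_sp)
```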

The rasterVis package has some neat additional visualization methods for rasters:
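For example, a lattice-based levelplot of the masked elevation raster, assuming `ned_wat` from above:

```r
library(rasterVis)

# levelplot gives a lattice-style raster plot; margin = FALSE suppresses
# the row/column summary plots along the edges
levelplot(ned_wat, margin = FALSE, main = "Third Fork elevation (m)")
```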

We can easily do some terrain analysis with raster as well:
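A sketch using the raster package’s terrain function on the masked elevation raster, assuming `ned_wat` from above:

```r
library(raster)

# Derive several surface metrics from the elevation raster in one call
wat_terrain <- terrain(ned_wat,
                       opt = c("slope", "aspect", "TPI", "TRI", "roughness"))

# One panel per derived layer
plot(wat_terrain)
```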

Quick Exercise What kind of object did terrain return? How would you get the projection of our ‘NED’ raster? The min, max, mean, and standard deviation of NED as single values? The same for our ‘wat_terrain’? Play with these on your own and look at the code if you need to.
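One way to approach the exercise, assuming `ned` and `wat_terrain` from the earlier steps:

```r
library(raster)

class(wat_terrain)    # terrain() with several options returns a multi-layer object
crs(ned)              # the projection of the NED raster

# cellStats computes a single summary value across all cells of a layer
cellStats(ned, min)
cellStats(ned, max)
cellStats(ned, mean)
cellStats(ned, sd)
```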