The overpass API
osmdata uses the overpass API for accessing and querying data from the
OpenStreetMap database. overpass queries are initiated by calling the
function opq(). Every opq() query must begin by specifying a bounding
box that defines the spatial extent of the area to be queried. The
bounding box is specified with the argument bbox=, and can be given as
an ordinary character string enclosed in quotation marks (single or
double):
# Any query within Newport News would begin like this. The results of the query are here assigned to a variable named `query_output`.
query_output <- opq(bbox = 'Newport News, VA')
Note: The output of any function call can be stored in a variable using
the <- operator. query_output is an intuitive name for the variable
holding the output from an opq() function call, but you can name the
variable anything you want.
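A bounding box can also be supplied as four numbers rather than a place name. The sketch below assumes the numeric form c(xmin, ymin, xmax, ymax) accepted by opq(); the coordinates are an approximation of the Newport News extent and are used here only for illustration. Because no place name has to be looked up, this call builds the query object without contacting a geocoding service.

```r
library(osmdata)

# Approximate Newport News extent as c(xmin, ymin, xmax, ymax),
# i.e. (west longitude, south latitude, east longitude, north latitude).
# These numbers are illustrative assumptions, not authoritative limits.
nn_bbox <- c(-76.58977, 36.8175016, -76.26977, 37.1375016)

# Start a query from the numeric bounding box
query_output <- opq(bbox = nn_bbox)

# The result is a query object, not yet any OSM data
class(query_output)
```

Nothing is downloaded at this stage; data are only retrieved once the query is passed on to a function such as osmdata_sf().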
Constructing Queries
Key/value pairs
Features in OpenStreetMap are identified by a key/value
pair. There are over two dozen feature keys, and many dozens
more values corresponding to those keys. The best resource available for
finding the appropriate key/value pair for the feature(s) you are
interested in is the OpenStreetMap Map Features wiki.
Let’s suppose you want to pull data from OSM corresponding to
restaurants, bars, amenities, leisure, gyms, shopping, and supermarkets.
Some of these categories are treated as ‘keys’ within the OSM database,
while others constitute ‘values.’ It is therefore essential to refer to
the Map Features wiki and sort out which is which.
Depending on how your query is constructed, it is possible to either
select a subset of the features that fall under a given key or simply
select for all values corresponding to a given key by leaving out the
value= parameter. Amenities and leisure, for
instance, are enormous categories comprising dozens of different
features. Likewise, OSM includes an extensive range of highly specific
shop-types, and so it will depend on the nature of your analysis whether
or not it is advisable to include every feature that
corresponds to a particular key in your query output.
For this example we will query a subset of ‘amenity’ and ‘shop’
values and select for all ‘leisure’ values. (Note that gyms are tagged
under the ‘leisure’ key, as leisure=fitness_centre, and so are covered
by the ‘leisure’ query rather than listed separately in the table
below.)
| Feature | Key | Value |
|---|---|---|
| restaurants | 'amenity' | 'restaurant' |
| bars | 'amenity' | 'bar' |
| cinemas | 'amenity' | 'cinema' |
| nightclubs | 'amenity' | 'nightclub' |
| (all leisure) | 'leisure' | NA |
| supermarket | 'shop' | 'supermarket' |
| clothing store | 'shop' | 'clothes' |
add_osm_feature()
To extract these features from the OSM database you will build upon
the initial opq() query above by specifying the features you want
within the function add_osm_feature(). This call gets linked to the
original opq() function through the use of %>%, the ‘pipe’ operator.
# This query will return all features tagged as restaurants within the Newport News bounding box. Note use of 'pipe' operator, `%>%`. This is essential for the query to execute correctly.
query_output <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'restaurant')
Next, append the osmdata_sf() function, again linked using the %>%
operator, to convert the OSM data to a spatial data object. This step
is necessary to eventually be able to export the query results for use
in a GIS. It is not necessary to include an argument in the
parentheses: because of the %>% operator, the query output is
implicitly taken as the input to osmdata_sf().
query_output <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'restaurant') %>%
osmdata_sf()
print(query_output)
## Object of class 'osmdata' with:
## $bbox : 36.8175016,-76.58977,37.1375016,-76.26977
## $overpass_call : The call submitted to the overpass API
## $meta : metadata including timestamp and version numbers
## $osm_points : 'sf' Simple Features Collection with 2004 points
## $osm_lines : NULL
## $osm_polygons : 'sf' Simple Features Collection with 171 polygons
## $osm_multilines : NULL
## $osm_multipolygons : NULL
You can see that this query has returned a spatial data object with
2004 points and 171 polygon features.
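The export step mentioned above can be sketched with the sf package. To keep the example runnable without a network connection, a one-row stand-in layer is built by hand in place of query_output$osm_points; the feature name, coordinates, and output file name are all hypothetical. GeoPackage is used here only as one common format that most GIS software can open.

```r
library(sf)

# Stand-in for query_output$osm_points: a single hypothetical point feature
restaurant_points <- st_sf(
  name = "Example restaurant",
  geometry = st_sfc(st_point(c(-76.43, 37.08)), crs = 4326)
)

# Write the layer to a GeoPackage file in a temporary location
out_file <- tempfile(fileext = ".gpkg")
st_write(restaurant_points, out_file, quiet = TRUE)

# Read the file back to confirm that the export succeeded
exported <- st_read(out_file, quiet = TRUE)
nrow(exported)
```

With a real query result, the same st_write() call would be applied to each component of interest, for example query_output$osm_points and query_output$osm_polygons, each written to its own file or layer.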
Further Developing Queries
‘AND’ queries
Any subsequent call to add_osm_feature() will add a new feature to the
query. Combining feature queries in this way is equivalent to using the
‘AND’ operator. For example, the following query will return all
features that are BOTH restaurants AND bars.
# Adding an additional feature query corresponds to the logical AND operator. This query will return features that are both restaurants AND bars.
restaurantsWithBars <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'restaurant') %>%
add_osm_feature(key = 'amenity', value = 'bar') %>%
osmdata_sf()
print(restaurantsWithBars)
## Object of class 'osmdata' with:
## $bbox : 36.8175016,-76.58977,37.1375016,-76.26977
## $overpass_call : The call submitted to the overpass API
## $meta : metadata including timestamp and version numbers
## $osm_points : 'sf' Simple Features Collection with 28 points
## $osm_lines : NULL
## $osm_polygons : 'sf' Simple Features Collection with 0 polygons
## $osm_multilines : NULL
## $osm_multipolygons : NULL
The spatial data object contained in the variable restaurantsWithBars
contains only 28 points and 0 polygons. The number of features returned
is far lower now that a second condition (‘AND bar’) has been added to
the query.
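One way to check how chained add_osm_feature() calls are combined is to print the query text before submitting it. The sketch below assumes the opq_string() helper from osmdata and a numeric bounding box (an approximation of the Newport News extent, chosen so that nothing is sent over the network). Both tag filters should appear together in the printed text, which is what makes the combination behave as an ‘AND’.

```r
library(osmdata)

# Approximate Newport News extent as c(xmin, ymin, xmax, ymax); illustrative only
nn_bbox <- c(-76.58977, 36.8175016, -76.26977, 37.1375016)

# Two chained add_osm_feature() calls: both conditions must hold
and_query <- opq(bbox = nn_bbox) %>%
  add_osm_feature(key = 'amenity', value = 'restaurant') %>%
  add_osm_feature(key = 'amenity', value = 'bar')

# Build the overpass query text without submitting it, then print it
and_string <- opq_string(and_query)
cat(and_string)
```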
‘OR’ queries
What if you want to query features that are either a restaurant OR a
bar? In this case, you would need to take one of two approaches to
constructing the query.
- You can query each feature separately. In this case, each query
result should be stored in its own variable and each set of features
will end up being exported as its own file.
rests <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'restaurant') %>%
osmdata_sf()
bars <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'bar') %>%
osmdata_sf()
cinemas <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'cinema') %>%
osmdata_sf()
nightclubs <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = 'nightclub') %>%
osmdata_sf()
supers <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'shop', value = 'supermarket') %>%
osmdata_sf()
clothes <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'shop', value = 'clothes') %>%
osmdata_sf()
all_leisure <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'leisure') %>%
osmdata_sf()
Note that it is possible to combine any values from the same key
(e.g., ‘restaurant’ and ‘bar’ are both values of the ‘amenity’ key) into
a single query by using the c() function to construct a vector of the
desired values. Thus, the sequence of queries in the chunk above can be
modified to look like:
amenity_ftrs <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'amenity', value = c('restaurant', 'bar', 'cinema', 'nightclub')) %>%
osmdata_sf()
shop_ftrs <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'shop', value = c('supermarket', 'clothes')) %>%
osmdata_sf()
all_leisure <- opq(bbox = 'Newport News, VA') %>%
add_osm_feature(key = 'leisure') %>%
osmdata_sf()
The crucial difference is in how the data will be exported. With
the first set of queries, you will end up with seven separate data
layers, one for each feature (except for all_leisure, which contains
all features corresponding to the ‘leisure’ key). With the second query
structure, you will have only three different data layers, and
whichever features are grouped within the c() function will be output
together in a single layer.
- Alternatively, you can combine the feature queries in a single call
to add_osm_features() using the ‘OR’ syntax, shown below. This will
return all features that are a restaurant OR a bar OR a cinema, etc.
This approach entails the export of one single layer containing all the
features enumerated in the query.
# Note that when using the 'OR' syntax the 'key=' and 'value=' parameters are not explicitly stated. Instead, backslashes and quotation marks are used in a less than intuitive way.
all_features_in_one_query <- opq(bbox = 'Newport News, VA') %>%
add_osm_features(features = c("\"amenity\"=\"restaurant\"",
"\"amenity\"=\"bar\"",
"\"amenity\"=\"cinema\"",
"\"amenity\"=\"nightclub\"",
"\"shop\"=\"supermarket\"",
"\"shop\"=\"clothes\"",
"\"leisure\"")) %>%
osmdata_sf()
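The effect of the ‘OR’ syntax can be inspected in the same spirit, by printing the query text rather than running it. This is a minimal sketch assuming opq_string() from osmdata and an approximate numeric bounding box for Newport News (an assumption made so the example needs no network access); only two features are listed to keep the output short.

```r
library(osmdata)

# Approximate Newport News extent as c(xmin, ymin, xmax, ymax); illustrative only
nn_bbox <- c(-76.58977, 36.8175016, -76.26977, 37.1375016)

# A single add_osm_features() call listing alternatives: either condition may hold
or_query <- opq(bbox = nn_bbox) %>%
  add_osm_features(features = c("\"amenity\"=\"restaurant\"",
                                "\"amenity\"=\"bar\""))

# Build the overpass query text without submitting it, then print it
or_string <- opq_string(or_query)
cat(or_string)
```

In the printed text, each listed feature should appear as its own statement rather than as stacked filters on one statement, which is how the query comes to return features matching any one of the conditions.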