In this blog post I continue writing about the travel cost variable. This is part 2 of my previous post.
In part 1 of this series, I discussed how to obtain accurate measures of the time and distance traveled between the respondent’s origin and the site(s) of interest. We are using the gmapsdistance package (or alternatively the ggmap package).
As with any package, if at any time you are having trouble with the syntax, just open the help page for the function (for example, by typing ?gmapsdistance in the R console).
A help tab should appear.
For the gmapsdistance or ggmap package to work, you need to supply at least:
- An origin vector
- A destination vector
- Mode of transportation
- An API key
I already talked about how to get an API key in my previous post, so here I will focus on the other three arguments that you need to declare (the origin and destination vectors and the mode of transportation), as well as on some other options in this function.
Setting the origin and destination vectors
Usually, in travel cost applications, we have multiple origins, which correspond to the home, postal codes or other origins of each of the sampled individuals. On the other hand, we have one or few destinations, which are the sites of interest (e.g. beaches, forests, lakes). Because we usually have survey data, we may have as many origins as the number of sampled individuals.
The origin and destination vectors can either be defined inside the gmapsdistance function (or the mapdist function in ggmap), or beforehand as separate vectors. These two samples of code:
origin1 = c("Oslo+Norway")
gmapsdistance(origin = origin1, ...)

gmapsdistance(origin = c("Oslo+Norway"), ...)
are equivalent commands.
You can define the origin either as a text expression or as a set of coordinates. However, be aware that Google Maps has to be able to find the expression you are using; otherwise it will approximate it as best it can. To double-check that Google Maps is getting the right origin, I like to visualize my origin vector on a map:
library(ggmap)
register_google(key = 'YOUR_API_KEY')
map <- get_map(location = 'Rogaland', zoom = 8)
mapPoints <- ggmap(map) + geom_point(aes(x = Longitude, y = Latitude), data = YOUR_DATA, alpha = .5)
mapPoints
Just replace the “location =” part with the name of your study site. YOUR_DATA has to be a data frame with a Longitude and a Latitude column, with one row for each postal code. In my case, this code yields this map:
I see that some points are in the middle of the sea, so the coordinates I have are not trustworthy. Any distance and/or time estimates I get from them will be wrong, and at times exaggerated.
Alternatively, I tried using postal codes (e.g. “4001+Norway”) as my points of origin. When I typed the postal code into Google Maps, it could not identify the right point of origin, and used “Norway” as the origin instead. Hence, I strongly recommend using coordinates (latitude+longitude) that have been quality-checked (by you or someone else) as the origin vector.
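As a minimal sketch of how such coordinate-based origins can be put together, the snippet below pastes latitude and longitude columns into “latitude+longitude” strings. The data frame respondents and its column names are hypothetical stand-ins for your own survey data, and the range check is only a rough sanity filter. It will not catch points that fall in the sea, so it does not replace plotting the points on a map.

```r
# Hypothetical survey data: one row per respondent, coordinates in decimal degrees
respondents <- data.frame(
  id        = c(1, 2, 3),
  Latitude  = c(58.9700, 59.9139, 95.0000),  # the last value is deliberately invalid
  Longitude = c(5.7331, 10.7522, 10.0000)
)

# Keep only rows whose coordinates are present and inside the valid ranges
valid <- with(respondents,
              !is.na(Latitude) & !is.na(Longitude) &
                abs(Latitude) <= 90 & abs(Longitude) <= 180)
respondents_ok <- respondents[valid, ]

# Build "latitude+longitude" strings without any spaces
origin_coords <- paste0(respondents_ok$Latitude, "+", respondents_ok$Longitude)
print(origin_coords)
```

The resulting origin_coords vector can then be used as the origin vector in place of place names or postal codes.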
I close with some final notes about the origin and destination vectors. First, when typing either coordinates or references into these vectors, there cannot be any spaces inside the quotes (e.g. “Oslo+Norway” instead of “Oslo + Norway”). Second, the + sign should always be used to separate the coordinates and/or references (e.g. “Oslo+Norway” instead of “Oslo Norway”). Finally, very long origin and/or destination vectors are not accepted (i.e. there is a limit on how long these vectors can be). I usually split the origin vector into as many pieces as necessary until each piece is accepted (e.g. origin1, origin2, …).
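The notes above can be sketched in a few lines of base R. The origin vector here is made up for illustration, and chunk_size is an assumed value: I do not know the exact limit, so treat it as something to tune until your requests go through.

```r
# Hypothetical origin vector with inconsistent spacing
origin <- c("Oslo + Norway", "Bergen Norway", "Stavanger+Norway",
            "Trondheim + Norway", "Tromso+Norway")

# Trim the ends, then collapse every run of spaces and/or "+" into a single "+"
origin_clean <- gsub("[ +]+", "+", trimws(origin))

# Split the cleaned vector into chunks of at most chunk_size elements
chunk_size <- 2  # assumed value; adjust to whatever size is accepted
chunk_id <- ceiling(seq_along(origin_clean) / chunk_size)
origin_chunks <- split(origin_clean, chunk_id)
print(origin_chunks)
```

Each element of origin_chunks can then be passed as the origin in its own call, and the results stacked back together afterwards.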
This package is very useful not only to estimate travel distances and times by car (which is the default) but also using other modes of transportation.
When respondents travel, they may choose a wide variety of modes of transportation. If you have questions in your survey that elicit the transportation mode, then you can calculate the travel cost variable with more precision by allowing it to depend on the mode of transportation.
The graph below illustrates which modes of transportation my respondents reported:
The overwhelming majority traveled by car (which is a common finding in TCM studies). Nonetheless, it is useful to calculate distances and times tailored to the mode of transportation.
The gmapsdistance and ggmap packages allow for 4 different modes: driving, walking, bicycling and public transportation (“transit”). This is declared inside the function:
gmapsdistance(..., mode="driving", ...)
Although the driving option nearly always finds a route and reports travel time and distance, Google Maps might not always be able to find a route if you choose walking, bicycling or transit. In this case, I usually fill in the blanks by assuming the same distance as for the driving option and a fixed walking or biking speed to calculate the travel times.
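Here is a small sketch of that fill-in step for walking, using made-up numbers. The data frame results, its column names, and the 5 km/h walking speed are all assumptions for illustration. Distances are in meters and times in seconds, as in the gmapsdistance output, and NA marks a trip for which no walking route was found.

```r
# Hypothetical results: distances in meters, times in seconds; NA = no route found
results <- data.frame(
  driving_dist = c(12000, 45000, 3000),
  walking_dist = c(11500, NA, 2800),
  walking_time = c(8300, NA, 2000)
)

walking_speed_kmh <- 5  # assumed fixed walking speed

# Where no walking route exists, reuse the driving distance
no_route <- is.na(results$walking_dist)
results$walking_dist[no_route] <- results$driving_dist[no_route]

# Derive the missing times from distance / speed, with speed in meters per second
speed_ms <- walking_speed_kmh * 1000 / 3600
results$walking_time[no_route] <- results$walking_dist[no_route] / speed_ms
print(results)
```

The same pattern works for bicycling by swapping in a biking speed; which speeds are reasonable depends on your respondents and study area.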
Moreover, the transit option requires somewhat more “calls” to the Google API, so it is slightly more costly. This is because it uses actual bus and train routes to find the fastest route.
Shape option: Finally, one important option in the gmapsdistance and ggmap packages is the shape option. This option tells the function how to display the information: either in wide or long format. The output of the gmapsdistance function is always a list with a Time and a Distance element, but the shape option dictates how the information inside them is laid out.
Units of time and distance: If you use the gmapsdistance package, the time is displayed in seconds and the distance is displayed in meters. If you use the ggmap package, the output gives you several columns with distances in miles, kilometers and meters, and with times in seconds, minutes and hours.
Other options: There are other options in this function (e.g. avoid, departure, dep_date). Depending on your needs, you may be interested in these options.
Saving your work: Once I have satisfactorily obtained all the distances and times, I like to save them in a separate file to access later. Here is an excerpt of my code assuming the origin is Oslo and the destination is Stockholm:
install.packages("writexl")
library(writexl)
library(gmapsdistance)
DRIVING = gmapsdistance(origin = c("Oslo+Norway"), destination = c("Stockholm+Sweden"), mode = "driving", key = YOUR_API_KEY)
DRIVING_dist = c(DRIVING$Distance / 1000)  # meters to kilometers
DRIVING_time = c(DRIVING$Time / 60 / 60)   # seconds to hours
TC = data.frame(DRIVING_dist, DRIVING_time)
colnames(TC) <- c("Car Distance", "Car Time")
write_xlsx(TC, "DIRECTORY/TravelCost.xlsx")
The new file with your distances and times will be stored in the DIRECTORY you have chosen.
Happy Travel Cost’ing!