Calculating a Distance Matrix for Geographic Points Using R
Written by Peter Rosenmai on 30 Jan 2014.
Here's an example of how to calculate a distance matrix for geographic points (expressed as decimal latitudes and longitudes) using R:
> df.cities <- data.frame(name = c("New York City", "Chicago", "Los Angeles", "Atlanta"),
+ lat = c( 40.75170, 41.87440, 34.05420, 33.75280),
+ lon = c( -73.99420, -87.63940, -118.24100, -84.39360))
> round(GeoDistanceInMetresMatrix(df.cities) / 1000)
New York City Chicago Los Angeles Atlanta
New York City 0 1148 3945 1204
Chicago 1148 0 2808 945
Los Angeles 3945 2808 0 3116
Atlanta 1204 945 3116 0
For example, the above distance matrix shows that the straight-line distance—accounting for curvature of the earth—between Los Angeles and NYC is 3,945 km.
And here's the code for the GeoDistanceInMetresMatrix() function that generates the matrix:
ReplaceLowerOrUpperTriangle <- function(m, triangle.to.replace){
if (nrow(m) != ncol(m)) stop("Supplied matrix must be square.")
if (tolower(triangle.to.replace) == "lower") tri <- lower.tri(m)
else if (tolower(triangle.to.replace) == "upper") tri <- upper.tri(m)
else stop("triangle.to.replace must be set to 'lower' or 'upper'.")
m[tri] <- t(m)[tri]
return(m)
}
GeoDistanceInMetresMatrix <- function(df.geopoints){
GeoDistanceInMetres <- function(g1, g2){
DistM <- function(g1, g2){
require("Imap")
return(ifelse(g1$index > g2$index, 0, gdist(lat.1=g1$lat, lon.1=g1$lon, lat.2=g2$lat, lon.2=g2$lon, units="m")))
}
return(mapply(DistM, g1, g2))
}
n.geopoints <- nrow(df.geopoints)
df.geopoints$index <- 1:n.geopoints
list.geopoints <- by(df.geopoints[,c("index", "lat", "lon")], 1:n.geopoints, function(x){return(list(x))})
mat.distances <- ReplaceLowerOrUpperTriangle(outer(list.geopoints, list.geopoints, GeoDistanceInMetres), "lower")
rownames(mat.distances) <- df.geopoints$name
colnames(mat.distances) <- df.geopoints$name
return(mat.distances)
}
An Example: Starbucks Crowding
You can use the above code to, for instance, find those Starbucks stores in New York that have 70 or more other Starbucks stores within one km:

>
>
>
head(df.starbucks.ny)
name lat lon
1 1770 West Main Street, Suite 1400, Riverhead, NY 40.91709 -72.70946
2 10095 Main Road, Mattituck, NY 40.98587 -72.53907
3 24 Montauk Highway, Hampton Bays, NY 40.87864 -72.52142
4 485 CR-111, Manorville, NY 40.85249 -72.78954
5 2488 Montauk Hwy, Bridgehampton, NY 40.93760 -72.30177
6 385 Route 25 A, 14, Miller Place, NY 40.94175 -72.98710
> distance.mat.m <-
GeoDistanceInMetresMatrix(df.starbucks.ny)
> stores.within.1.km.count <-
colSums(x=(distance.mat.m < 1000), na.rm=TRUE) - 1
>
sort(stores.within.1.km.count[
which(stores.within.1.km.count > 70)], decreasing=TRUE)
1166 Avenue of the Americas, New York, NY 76
1101 - 1109 Avenue of the Americas, New York, NY 72
1100 Avenue of the Americas, New York, NY 71
Wow. A Starbucks within one km of 76 other Starbucks stores!