Written by Peter Rosenmai on 22 Feb 2014.

I present here R code to calculate the influence that a set of points have upon each other, where influence is a function of (1) the distance between the points and (2) the inherent influence of the points.

Consider, for example, five light bulbs with brightness given by this vector:

> brightness

[1] 3 4 2 1 2

[1] 3 4 2 1 2

Now, suppose that the distance between the light bulbs (in metres) is given by this distance matrix:

> distance.matrix

[,1] [,2] [,3] [,4] [,5]

[1,] 0 6 11 4 5

[2,] 6 0 12 9 10

[3,] 11 12 0 14 15

[4,] 4 9 14 0 4

[5,] 5 10 15 4 0

[,1] [,2] [,3] [,4] [,5]

[1,] 0 6 11 4 5

[2,] 6 0 12 9 10

[3,] 11 12 0 14 15

[4,] 4 9 14 0 4

[5,] 5 10 15 4 0

This matrix tells us, for instance, that bulbs two and three are 12 metres apart. Note that the distance matrix is symmetrical about a zero diagonal. This corresponds with the ordinary notion of distance: Any point is a zero distance from itself, and the distance from point A to point B equals the distance from point B to point A. However, my code permits non-symmetric distances: If bulb two is "uphill" from bulb three, [2, 3] will be greater than [2, 3].

Okay, next thing we need is a function that gives the light (the influence) that one light bulb receives from another as a function of the brightness of the bulbs and the distance between them. Let's use the inverse square of the distance between the bulbs:

LightReceived <- function(brightness.of.receiver, brightness.of.giver, distance.from.receiver.to.giver){

# The brightness.of.receiver doesn't come into the equation in this case

return(brightness.of.giver / (distance.from.receiver.to.giver^2))

}

# The brightness.of.receiver doesn't come into the equation in this case

return(brightness.of.giver / (distance.from.receiver.to.giver^2))

}

All done! Let's run the code:

> influence.matrix <- InfluenceMatrix(LightReceived, distance.matrix, brightness)

> influence.matrix

[,1] [,2] [,3] [,4] [,5]

[1,] NA 0.11111111 0.016528926 0.062500000 0.080000000

[2,] 0.08333333 NA 0.013888889 0.012345679 0.020000000

[3,] 0.02479339 0.02777778 NA 0.005102041 0.008888889

[4,] 0.18750000 0.04938272 0.010204082 NA 0.125000000

[5,] 0.12000000 0.04000000 0.008888889 0.062500000 NA

> influence.matrix

[,1] [,2] [,3] [,4] [,5]

[1,] NA 0.11111111 0.016528926 0.062500000 0.080000000

[2,] 0.08333333 NA 0.013888889 0.012345679 0.020000000

[3,] 0.02479339 0.02777778 NA 0.005102041 0.008888889

[4,] 0.18750000 0.04938272 0.010204082 NA 0.125000000

[5,] 0.12000000 0.04000000 0.008888889 0.062500000 NA

I call the resulting matrix an *influence matrix*. It tells us, for instance, that bulb five throws four times as much light on bulb one as it does on bulb two: [1, 5] is 0.08 and [2, 5] is 0.02.

To get the total light shone on a bulb by the other bulbs, we take the row sums:

> rowSums(influence.matrix, na.rm=TRUE)

[1] 0.2701400 0.1295679 0.0665621 0.3720868 0.2313889

[1] 0.2701400 0.1295679 0.0665621 0.3720868 0.2313889

So we see that bulb four receives the most light and bulb three receives the least light.

Now, let's consider a more practical example.

I decided to write this blog piece after listening to a talk by Mastadon C in February 2014 at the Society of Data Miners in London. The speaker described in that talk a study of the uptake of new treatments by general medical practitioners (GPs). The study tested the assumption that GPs are more likely to adopt new treatments if their practices are located near other practices, particularly large ones. I found myself wondering how the required calculations were done; I then went home and wrote my InfluenceMatrix() function (see below).

Okay, so let's consider five medical practices in Camden Town, London, described by the following dataframe:

> medical.practices

name lat lon doctors

1 Adelaide Med. Centre 51.54564 -0.161203 8

2 Queens Crescent 51.55151 -0.151676 9

3 Prince Of Wales Rd 51.54788 -0.148071 11

4 Fourtrees 51.54954 -0.153221 2

5 Ampthill 51.53625 -0.137257 5

name lat lon doctors

1 Adelaide Med. Centre 51.54564 -0.161203 8

2 Queens Crescent 51.55151 -0.151676 9

3 Prince Of Wales Rd 51.54788 -0.148071 11

4 Fourtrees 51.54954 -0.153221 2

5 Ampthill 51.53625 -0.137257 5

I use my GeoDistanceInMetresMatrix() function to obtain a distance matrix from the longitudes and latitudes of the practices:

> distance.matrix.in.km <- GeoDistanceInMetresMatrix(medical.practices) / 1000

> distance.matrix.in.km

Adelaide Med. Centre Queens Crescent Prince Of Wales Rd Fourtrees Ampthill

Adelaide Med. Centre 0.0000000 0.9291940 0.9444937 0.7032550 1.962810

Queens Crescent 0.9291940 0.0000000 0.4749221 0.2443764 1.971147

Prince Of Wales Rd 0.9444937 0.4749221 0.0000000 0.4018999 1.496379

Fourtrees 0.7032550 0.2443764 0.4018999 0.0000000 1.847585

Ampthill 1.9628102 1.9711466 1.4963786 1.8475850 0.000000

> distance.matrix.in.km

Adelaide Med. Centre Queens Crescent Prince Of Wales Rd Fourtrees Ampthill

Adelaide Med. Centre 0.0000000 0.9291940 0.9444937 0.7032550 1.962810

Queens Crescent 0.9291940 0.0000000 0.4749221 0.2443764 1.971147

Prince Of Wales Rd 0.9444937 0.4749221 0.0000000 0.4018999 1.496379

Fourtrees 0.7032550 0.2443764 0.4018999 0.0000000 1.847585

Ampthill 1.9628102 1.9711466 1.4963786 1.8475850 0.000000

One again, I'll use the inverse square to get the influence that one medical practice has on another:

InfluenceReceived <- function(influence.of.receiver, influence.of.giver, distance.from.receiver.to.giver){

# The influence.of.receiver doesn't come into the equation in this case

return(influence.of.giver / (distance.from.receiver.to.giver^2))

}

# The influence.of.receiver doesn't come into the equation in this case

return(influence.of.giver / (distance.from.receiver.to.giver^2))

}

To calculate the assumed influence that medical practices have upon each other, I suppose that influence is proportional to the number of doctors in a practice. So the influence acting on these five practices is:

> influence.acting.on <- 10000 * # Rescale for ease of reading the results

+ rowSums(InfluenceMatrix(InfluenceReceived,

+ distance.matrix,

+ medical.practices$doctors),

+ na.rm=TRUE)

> setNames(influence.acting.on, medical.practices$name)

Adelaide Med. Centre Queens Crescent Prince Of Wales Rd Fourtrees Ampthill

0.28096535 0.92811717 0.63485299 2.36445719 0.09891337

+ rowSums(InfluenceMatrix(InfluenceReceived,

+ distance.matrix,

+ medical.practices$doctors),

+ na.rm=TRUE)

> setNames(influence.acting.on, medical.practices$name)

Adelaide Med. Centre Queens Crescent Prince Of Wales Rd Fourtrees Ampthill

0.28096535 0.92811717 0.63485299 2.36445719 0.09891337

So Fourtrees—being located very close to Queens Crescent, the second largest practice—receives the most influence.

Here's the code for the InfluenceMatrix() function used in the above examples. It calculates an influence matrix using a supplied function, a distance matrix and a vector of influences. It's short but complicated—sorry! I like my code simple, but sometimes it's better to build for speed, not comfort.

InfluenceMatrix <- function(influence.received.fn, distance.matrix, influences){

# Calculates an influence matrix in which the value at row i and column j is the influence of

# a j-th thing on an i-th thing.

#

# influence.received.fn:

# A function that takes three vectors -- influence.of.receiver, influence.of.giver,

# distance.from.receiver.to.giver -- of equal length and returns a vector of that same

# length that gives the influence of the giver on the receiver.

#

# distance.matrix:

# A square matrix in which the value at row i and column j is the distance from thing j to

# thing i. In most cases, this matrix will be symmetrical about the diagonal with a zero

# diagonal. However, an asymmetrical matrix may be used to incorporate "uphill" and

# "downhill" distances.

#

# influences:

# A vector giving the influence of the things under consideration. The i-th element of

# this vector is the influence of the i-th thing.

margin <- 1:nrow(distance.matrix)

influence.matrix <- outer(margin, margin, function(irow, icol){influence.received.fn(influences[irow], influences[icol], mapply(function(i.row, i.col){distance.matrix[i.row, i.col]}, irow, icol))})

diag(influence.matrix) <- NA

return(influence.matrix)

}

# Calculates an influence matrix in which the value at row i and column j is the influence of

# a j-th thing on an i-th thing.

#

# influence.received.fn:

# A function that takes three vectors -- influence.of.receiver, influence.of.giver,

# distance.from.receiver.to.giver -- of equal length and returns a vector of that same

# length that gives the influence of the giver on the receiver.

#

# distance.matrix:

# A square matrix in which the value at row i and column j is the distance from thing j to

# thing i. In most cases, this matrix will be symmetrical about the diagonal with a zero

# diagonal. However, an asymmetrical matrix may be used to incorporate "uphill" and

# "downhill" distances.

#

# influences:

# A vector giving the influence of the things under consideration. The i-th element of

# this vector is the influence of the i-th thing.

margin <- 1:nrow(distance.matrix)

influence.matrix <- outer(margin, margin, function(irow, icol){influence.received.fn(influences[irow], influences[icol], mapply(function(i.row, i.col){distance.matrix[i.row, i.col]}, irow, icol))})

diag(influence.matrix) <- NA

return(influence.matrix)

}