Creating Datasets Interactively
Written by Peter Rosenmai on 25 Nov 2013. Last revised 1 Jan 2014.
The Create2DimData() R function below lets you create two-dimensional datasets (e.g. of heights and weights) by clicking with the mouse inside a plot. This can be useful if you need to throw together a dataset for demonstration purposes.
For example, try calling Create2DimData() like this:
df.points <- Create2DimData(xlim=c(0,10), ylim=c(0,5))
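That opens a plot window whose axes run from 0 to 10 horizontally and from 0 to 5 vertically. Click inside the plot to add points. Click on an existing point to remove it; a click counts as being on a point when it lands within the tolerance tol of it (by default 0.005, measured as a fraction of the axis ranges). Click outside the plot area to finish. The Create2DimData() function returns a data frame containing the points you have selected.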
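Because the return value is an ordinary data frame with columns x and y, it can be handed straight to R's usual modelling and plotting functions. The sketch below uses a small hand-typed data frame as a stand-in for a clicked dataset (the values are made up for illustration) and fits a straight line to it.

```r
# Stand-in for the result of Create2DimData(); these values are invented
# for illustration, not produced by an actual session.
df.points <- data.frame(x=c(1.2, 3.5, 5.1, 7.8, 9.0),
                        y=c(0.9, 1.8, 2.4, 3.9, 4.4))

# Fit a simple linear regression of y on x and show the coefficients.
fit <- lm(y ~ x, data=df.points)
print(coef(fit))
```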
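An existing data frame with columns named x and y may be supplied through the selected.points parameter. That allows points to be interactively added to or removed from the dataset. If xlim or ylim is omitted in that case, the axis limits are taken from the range of the supplied data, widened by 2% on each side.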
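As a rough illustration of how the default axis limits come out for a supplied dataset, here is a minimal sketch. PaddedRange() is a hypothetical helper written for this example (it is not part of the function itself), and the starting dataset is made up. The interactive call at the end is guarded so that it only runs in a session with a mouse and with Create2DimData() already defined.

```r
# Hypothetical helper mirroring the default-limit logic in Create2DimData():
# the data range widened by 2% of its width on each side.
PaddedRange <- function(values){
  min.val <- min(values, na.rm=TRUE)
  max.val <- max(values, na.rm=TRUE)
  padding <- 0.02 * (max.val - min.val)
  c(min.val - padding, max.val + padding)
}

# Made-up starting dataset of heights (x, in cm) and weights (y, in kg).
df.start <- data.frame(x=c(150, 160, 175, 190), y=c(50, 62, 70, 90))
x.limits <- PaddedRange(df.start$x)  # approximately 149.2 to 190.8
y.limits <- PaddedRange(df.start$y)  # approximately 49.2 to 90.8

# Editing the dataset needs a graphics device and a mouse, so only attempt
# it in an interactive session where Create2DimData() has been defined.
if (interactive() && exists("Create2DimData")){
  df.edited <- Create2DimData(selected.points=df.start)
}
```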
Create2DimData <- function(selected.points=NULL, xlim=NULL, ylim=NULL, tol=0.005){
  # Validate any supplied data frame.
  if (!is.null(selected.points)){
    if (length(intersect(c("x", "y"), names(selected.points)))!=2){
      stop("The selected.points dataframe must include columns 'x' and 'y'.")
    }
  }

  # Default x limits: the unit interval, or the data range plus 2% padding.
  if (is.null(xlim)){
    if (is.null(selected.points)){
      xlim <- c(0,1)
    } else {
      min.val <- min(selected.points$x, na.rm=TRUE)
      max.val <- max(selected.points$x, na.rm=TRUE)
      padding <- 0.02 * (max.val - min.val)
      xlim <- c(min.val - padding, max.val + padding)
    }
  }

  # Default y limits, chosen the same way.
  if (is.null(ylim)){
    if (is.null(selected.points)){
      ylim <- c(0,1)
    } else {
      min.val <- min(selected.points$y, na.rm=TRUE)
      max.val <- max(selected.points$y, na.rm=TRUE)
      padding <- 0.02 * (max.val - min.val)
      ylim <- c(min.val - padding, max.val + padding)
    }
  }

  # Start from an empty data frame (with x and y columns) if none was supplied.
  if (is.null(selected.points)){
    selected.points <- data.frame(x=numeric(0), y=numeric(0))
  }

  repeat{
    # Redraw the plot with the current set of points.
    plot(x=selected.points$x, y=selected.points$y, xlim=xlim, ylim=ylim,
         xlab="x", ylab="y", xaxs="i", yaxs="i", main="Create 2-Dim Data",
         sub="Click inside the plot to add points. Click outside to finish. Click on a point to remove it.")

    # Wait for one mouse click. locator() returns NULL if input is ended
    # without a click (e.g. by pressing Esc), so treat that as "finish".
    new.point <- locator(1)
    if (is.null(new.point)){
      break
    }

    if ((xlim[1] <= new.point$x) && (new.point$x <= xlim[2]) &&
        (ylim[1] <= new.point$y) && (new.point$y <= ylim[2])){
      # Distance from the click to each point, with both axes scaled to [0,1]
      # so that tol does not depend on the units of the data.
      dists <- sqrt(((selected.points$x-new.point$x)/(xlim[2]-xlim[1]))^2
                   +((selected.points$y-new.point$y)/(ylim[2]-ylim[1]))^2)
      min.index <- which.min(dists)
      if ((length(min.index)>0) && (dists[min.index] < tol)){
        # The click landed on an existing point: remove it.
        selected.points <- selected.points[-min.index,]
      } else {
        # Otherwise add the click as a new point.
        selected.points <- rbind(selected.points, new.point)
      }
    } else {
      # A click outside the plot area ends the session.
      break
    }
  }
  return(selected.points)
}
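The add-or-remove step can be tried without a mouse by pulling it out into a small function. The following is a sketch only: TogglePoint() is a hypothetical helper introduced for this example, and the clicks are hard-coded lists standing in for what locator(1) would return.

```r
# Hypothetical helper, not part of the original function: applies one
# simulated click to a set of points, removing the nearest point if the
# click is within tol of it and adding the click as a new point otherwise.
TogglePoint <- function(points, click, xlim, ylim, tol=0.005){
  # Scale each axis by its plotted range so that tol is a fraction of the
  # plot's width and height rather than a distance in data units.
  dists <- sqrt(((points$x - click$x) / (xlim[2] - xlim[1]))^2
               + ((points$y - click$y) / (ylim[2] - ylim[1]))^2)
  min.index <- which.min(dists)
  if ((length(min.index) > 0) && (dists[min.index] < tol)){
    points <- points[-min.index, , drop=FALSE]
  } else {
    points <- rbind(points, click)
  }
  points
}

pts <- data.frame(x=c(2, 8), y=c(1, 4))

# A click far from both points adds a third point.
pts <- TogglePoint(pts, list(x=5, y=2.5), xlim=c(0,10), ylim=c(0,5))

# A click very close to (2, 1) removes that point instead of adding one.
pts <- TogglePoint(pts, list(x=2.01, y=1.01), xlim=c(0,10), ylim=c(0,5))
```

Dividing by the axis ranges before taking the distance means that the default tol of 0.005 behaves the same whether the axes cover 0 to 1 or 0 to 1000.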