Creating Datasets Interactively

The following Create2DimData() R function allows two-dimensional datasets (e.g. of heights and weights) to be created by clicking with a mouse within a plot. This can be useful if you need to throw together a dataset for demonstration purposes.

For example, try calling Create2DimData() like this:

df.points <- Create2DimData(xlim=c(0,10), ylim=c(0,5))
[Figure: An interactive plot for generating 2-dim data.]

That opens a plot whose x-axis runs from 0 to 10 and whose y-axis runs from 0 to 5. Click inside the plot to add a point, click on an existing point to remove it, and click outside the plot area to finish. Create2DimData() then returns a data frame containing the points you have selected.

An existing data frame with columns x and y may be supplied via the selected.points parameter. That allows points to be interactively added to or removed from that dataset.
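
As a minimal sketch of that workflow, the snippet below builds a small starting data frame with the required x and y columns and passes it in as selected.points. The sample values and the names seed.points and edited.points are made up for illustration. The call is guarded with interactive(), because locator() needs a graphics device that accepts mouse input, and with exists(), so the snippet does nothing if Create2DimData() has not been defined yet.

```r
# Hypothetical starting data: three points with the required 'x' and 'y' columns.
seed.points <- data.frame(x = c(1, 4, 7), y = c(2, 3, 1))

# Only open the plot in an interactive session where Create2DimData() is defined.
if (interactive() && exists("Create2DimData")) {
   edited.points <- Create2DimData(selected.points = seed.points,
                                   xlim = c(0, 10), ylim = c(0, 5))
   print(edited.points)
}
```

The three seed points should appear in the plot straight away, ready to be added to or removed by clicking.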

Create2DimData <- function(selected.points=NULL, xlim=NULL, ylim=NULL, tol=0.005){
   # This function lets a user interactively create a set of two-dimensional data
   # (e.g. of heights and weights).
   #
   # Click inside the plot box to add data points.
   # Click on existing points to remove them.
   # Click outside the plot area when you're done.
   #
   # Parameters:
   # selected.points:
   # Supply this to work on an existing data frame, with selected.points$x as your
   # x values and selected.points$y as your y values.
   #
   # xlim and ylim:
   # These define the plot area. E.g. xlim=c(0,100), ylim=c(50,60).
   # If not supplied, these will be set to enclose selected.points; if
   # selected.points is not supplied, they will be set to c(0,1).
   #
   # tol:
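   # Sets the maximum distance between a mouse click and an existing point for
   # that point to be removed. The distance is measured after scaling each axis
   # by its range, so the default of 0.005 is 0.5% of the plot's width or height.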
   #
   # Returns:
   # A dataframe of two-dimensional data.

   if (!is.null(selected.points)){
      # Check that the dataframe contains columns x and y
      if (length(intersect(c("x", "y"), names(selected.points)))!=2){
         stop("The selected.points dataframe must include columns 'x' and 'y'.")
      }
   }

   if (is.null(xlim)){
      # Set xlim
      if (is.null(selected.points)){
         xlim <- c(0,1)
      } else {
         min.val <- min(selected.points$x, na.rm=TRUE)
         max.val <- max(selected.points$x, na.rm=TRUE)
         padding <- 0.02 * (max.val - min.val)
         xlim    <- c(min.val - padding, max.val + padding)
      }
   }

   if (is.null(ylim)){
      # Set ylim
      if (is.null(selected.points)){
         ylim <- c(0,1)
      } else {
         min.val <- min(selected.points$y, na.rm=TRUE)
         max.val <- max(selected.points$y, na.rm=TRUE)
         padding <- 0.02 * (max.val - min.val)
         ylim    <- c(min.val - padding, max.val + padding)
      }
   }

   if (is.null(selected.points)){
      selected.points <- data.frame(x=NULL, y=NULL)
   }

   repeat{
      plot(x=selected.points$x, y=selected.points$y, xlim=xlim, ylim=ylim,
           xlab="x", ylab="y", xaxs="i", yaxs="i", main="Create 2-Dim Data",
           sub="Click inside the plot to add points. Click outside to finish. Click on a point to remove it.")

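      new.point <- locator(1)
      # locator() returns NULL if it is terminated before a point is selected;
      # treat that as a request to finish.
      if (is.null(new.point)) break
      if ((xlim[1] <= new.point$x) && (new.point$x <= xlim[2]) &&
          (ylim[1] <= new.point$y) && (new.point$y <= ylim[2])){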
         
         # The new point is within the acceptable ranges.
         # Calculate its distance from the other points.
         dists <- sqrt(((selected.points$x-new.point$x)/(xlim[2]-xlim[1]))^2
                      +((selected.points$y-new.point$y)/(ylim[2]-ylim[1]))^2)

         # Find the point that the new point is closest to.
         min.index <- which.min(dists)

         if ((length(min.index)>0) && (dists[min.index] < tol)){
            # The user has clicked on an existing point. Remove that point.
            selected.points <- selected.points[-min.index,]
         } else {
            # Add the new point.
            selected.points <- rbind(selected.points, new.point)
         }
      } else {
         # The new point was outside the acceptable ranges.
         break
      }
   }

   return(selected.points)
}
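
The role of tol is easier to see outside the interactive loop. The following is a rough sketch, not part of Create2DimData() itself: FindClickedPoint() is a hypothetical helper that repeats the same scaled-distance test the function uses to decide whether a click removes an existing point. The click is passed as a list with x and y elements, which is the shape locator(1) returns.

```r
# Hypothetical helper (not part of Create2DimData): returns the row index of the
# existing point that a click would remove, or integer(0) if none is close enough.
FindClickedPoint <- function(points, click, xlim, ylim, tol = 0.005) {
   # Scale each axis by its range so that tol is a fraction of the plot size.
   dists <- sqrt(((points$x - click$x) / (xlim[2] - xlim[1]))^2 +
                 ((points$y - click$y) / (ylim[2] - ylim[1]))^2)
   nearest <- which.min(dists)
   if (length(nearest) > 0 && dists[nearest] < tol) nearest else integer(0)
}

pts <- data.frame(x = c(2, 8), y = c(1, 4))

# A click 0.01 x-units from the second point on a 10 x 5 plot:
# the scaled distance is 0.001, which is below tol.
FindClickedPoint(pts, list(x = 8.01, y = 4), xlim = c(0, 10), ylim = c(0, 5))
# returns 2

# A click in the middle of the plot, far from both points.
FindClickedPoint(pts, list(x = 5, y = 2.5), xlim = c(0, 10), ylim = c(0, 5))
# returns integer(0)
```

Because the distance is scaled by the axis ranges, the same tol behaves consistently whether the axes span 0 to 1 or 0 to 1000. If points prove hard to hit with the mouse, passing a larger tol (say 0.02) to Create2DimData() widens the removal radius.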