Quick Data Tip in R and Excel

Last week, a colleague showed me a trick in excel.  And it really bugs me when you can do something in excel that R can not do easily.  So I gave it some thought and came up with a solution.  Here's the problem:

Sometimes we have data with missing values and want to fill them in a specific way.  What if you want to fill them in with the above-nearest non-missing value?  An example of such a data set:

ExcelRRecursionDataYou can imagine that it's easy to do this manually for a small data set, but tedious and horrible for larger data sets.  The instructions for excel is this:

  1. Select data.
  2. Hit F5 (Go To command)
  3. Click on the 'Special...' button.
  4. Select the 'Blanks' option, hit ok.
  5. Now, while every blank is highlighted, type in "=A2", where A2 is your first non-blank value.
  6. Hit 'Ctrl + Enter' at the same time.

Viola!  Done.  This can be done even faster with keyboard shortcuts.

In R, we have to be smart.  But it can be done easily with a recursive function.  See the code below.

##-----Recursive Fill in------
# This script will attempt to do an excel trick with a recursive R function

data = data.frame("label"=c("A",NA,"B",NA,NA,"C",NA,"D",NA,NA,NA,NA,"E",NA,"F"))

non.na.index = function(x,data){if(is.na(data[x])){non.na.index(x-1,data)}else{data[x]}}

new.column = sapply(1:(dim(data)[1]),function(x) non.na.index(x,data$label))

This R code does the following:

  1. Creates the data and the missing rows are designated as "NA". This is standard for R.
  2. non.na.index function takes a column and an index.  If that value is NA, it calls itself for the index above it.
  3. The 'sapply' function is used to apply the function over a whole column.

This runs relatively quick and gives us the same column as the excel trick.  Neat-o.


PS-  If the data comes this way via pivot tables in Excel, remember you can format the pivot tables certain ways to avoid having to do this at all.

This entry was posted in data and tagged , , , . Bookmark the permalink.

Leave a Reply