7.2 Returns

R is a functional language, so functions, and the values they return, are important to understand. As we learned in the teacheR book, the return value of a function is the value of the last expression evaluated in its body, unless you call return() explicitly. Using return() to exit before the end of the function (an “early return”) is often a point of contention. Let’s have a look at the different points of view:

  1. There shouldn’t be any early returns

At one extreme, there are people who teach that you shouldn’t return something early from a function (unless there’s an error). In practice, this might look something like this:

late_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (add_w) {
    ret <- ret + w
  }
  ret
}

So what’s going on here? Well, we’re creating a variable (called ret) and always returning that at the end of our function. The value of ret changes depending on our parameters, but we always return the value of that ret variable.

On the one hand, this makes it clear which variable is being returned. On the other, it’s not always clear what the value of that variable is. We have to scan the whole body of the function to see what happens to ret, even though nothing happens to it after the original assignment when add_w is FALSE.
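The one exception this school of thought allows, leaving early because of an error, might be sketched as follows. The name late_bird_checked and the is.numeric() check are assumptions added for illustration; they are not part of the original example.

```r
# Hypothetical variant of late_bird(): still a single return point at the
# end, but the function exits early with an error if an input is not numeric.
late_bird_checked <- function(x, y, w, add_w = TRUE) {
  if (!is.numeric(x) || !is.numeric(y) || !is.numeric(w)) {
    stop("x, y and w must all be numeric")
  }
  ret <- x + y
  if (add_w) {
    ret <- ret + w
  }
  ret
}

late_bird_checked(1, 2, 3)                 # 6
late_bird_checked(1, 2, 3, add_w = FALSE)  # 3
```

Note that stop() does not return a value at all; it signals an error, so every successful call still leaves through the final ret.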

  2. Never use a common return variable

At the other extreme, we could avoid placeholder return variables entirely:

early_bird <- function(x, y, w, add_w = TRUE) {
  if (add_w) {
    return(x + y + w)
  } else {
    return(x + y)
  }
}

Unlike the first example, we only have to read down until the path that we’ve chosen is finished. For example, if add_w is TRUE, then we only need to read as far as the first return() call to know what we’re going to get. However, this approach would probably become harder to follow if we had more than two paths or if each path did something more complicated. Plus, we’re duplicating code here by writing x + y in both return() calls.
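To see how the duplication grows with the number of paths, here is a minimal sketch with three branches. The function early_bird3 and its mode parameter are hypothetical, introduced only to illustrate the point.

```r
# Hypothetical three-path version of early_bird(); `mode` is an
# illustrative parameter, not part of the original example.
early_bird3 <- function(x, y, w, mode = "add") {
  if (mode == "add") {
    return(x + y + w)
  } else if (mode == "subtract") {
    return(x + y - w)
  } else {
    return(x + y)
  }
}

early_bird3(1, 2, 3)                     # 6
early_bird3(1, 2, 3, mode = "subtract")  # 0
early_bird3(1, 2, 3, mode = "none")      # 3
```

The x + y expression now appears three times, and a change to it would have to be made in every branch.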

  3. Return early when possible

And finally, we reach a more middle-of-the-road approach:

middle_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (add_w) {
    return(ret + w)
  }
  ret
}

Here we use an intermediate variable to avoid duplicating our x + y operation, but then we return when we’re ready, meaning that someone doesn’t have to read to the bottom of the function if they’ve chosen the add_w path.

And herein lies the crux of the issue. Early returns have long been a hot topic in programming, and people will continue to have their opinions on what the correct approach should be, so here’s mine:

The final return value should follow the “happy” path. That is, if you call the function with its default parameters, it should reach the end and return the value of the final expression. Any other path should probably return early. This way, when people first glance at the function, they can easily understand the logic and the “default” return value. From there, they can follow the edge cases to see how the return value changes.

So what does this look like exactly? Well, the third example (middle_bird) is close, but its “happy” path doesn’t return via the final expression. Let’s fix that:

adams_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (!add_w) {
    return(ret)
  }
  ret + w
}

Now the “happy” path works its way all the way down to the final expression, ret + w. But when we override the default value of add_w and so deviate from the “happy” path, we hit an early return() call as soon as the result is ready.
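As a quick sanity check, all four versions should give the same results for both values of add_w; they differ only in how they get there. A minimal sketch, with the definitions repeated so that it runs on its own (the input values 1, 2 and 3 are arbitrary):

```r
late_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (add_w) {
    ret <- ret + w
  }
  ret
}

early_bird <- function(x, y, w, add_w = TRUE) {
  if (add_w) {
    return(x + y + w)
  } else {
    return(x + y)
  }
}

middle_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (add_w) {
    return(ret + w)
  }
  ret
}

adams_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (!add_w) {
    return(ret)
  }
  ret + w
}

# Collect the four functions and call each one with the same arguments.
birds <- list(late = late_bird, early = early_bird,
              middle = middle_bird, adams = adams_bird)

default_path <- sapply(birds, function(f) f(1, 2, 3))
other_path   <- sapply(birds, function(f) f(1, 2, 3, add_w = FALSE))

all(default_path == 6)  # TRUE
all(other_path == 3)    # TRUE
```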

Of course, this approach won’t always be the easiest to understand. For example, if a parameter changes the path at multiple points, then you won’t be able to return until later in the function anyway. To me, though, this approach is the most conducive to readable code.
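To make that caveat concrete, here is a sketch in which add_w affects two separate steps. The function twice_bird and its doubling step are hypothetical, chosen only to show a parameter that changes the path at more than one point.

```r
# Hypothetical example: add_w matters both before and after a shared step,
# so the non-default path cannot return until that shared step has run.
twice_bird <- function(x, y, w, add_w = TRUE) {
  ret <- x + y
  if (add_w) {
    ret <- ret + w   # first point where add_w matters
  }
  ret <- ret * 2     # shared step that every path needs
  if (!add_w) {
    return(ret)      # earliest point at which this path can return
  }
  ret + w            # second point where add_w matters; happy path ends here
}

twice_bird(1, 2, 3)                 # 15
twice_bird(1, 2, 3, add_w = FALSE)  # 6
```

The “happy” path still ends at the final expression, but a reader following the add_w = FALSE path now has to read most of the function before reaching its return() call.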