The Little Things

Analyzing Business Data with Excel is a book by one of my favorite publishers that tries to show you how Excel can make your life easier. Throughout the book, you are presented with strategies to help you tackle business problems through the advanced use of spreadsheets. One of the problems discussed in the book is that of "Measuring Quality" (Chapter 5). In that chapter, the Author discusses "Statistical Process Control" techniques to evaluate the quality of a check processing operation. As part of the analysis, he notes on page 108 that:

In Statistical Process Control, it is considered significant if the metric being measured has been on the same side of the average for three consecutive days. This could mean that a trend has started.

Essentially, the Author compares today's metric to the historical average. He uses a nested IF that

returns 1 if the metric is above average, -1 if it is below, or 0 if it is equal to the average.

The formula looks like this:

$$code(lang=python) =IF(D19>G19,1,(IF(D19<G19,-1,0))) $$/code

This got me thinking - the use of the IF function in spreadsheets is pervasive. Looking at the formula above, though, makes you wonder how the next person will decipher what the formula achieves. Or even how you will decipher it 6 months down the road. You might argue (perhaps correctly) that the above example is not difficult to interpret. But what if you had another condition that you wanted to evaluate (or even more)? For example, what if you wanted to evaluate the magnitute with which the metric differed from the average. In that case, you might want:

Val Test / Condition
2 The metric is greater than twice the average
1 Greater than the average, but less than or equal to 2x the average
0 Equal to the average
-1 Less than the average, but greater than 0.5x the average
-2 Less than 0.5x the average

The corresponding spreadsheet function would then be:

$$code(lang=python) =IF(D19>2G19,2,IF(D19>G19,1,IF(D19<0.5G19,-2,IF(D19<G19,-1,0)))) $$/code

That is virtually impenetrable. What if, instead, that same formula could be written as:

$$code(lang=lua) =if (D19 > G19 * 2) then 2 elseif (D19 > G19) then 1 elseif (D19 < G19 * 0.5) then -2 elseif (D19 < G19) then -1 else 0 end $$/code

I'm not going to argue that you won't have to think when reading the above formula, but I would argue that it is at least an order of magnitude easier to understand. This demonstrates one of the types of problems we're trying to solve - making models not only easy to create, but also easy to understand (and therefore scale). If you prefer the "old" way of doing things, you have the ability to use a "spreadsheet like" IF function in our software, except it's called IFTHEN (e.g., =IFTHEN(1=1,"yes","no")). If you prefer the "more natural" syntax noted above, use it (e.g., =if 1=1 then "yes" else "no" end). You can decide on a case by case basis which method you prefer. I would recommend that if you want to look at more than 1 condition you use the "new style" IF (as I expect it will always be easier to read as compared to nested IFs) and choose for yourself otherwise.

One quick final note along a similar line is on the use of AND and OR functions. In most spreadsheets, you get something like:

$$code(lang=python) =IF(AND(A1=10,A2=30), 1, 2) $$/code

While not terribly difficult to read or understand, it does not appear to be optimal. In our software you have the ability to write:

$$code(lang=lua) =IFTHEN(A1=10 and A2=30, 1, 2) $$/code

or even:

$$code(lang=lua) =if (A1=10 and A2=30) then 1 else 2 end $$/code

which I find to be the most readable. OR can be used the same way. Anyway, I hope you enjoyed the post. If you haven't signed up for our beta you can do so from our home page. If you want to contact us, we'd love to hear from you at feedback@gridmule.com.


Written by ktr in Excel on Thu 28 July 2011. Tags: Productivity, Technology, Excel,


© Data Mules, LLC