Lies, Damned Lies, and Accuracy Assessment


“There are three kinds of lies: lies, damned lies, and statistics.”  Mark Twain popularized this saying, which describes the persuasive power of numbers (particularly when they are used to bolster a weak argument).  Accuracy assessments can often be viewed in the same light.  All too often these numbers are accepted at face value, without question.

It is important to evaluate the stated accuracy of the land cover (or any other) data that you intend to use, but it is equally important to ask how that accuracy assessment was performed.  And, when comparing accuracy information from multiple products, you should think about whether those products are directly comparable, or whether the comparison is more apples-to-oranges.

While I’m not going to get into the specifics of how a valid accuracy assessment should be designed or performed (maybe a topic for future consideration), I am going to address one of the most common apples-to-oranges comparisons that I see in my land cover work at the Center.  It has to do with the overall accuracy of a land cover product versus the number of classes that are mapped within that product.  What I often see (and sometimes end up being criticized on the basis of) is a product with a large number of classes (25, for instance) being directly compared to a product with a much smaller number of land cover classes (5, for example), even though the overall accuracies of such products are not directly comparable.

Why are these numbers not directly comparable?

Because the overall accuracy of a product takes into account all of the confusion that exists between all of the land cover categories within that data.  And the more categories there are, the more opportunities there are for confusion or error between the additional detailed classes (classes that don’t exist in the other product).

For instance, if one product maps multiple types of forest, or both forest and scrub/shrub, there will be errors between these forest categories, or between forest and scrub, that contribute to its overall error.  But another product that maps only forest in general (i.e., one category), or a combined woody vegetation category (i.e., no distinction between forest and scrub), will have no such error present.  At face value, this second product would look more accurate, but both may map general forest equally well (as much of the more detailed product’s “error” may be a result of confusion between forest types).
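To put some numbers on this, here is a small sketch in Python.  The error matrix is entirely made up for illustration (it does not come from any real product), and I am assuming rows are the mapped class and columns are the reference class.  It computes the overall accuracy and then checks how much of the error is simply confusion between the two forest types.

```python
# Hypothetical error (confusion) matrix; the counts are invented for illustration.
# Assumed layout: rows = mapped class, columns = reference class.
classes = ["deciduous forest", "evergreen forest", "scrub/shrub", "water"]
matrix = [
    [40,  8,  2,  0],
    [ 7, 35,  3,  0],
    [ 3,  2, 19,  1],
    [ 0,  0,  1, 29],
]


def overall_accuracy(m):
    """Fraction of samples on the diagonal (mapped class == reference class)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


detailed_accuracy = overall_accuracy(matrix)

# Count every off-diagonal sample, then only those where one forest type
# was confused with the other forest type.
total_samples = sum(sum(row) for row in matrix)
total_errors = total_samples - sum(matrix[i][i] for i in range(len(matrix)))
forest = [classes.index("deciduous forest"), classes.index("evergreen forest")]
forest_type_errors = sum(matrix[i][j] for i in forest for j in forest if i != j)

print(f"overall accuracy: {detailed_accuracy:.2f}")
print(f"errors between forest types: {forest_type_errors} of {total_errors}")
```

In this invented case, more than half of the errors are between forest types, which is exactly the kind of error a single "forest" class could never show.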

So, how can you compare these numbers?

If the categories used in each product are related, it may be possible to “crosswalk,” or roll up, some of the categories into a common class in order to make the products more alike.  In the above example, for instance, if you have access to all of the detailed information from the accuracy assessment (the full error matrix), it would be possible to combine the various classes of forest, ignore the confusion between them, and report on only forest in general.
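A crosswalk like this can be sketched in a few lines of Python.  Again, the error matrix is made up for illustration, the class names and the `crosswalk` mapping are my own assumptions, and rows are assumed to be the mapped class with columns as the reference class.  The idea is simply to add together the rows and columns of the classes being merged and then recompute the overall accuracy.

```python
# Hypothetical error (confusion) matrix; the counts are invented for illustration.
# Assumed layout: rows = mapped class, columns = reference class.
classes = ["deciduous forest", "evergreen forest", "scrub/shrub", "water"]
matrix = [
    [40,  8,  2,  0],
    [ 7, 35,  3,  0],
    [ 3,  2, 19,  1],
    [ 0,  0,  1, 29],
]

# Hypothetical crosswalk: both forest types roll up into a general "forest" class.
crosswalk = {
    "deciduous forest": "forest",
    "evergreen forest": "forest",
    "scrub/shrub": "scrub/shrub",
    "water": "water",
}


def overall_accuracy(m):
    """Fraction of samples on the diagonal (mapped class == reference class)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def collapse(classes, m, crosswalk):
    """Sum the rows and columns of classes that share a crosswalk target."""
    # Keep the merged classes in the order they first appear.
    merged = list(dict.fromkeys(crosswalk[c] for c in classes))
    index = {name: k for k, name in enumerate(merged)}
    out = [[0] * len(merged) for _ in merged]
    for i, mapped in enumerate(classes):
        for j, reference in enumerate(classes):
            out[index[crosswalk[mapped]]][index[crosswalk[reference]]] += m[i][j]
    return merged, out


merged_classes, merged_matrix = collapse(classes, matrix, crosswalk)

print(f"detailed classes:  {overall_accuracy(matrix):.2f}")
print(f"after crosswalk:   {overall_accuracy(merged_matrix):.2f}")
```

Nothing about the map itself changes here; only the confusion between the two forest types stops counting as error, which is why the rolled-up number is the fairer one to set beside a product that only maps general forest.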

If the categories are not comparable, or you do not have access to the full accuracy data, then you may just have to use your best judgment.  Are the errors mostly between these sub-classes?  Are these classes of interest to you (could more classes actually be more useful)?  Either way, you may need to take such comparisons with a grain of salt.  But don’t blindly believe that a product with fewer classes is necessarily more accurate (especially if the additional detail is important to you) without taking a closer look.  After you do, you will be able to make a better decision about which product is best for your needs.
