The Choices People Make – Looking at the Stats


We’ve been offering geospatial data for many years with the ability to customize the output. That means we have a treasure trove of records showing people’s choices of projections, data formats, and a few other things. I thought it would be fun to have a look at the choices people are making and see if we can make any useful observations.

The Data

All of our records come from the Digital Coast DAV system. The geospatial data is primarily lidar, imagery, raster DEMs, and land cover, with popularity in approximately that order. For data already in a raster format, users can mainly choose the projection and the file format. The lidar data is stored as points and products can be derived from it, so there are many more choices to make. That means we can also look at the cell sizes people request for raster products, or the contour intervals. The stats shown are for requests made between Jan 1, 2018 and July 24, 2019.

By the way, there are already some publicly accessible stats for the DAV system that cover things like dataset popularity and internet domains. I won’t be looking at those as you’re free to do that yourself.

Projections

The DAV system happens to use a different default projection for each of the data types. You’ll see the influence of that default in the stats. Either the examination of people’s choices is biased by having a default, or we’re really good at picking a default for people. I’m inclined to think it’s a bias. For people choosing elevation data, the projection selections look like this:

Projection for elevation    Times Chosen
Albers                      79
State Plane 27              102
Geographic                  1781
UTM                         6609
State Plane 83              48744

The projections chosen for imagery look like this:

Projection for imagery      Times Chosen
Albers                      16
State Plane 27              18
Geographic                  224
State Plane 83              3811
UTM                         4507

The main difference is the switch between State Plane 83 and UTM, although for imagery the choice between the two is much closer. As you might guess, our default projection for each case is the one that was selected most often. As you can see, Albers doesn’t get a lot of interest. It is even lower than State Plane 27, which we hope nobody has to use any longer. However, it is our default for land cover data and we can see the impact in those statistics.

Projection for land cover   Times Chosen
State Plane 27              16
Geographic                  107
UTM                         217
State Plane 83              405
Albers                      1001

Albers is the most popular for land cover. Given that State Plane 83 is still popular even when it isn’t the default and wildly popular when it is, it may be that we should use State Plane 83 for all types.
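To put numbers on that comparison, here is a minimal Python sketch that turns the three projection tables into shares of requests. The counts are copied from the tables above; the dictionary layout and the `shares` helper are just my own illustration, not anything the DAV system provides.

```python
# Request counts copied from the three projection tables in this post.
projection_counts = {
    "elevation": {"Albers": 79, "State Plane 27": 102, "Geographic": 1781,
                  "UTM": 6609, "State Plane 83": 48744},
    "imagery": {"Albers": 16, "State Plane 27": 18, "Geographic": 224,
                "State Plane 83": 3811, "UTM": 4507},
    "land cover": {"State Plane 27": 16, "Geographic": 107, "UTM": 217,
                   "State Plane 83": 405, "Albers": 1001},
}


def shares(counts):
    """Return each option's fraction of the total requests."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


for data_type, counts in projection_counts.items():
    # The most-chosen option is also the default for that data type.
    top = max(counts, key=counts.get)
    sp83 = shares(counts)["State Plane 83"]
    print(f"{data_type}: most chosen = {top}, State Plane 83 share = {sp83:.1%}")
```

State Plane 83 comes out at roughly 85% of elevation requests (where it is the default), 44% of imagery requests, and 23% of land cover requests.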

Points, Rasters, or Contours

For the elevation data that starts as a lidar point cloud, the user can choose if they want the points or a derived product, such as a raster or contours. Even though we’re starting from points, the default choice is for a raster, so we expect that to be the most popular. We’ll also break out the file format they choose for each type. Here’s what the stats say.

Product and File Format        Times Chosen
Raster – ESRI binary grid      345
Raster – Imagine               353
Raster – ESRI ASCII grid       1091
Points – LAZ                   1482
Contour – DXF                  4269
Points – ASCII X,Y,Z           5475
Contour – Shapefile            10963
Points – LAS                   12088
Raster – GeoTiff 32-bit        21163

I find some interesting points in the selections here, but again defaults play a large role. For each type of output, the default file format is the most chosen. For the points, the choice of LAS over LAZ is probably a poor one just from the viewpoint of download times. It would be faster to download the data in LAZ format, and uncompress it to LAS with laszip locally, than to download LAS. There are also a surprising number of people requesting ASCII point data and I suspect these are engineers trying to find a way to get the data into CAD programs.
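If you want to try the LAZ route, the local decompression step could look something like the sketch below. It assumes the `laszip` command-line tool is installed and on your PATH and that it accepts `-i` for the input file and `-o` for the output file. The file name is hypothetical.

```python
import subprocess
from pathlib import Path


def laszip_command(laz_path):
    """Build the laszip call that writes a .las file next to the .laz input."""
    src = Path(laz_path)
    dst = src.with_suffix(".las")
    return ["laszip", "-i", str(src), "-o", str(dst)]


def decompress(laz_path):
    """Run laszip (assumed to be on PATH); raises CalledProcessError on failure."""
    subprocess.run(laszip_command(laz_path), check=True)


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(" ".join(laszip_command("tile_001.laz")))
```

Calling `decompress("tile_001.laz")` would then leave `tile_001.las` beside the download.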

Grouping those format choices by type of product (i.e., raster, point, or contour) gives 22,952 raster requests, 19,045 point requests, and 15,232 contour requests. The default choice, raster, has the most requests, but all three are requested a lot.
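For anyone who wants to redo that grouping, a small Python sketch is enough. The counts come straight from the product and file format table; the tuple layout is simply how I chose to write them down.

```python
from collections import Counter

# (product type, file format, times chosen), copied from the table above.
format_counts = [
    ("Raster", "ESRI binary grid", 345),
    ("Raster", "Imagine", 353),
    ("Raster", "ESRI ASCII grid", 1091),
    ("Points", "LAZ", 1482),
    ("Contour", "DXF", 4269),
    ("Points", "ASCII X,Y,Z", 5475),
    ("Contour", "Shapefile", 10963),
    ("Points", "LAS", 12088),
    ("Raster", "GeoTiff 32-bit", 21163),
]

# Sum the requests for each product type, ignoring the file format.
totals = Counter()
for product, _file_format, times_chosen in format_counts:
    totals[product] += times_chosen

# Print the product types from most to least requested.
for product, times_chosen in totals.most_common():
    print(f"{product}: {times_chosen}")
```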

Datums

The DAV system lets you change the horizontal and vertical datum of the data. The default datums, if possible, are NAD83 horizontally and NAVD88 (or island equivalent) vertically. Is the effort to support changing datums worthwhile? Here are the numbers:

Horizontal Datum    Request Count
NAD 27              214
WGS84/ITRF          2392
NAD 83              80083

For the vertical, there are more options, and you may only get the choice of some vertical datums if you pick the appropriate horizontal datum. For instance, you can’t pick the EGM2008 geoid model unless you picked WGS84 horizontally. The tidal datums (MLLW and MSL) are a bit misleading because only data sets already in those datums have that choice and you can’t change it, so nobody really picked it.

Vertical Datum (lidar point clouds only)    Request Count
Mean Lower Low Water                        52
WGS84/ITRF Ellipsoid heights                339
NAD83 Ellipsoid heights                     399
Mean Sea Level                              496
NGVD29                                      904
WGS84 with EGM2008                          909
NAVD88 (NGS GEOID)                          54110

We clearly see that the default of NAVD88 dominates. If someone were to choose the advanced options, they could pick which NGS GEOID model is going to be applied. If they don’t pick the advanced options, they’ll get GEOID12B. Let’s look at that:

Geoid Model Name    Request Count
Geoid 96            23
Geoid 09            32
Geoid 03            35
Geoid 99            52
Geoid 12A           140
Geoid 12B           51592

As expected, the default is the dominant choice. However, I’m a little surprised there are people who want some of those older models and worked hard enough to find where to pick them. The choice of Geoid 12A is interesting, as the only difference between Geoid 12B (the default) and Geoid 12A is that 12A had mistakes in a couple of locations.
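To see just how dominant the datum defaults are, this rough Python sketch computes the default's share of requests in each of the three datum tables. The counts are copied from the tables above; the `default_share` helper and the data layout are my own illustration.

```python
# (default option, request counts), copied from the three datum tables.
datum_tables = {
    "horizontal datum": ("NAD 83", {"NAD 27": 214, "WGS84/ITRF": 2392,
                                    "NAD 83": 80083}),
    "vertical datum": ("NAVD88", {"MLLW": 52, "WGS84/ITRF ellipsoid": 339,
                                  "NAD83 ellipsoid": 399, "MSL": 496,
                                  "NGVD29": 904, "WGS84 with EGM2008": 909,
                                  "NAVD88": 54110}),
    "geoid model": ("Geoid 12B", {"Geoid 96": 23, "Geoid 09": 32,
                                  "Geoid 03": 35, "Geoid 99": 52,
                                  "Geoid 12A": 140, "Geoid 12B": 51592}),
}


def default_share(default, counts):
    """Fraction of all requests in a table that used the default option."""
    return counts[default] / sum(counts.values())


for name, (default, counts) in datum_tables.items():
    others = sum(counts.values()) - counts[default]
    print(f"{name}: {default} share = {default_share(default, counts):.1%}, "
          f"non-default requests = {others}")
```

Only 282 geoid requests over the whole period picked something other than Geoid 12B.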

One of the reasons I wanted to look at what people choose is the upcoming new reference frames for the USA. We’ll clearly need to add those options, but perhaps it’s time to cull some of the others. We may also end up rebuilding the app to operate as a cloud-native service, and reducing the complexity would save money. I’d be happy to hear opinions about the options you’d like to see or what other stats you’d find interesting.
