The Choices People Make – Looking at the Stats

We’ve been offering geospatial data for many years with the ability to customize the output. That means we have a treasure trove of records showing people’s choices of projections, data formats, and a few other things. I thought it would be fun to have a look at the choices people are making and see if we can make any useful observations.

The Data

All of our records come from the Digital Coast DAV system. The geospatial data is primarily lidar, imagery, raster DEMs, and land cover, with popularity in approximately that order. For data already in a raster format, users can mainly choose the projection and the file format. The lidar data is stored as points from which products can be derived, so it comes with many more choices. That means we can also look at the cell sizes people request for raster products, or the contour intervals. The stats shown are for requests made between Jan 1, 2018 and July 24, 2019.

By the way, there are already some publicly accessible stats for the DAV system that cover things like dataset popularity and internet domains. I won’t be looking at those as you’re free to do that yourself.

Projections

The DAV system happens to use a different default projection for each of the data types. You’ll see the influence of that default in the stats. Either the examination of people’s choices is biased by having a default, or we’re really good at picking a default for people. I’m inclined to think it’s a bias. For people choosing elevation data, the projection selections look like this:

Projection for elevation	Times Chosen
Albers	79
State Plane 27	102
Geographic	1781
UTM	6609
State Plane 83	48744

The projections chosen for imagery look like this:

Projection for imagery	Times Chosen
Albers	16
State Plane 27	18
Geographic	224
State Plane 83	3811
UTM	4507

The main difference is the switch between State Plane 83 and UTM, although for imagery the choice between the two is much closer. As you might guess, our default projection for each case is the one that was selected most often. As you can see, Albers doesn’t get a lot of interest. It is even lower than State Plane 27, which we hope nobody has to use any longer. However, it is our default for land cover data and we can see the impact in those statistics.

Projection for land cover	Times Chosen
State Plane 27	16
Geographic	107
UTM	217
State Plane 83	405
Albers	1001

Albers is the most popular for land cover. Given that State Plane 83 is still popular even when it isn’t the default and wildly popular when it is, it may be that we should use State Plane 83 for all types.
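To put numbers on how strongly the default pulls, here is a minimal sketch that turns the three projection tables into shares. The counts are copied from the tables above; the dictionary layout and function names are just my own illustration, not anything from the DAV system.

```python
# Request counts copied from the three projection tables above.
PROJECTION_COUNTS = {
    "elevation": {
        "Albers": 79, "State Plane 27": 102, "Geographic": 1781,
        "UTM": 6609, "State Plane 83": 48744,
    },
    "imagery": {
        "Albers": 16, "State Plane 27": 18, "Geographic": 224,
        "State Plane 83": 3811, "UTM": 4507,
    },
    "land cover": {
        "State Plane 27": 16, "Geographic": 107, "UTM": 217,
        "State Plane 83": 405, "Albers": 1001,
    },
}


def projection_shares(counts):
    """Return each projection's fraction of the requests for one data type."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def top_projection(counts):
    """Return the most requested projection (the default, per the text above)."""
    return max(counts, key=counts.get)


for data_type, counts in PROJECTION_COUNTS.items():
    shares = projection_shares(counts)
    top = top_projection(counts)
    print(f"{data_type}: top = {top} ({shares[top]:.1%}), "
          f"State Plane 83 = {shares['State Plane 83']:.1%}")
```

With these counts, State Plane 83 takes roughly 85% of elevation requests where it is the default, and still about 44% of imagery and 23% of land cover requests where it is not.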

Points, Rasters, or Contours

For the elevation data that starts as a lidar point cloud, the user can choose if they want the points or a derived product, such as a raster or contours. Even though we’re starting from points, the default choice is for a raster, so we expect that to be the most popular. We’ll also break out the file format they choose for each type. Here’s what the stats say.

Product and File Format	Times Chosen
Raster – ESRI binary grid	345
Raster – Imagine	353
Raster – ESRI ASCII grid	1091
Points – LAZ	1482
Contour – DXF	4269
Points – ASCII X,Y,Z	5475
Contour – Shapefile	10963
Points – LAS	12088
Raster – GeoTiff 32-bit	21163

I find some interesting points in the selections here, but again defaults play a large role. For each type of output, the default file format is the most chosen. For the points, the choice of LAS over LAZ is probably a poor one just from the viewpoint of download times. It would be faster to download the data in LAZ format, and uncompress it to LAS with laszip locally, than to download LAS. There are also a surprising number of people requesting ASCII point data and I suspect these are engineers trying to find a way to get the data into CAD programs.

If you group those format choices by type of product (i.e., raster, points, or contours), the default choice, raster, has the most requests at 22,952, but points (19,045) and contours (15,232) are also requested a lot.
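The grouping by product type can be reproduced from the table above with a few lines of Python. This is only a sketch: the counts come from the table, while the tuple keys and the function name are my own choices.

```python
from collections import defaultdict

# (product, file format) -> times chosen, copied from the table above.
FORMAT_COUNTS = {
    ("Raster", "ESRI binary grid"): 345,
    ("Raster", "Imagine"): 353,
    ("Raster", "ESRI ASCII grid"): 1091,
    ("Points", "LAZ"): 1482,
    ("Contour", "DXF"): 4269,
    ("Points", "ASCII X,Y,Z"): 5475,
    ("Contour", "Shapefile"): 10963,
    ("Points", "LAS"): 12088,
    ("Raster", "GeoTiff 32-bit"): 21163,
}


def totals_by_product(format_counts):
    """Sum the request counts over file formats for each product type."""
    totals = defaultdict(int)
    for (product, _file_format), count in format_counts.items():
        totals[product] += count
    return dict(totals)


totals = totals_by_product(FORMAT_COUNTS)
grand_total = sum(totals.values())
for product, count in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{product}: {count} ({count / grand_total:.1%})")
```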

Datums

The DAV system lets you change the horizontal and vertical datum of the data. The default datums, if possible, are NAD83 horizontally and NAVD88 (or island equivalent) vertically. Is the effort to support changing datums worthwhile? Here are the numbers:

Horizontal Datum	Request Count
NAD 27	214
WGS84/ITRF	2392
NAD 83	80083

For the vertical, there are more options, and some vertical datums are only offered if you pick the appropriate horizontal datum. For instance, you can’t pick the EGM2008 geoid model unless you picked WGS84 horizontally. The tidal datums (MLLW and MSL) are a bit misleading because only data sets already in those datums have that choice and you can’t change it, so nobody really picked them.

Vertical Datum (lidar point clouds only)	Request Count
Mean Lower Low Water	52
WGS84/ITRF Ellipsoid heights	339
NAD83 Ellipsoid heights	399
Mean Sea Level	496
NGVD29	904
WGS84 with EGM2008	909
NAVD88 (NGS GEOID)	54110

We clearly see that the default of NAVD88 dominates. If someone were to choose the advanced options, they could pick which NGS GEOID model is going to be applied. If they don’t pick the advanced options, they’ll get GEOID12B. Let’s look at that:

Geoid Model Name	Request Count
Geoid 96	23
Geoid 09	32
Geoid 03	35
Geoid 99	52
Geoid 12A	140
Geoid 12B	51592

As expected, the default is the dominant choice. However, I’m a little surprised there are people who want some of those older models and worked hard enough to find where to pick them. The choice of Geoid 12A is interesting as the only difference between Geoid 12B (the default) and Geoid 12A is that 12A had mistakes in a couple of locations.
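One rough way to approach the "is it worthwhile?" question is to count how many requests moved off the default in each of the three datum tables. The sketch below does that with the counts above. The function name and the `exclude` parameter are my own, and dropping the tidal datums from the vertical table is an assumption based on the note that nobody really picked them.

```python
HORIZONTAL = {"NAD 27": 214, "WGS84/ITRF": 2392, "NAD 83": 80083}

VERTICAL = {
    "Mean Lower Low Water": 52,
    "WGS84/ITRF Ellipsoid heights": 339,
    "NAD83 Ellipsoid heights": 399,
    "Mean Sea Level": 496,
    "NGVD29": 904,
    "WGS84 with EGM2008": 909,
    "NAVD88 (NGS GEOID)": 54110,
}

GEOID = {
    "Geoid 96": 23, "Geoid 09": 32, "Geoid 03": 35,
    "Geoid 99": 52, "Geoid 12A": 140, "Geoid 12B": 51592,
}


def non_default(counts, default, exclude=()):
    """Return (count, share) of requests that used something other than the default.

    Entries named in `exclude` are left out of both the count and the total.
    """
    considered = {k: v for k, v in counts.items() if k not in exclude}
    total = sum(considered.values())
    other = total - considered[default]
    return other, other / total


# Tidal datums are excluded because users could not actually choose them.
TIDAL = ("Mean Lower Low Water", "Mean Sea Level")

for label, result in [
    ("horizontal", non_default(HORIZONTAL, "NAD 83")),
    ("vertical", non_default(VERTICAL, "NAVD88 (NGS GEOID)", exclude=TIDAL)),
    ("geoid model", non_default(GEOID, "Geoid 12B")),
]:
    count, share = result
    print(f"{label}: {count} non-default requests ({share:.2%})")
```

By this measure, roughly 3% of requests changed the horizontal datum, about 4.5% changed the vertical datum, and around 0.5% picked a geoid model other than Geoid 12B.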

One of the reasons I wanted to look at what people choose is the upcoming new reference frames for the USA. We’ll clearly need to add those options, but perhaps it’s time to cull some of the others. We may also end up rebuilding the app to operate as a cloud-native service, and reducing the complexity would save money. I’d be happy to hear opinions about the options you’d like to see or what other stats you’d find interesting.

Digital Coast GeoZone

Tech talk for the Digital Coast
