Credit: Arnold Janz

The down-low on downloading NatureLynx data

After a busy winter, the ABMI’s NatureLynx team is gearing up for an even busier spring and summer as we roll out new functions and welcome new members to the growing NatureLynx community. Community is at the heart of NatureLynx, but today we’re talking about something a little more technical: the data that our community has helped to collect. What is it, how does it work, and how can you use it?

Download a deluge of detailed data directly!

You can easily download NatureLynx data using the “Download CSV” button in the web interface.

When users contribute to the NatureLynx community by recording their plant and animal sightings, they’re helping to build a free, public, crowd-sourced biodiversity data set and directly contributing to the understanding of Alberta’s biodiversity. NatureLynx data can easily be downloaded by selecting the “Download CSV” button located on the Newsfeed, Mission, and Group pages on the NatureLynx website. Clicking this button downloads a CSV spreadsheet file containing all sightings currently displayed in the user’s Newsfeed, based on the user’s settings. If desired, the data can be filtered by location, time period, species, group, mission, and more! Over time, we hope the NatureLynx data set will grow into a valuable resource for its users and for anyone else seeking broad-scale biodiversity data for Alberta.
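Once you have a CSV file, a few lines of Python are enough for a first look at it. The sketch below counts sightings per species using only the standard library. The column names (`species`, `latitude`, `longitude`, `date`) and the sample rows are made up for illustration; check the header row of your own download and adjust accordingly.

```python
import csv
import io
from collections import Counter


def count_sightings(csv_text, column="species"):
    """Count rows per distinct value of `column` in CSV text.

    `column` is a hypothetical header name; replace it with the
    actual header used in your downloaded file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader if row.get(column))


# Made-up rows standing in for a downloaded file (not real NatureLynx data).
sample = """species,latitude,longitude,date
Moose,53.51,-113.52,2019-04-02
Black-capped Chickadee,53.55,-113.49,2019-04-03
Moose,51.07,-114.12,2019-04-05
"""

counts = count_sightings(sample)
for name, n in counts.most_common():
    print(name, n)
```

To run this against a real download, read the file's contents with `open(path, newline="", encoding="utf-8").read()` and pass the result to `count_sightings`.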

A haze of biodiversity data

NatureLynx users upload lots of photos, and some of these may contain sensitive information. For example, houses or other personally identifying features might be visible in the background. Alternatively, sightings of rare or threatened species might attract more human traffic to the area, putting the species at further risk. To protect both the confidentiality of our users and sensitive species, the data publicly available on NatureLynx is ‘hazed’. Essentially, this means that the point you see on the screen and in the public data isn’t exactly where the original record was made. Specifically, the publicly available point is randomly placed within a 24 km² area that contains the actual sighting location. This hazed data is sufficient for addressing many research questions, especially those related to broader-scale patterns of species distribution and abundance.

An example of data-hazing in public NatureLynx data, designed to protect both species and user confidentiality.

Hazing 101: Geohashing

NatureLynx uses geohashes to haze its public data. A geohash is a convenient way of expressing a location using a short alphanumeric (letters and numbers) string. The longer the string, the more precise the location.

A geohash refers to an area generated by dividing the globe into rectangles. Each rectangle is then divided into 32 cells, each of these cells is further divided into 32 smaller cells, and so on. This is repeated until each rectangle is only a few centimetres across. Each time a rectangle is divided, a new character is assigned and added to the alphanumeric string, creating the geohash. For example, in the image below, we see the geohash code for one of the largest rectangles highlighted as “g”. Within this rectangle, 32 rectangular cells are created. By selecting one of these cells, we add a second character to the geohash, increasing the precision of the area (e.g., “gk”). This pattern continues, adding more characters to the geohash to increase the precision of the coordinate.

Geohashing in action. (Source)

At its largest, a geohash of one character can represent an area of about 25,000,000 km². At the opposite extreme, a geohash consisting of 12 characters represents a rectangle of only about 4 cm by 2 cm.
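For readers who like to see the mechanics, here is a minimal sketch of the standard geohash scheme in Python. It is an illustration, not the NatureLynx implementation. Each character encodes five halving steps that alternate between longitude and latitude, using the usual 32-character geohash alphabet (digits plus lowercase letters, without a, i, l, and o). The function names and the kilometres-per-degree constant are our own choices, and the cell sizes are approximated at the equator.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet
KM_PER_DEGREE = 111.32  # approximate length of one degree at the equator


def encode(lat, lon, length=12):
    """Encode a latitude/longitude pair as a geohash of `length` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    bits = 0         # how many bits collected so far (0..5)
    use_lon = True   # halving steps alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value, lon_lo = value * 2 + 1, mid
            else:
                value, lon_hi = value * 2, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value, lat_lo = value * 2 + 1, mid
            else:
                value, lat_hi = value * 2, mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


def cell_size_km(length):
    """Approximate (width, height) in km of a geohash cell at the equator."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit
    lat_bits = total_bits // 2
    width = 360.0 / 2 ** lon_bits * KM_PER_DEGREE
    height = 180.0 / 2 ** lat_bits * KM_PER_DEGREE
    return width, height


for n in (1, 5, 12):
    w, h = cell_size_km(n)
    print(f"{n:2d} characters: {w:.6g} km x {h:.6g} km")

# An illustrative point in Edmonton, encoded to five characters.
print(encode(53.5461, -113.4938, 5))
```

Because lines of longitude converge toward the poles, cells are narrower on the ground at Alberta’s latitude than these equatorial figures suggest.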

Using the geohash system on NatureLynx

When you submit your sightings to NatureLynx using your mobile phone, we ask that you keep your location settings set to “on”. This allows your phone to automatically tag your images with a GPS coordinate (i.e., latitude and longitude). When we receive these coordinates, we use them to determine the geohash for your sighting. We truncate the geohash to five characters, which gives us an area of roughly 24 km². A random location is then selected somewhere within this area. The latitude and longitude of this random location are what we share in our public data.
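The hazing step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NatureLynx code. It finds the bounds of the five-character geohash cell by applying the 25 halving steps directly (five per character), then draws a point from within those bounds. The function names, the example coordinate, and the choice of a uniform random draw are all our own assumptions.

```python
import random


def geohash_cell(lat, lon, length=5):
    """Return (lat_lo, lat_hi, lon_lo, lon_hi) for the geohash cell of
    `length` characters that contains the given point."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    for step in range(5 * length):  # five halving steps per character
        if step % 2 == 0:           # even steps halve the longitude range
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                lon_lo = mid
            else:
                lon_hi = mid
        else:                       # odd steps halve the latitude range
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                lat_lo = mid
            else:
                lat_hi = mid
    return lat_lo, lat_hi, lon_lo, lon_hi


def haze(lat, lon, length=5, rng=random):
    """Return a random point inside the geohash cell containing (lat, lon).

    Assumption: the point is drawn uniformly in latitude and longitude.
    """
    lat_lo, lat_hi, lon_lo, lon_hi = geohash_cell(lat, lon, length)
    return rng.uniform(lat_lo, lat_hi), rng.uniform(lon_lo, lon_hi)


# Illustrative coordinate in Edmonton (not a real sighting).
true_lat, true_lon = 53.5461, -113.4938
public_lat, public_lon = haze(true_lat, true_lon)
print(f"public point: {public_lat:.5f}, {public_lon:.5f}")
```

The hazed point always falls in the same five-character cell as the true location, so anyone using the public data knows the sighting occurred somewhere within that cell but not where.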

If your location settings are set to “off”, or the GPS coordinates for your sighting are unavailable, you can “drop a pin” at your current location through the app, or manually specify lat-long GPS coordinates through the web interface. When you manually select a location using one of these methods, the publicly visible location will still be hazed as outlined above.

Using NatureLynx data and submitting a request for exact location data

We think that our approach works well for addressing many research questions, and that it strikes a reasonable compromise between making data available and protecting both user confidentiality and sensitive species. But we also understand that hazing our public data may create limitations for some users.

We recognize the public nature of NatureLynx data, and we believe that everyone should have access to the data they need to answer their research questions. With that in mind, if you have questions about using our public data or would like to receive the exact coordinates for research purposes, we’re happy to address these requests on a case-by-case basis. Please email your requests to natlynx@ualberta.ca. We hope this approach strikes a reasonable balance that protects both the privacy of our users and sensitive species while ensuring that NatureLynx data serve as a valuable, community-driven resource for understanding Alberta’s biodiversity.

With files from NatureLynx Coordinator Jordan Bell.