Creating county distribution maps

Update, 11 April 2020: Six years after writing the material below, I came up with a much better way of doing this and it took me a lot less time to get there.

Fun with QGIS! I figured out how to make a set of ca. 4000 county-level distribution maps for plants in New Mexico. These will eventually be part of Ken Heil & Steve O’Kane’s New Mexico flora project. Given that this is not a straightforward task, I thought I’d put a description of the process (including some wrong turns–I might leave those out next time) up here for my future reference. Maybe someone else will find it useful as well. Probably not, but who knows? So, Ken sent me a spreadsheet he’s put together that includes a list of species in the state with county distribution entered as a list in one of the columns. He has the data in a FileMaker database. I don’t really know what it looks like in there, but this is what it looked like when it got to me:

So, the first step – I had to figure out how to get this data into a form in which I can get it into QGIS. A presence / absence matrix seemed like a good idea, so I turned the cells with lists of abbreviated counties into a set of columns (via exporting as CSV, deleting all the field-delimiting quotation marks, and loading back into Excel):

Then I used all those columns to populate a matrix, via the following formula: “=IF(ISERROR(MATCH("BER",$H3:$AP3,FALSE)),"0",$C3)”. “BER” is the county (Bernalillo) for this column of the matrix, “$H3:$AP3” is the range of cells listing counties for the plant on this row, and “$C3” is a cell containing the name of the species. So, if “BER” is in the county list, it returns the name of the species (I’m using the name as a marker for presence for reasons that will become clear later on–I used “1” initially, but that doesn’t end up working); if not, it returns “0”. I end up with something like this:
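For anyone who would rather script this step than do it in Excel, here is a minimal Python sketch of the same presence/absence logic. It is not what I actually ran; the species names ("Species A", "Species B"), the abbreviations other than BER, and the function name are all made up for illustration.

```python
# Hypothetical sample data: species name -> list of county abbreviations.
# Only "BER" comes from the real spreadsheet; the rest is invented.
records = {
    "Species A": ["BER", "SAN"],
    "Species B": ["DON", "BER"],
}
counties = ["BER", "DON", "SAN"]  # one matrix column per county


def presence_matrix(records, counties):
    """Return one row per species.

    Each cell holds the species name if the county is in that species'
    list, and the string "0" otherwise (same convention as the formula).
    """
    matrix = {}
    for species, present_in in records.items():
        present = set(present_in)
        matrix[species] = [species if c in present else "0" for c in counties]
    return matrix


matrix = presence_matrix(records, counties)
for species, row in matrix.items():
    print(species, row)
```

The membership test `c in present` plays the role of MATCH wrapped in ISERROR: a hit returns the species name, a miss returns "0".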

After poking around online for a while, I figured out that if I transposed this matrix (so it has a column of counties on the left and a row of species at the top) I could send it to a CSV, load it in QGIS, and join it to a vector layer I have for county boundaries of New Mexico. All the plants ended up in a long list of attributes for the county layer. Then through conditional layer styling, the counties can be set to grey if the species is present and white if it is not, like this:

That worked, but going through every species in the layer dialogue would take an extremely long time. There are a few different options out there for automating map generation in QGIS, but I couldn’t find anything that would work for my purposes. I wanted to be able to open up the Print Composer, get things set up, and tell it, “Take this layer here, and cycle through all its attributes, generating an image for each one.” The Atlas Generator in QGIS can iterate map generation across features (rows) of a layer, but not across attributes (columns). That’s as close as I could get.

Having hit a brick wall, I went to StackExchange and user underdark very helpfully pointed me in the right direction. Instead of trying to figure out how to iterate across attributes, it’s easier to work with the existing functionality and get each plant species’ distribution into QGIS as a feature in a vector layer. It would not have occurred to me that you could do this, never mind how.

First, I needed to transform my data so that it has only two columns: the first is a county identifier (for which I used a 4-digit ID already present in my county vector layer) and the second is the name of a species. Each presence/absence cell in my matrix becomes a row. The second column can be generated from my matrix using the following formula in Excel: “=OFFSET($AQ$3,MOD(ROW()-ROW($BX$3),ROWS($AQ$3:$AQ$4400)),TRUNC((ROW()-ROW($BX$3))/ROWS($AQ$3:$AQ$4400)),1,1)”. To be perfectly honest, I don’t understand exactly how that formula works–I modified it from a template in one of the online Excel help fora (I’d put a link here but I forgot to bookmark it)–but it takes each column from my matrix and stacks them into one huge column. The first column I generated by hand; it’s just 34 big chunks of a 4-digit code, so that’s not too painful. I end up with this:
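The stacking step is easier to follow outside Excel. As far as I can tell, the formula counts how many rows below the start of the output you are, then uses the remainder (MOD) to pick a row of the matrix and the whole-number quotient (TRUNC) to pick a column. Here is a rough Python sketch of that arithmetic, assuming a matrix with one row per species and one column per county; the data, the county IDs, and the function names are invented for illustration.

```python
def stack_columns(matrix_rows):
    """Stack the columns of a rectangular matrix into one long list,
    first column on top, then the second column, and so on."""
    n_rows = len(matrix_rows)
    n_cols = len(matrix_rows[0])
    stacked = []
    for k in range(n_rows * n_cols):  # k = offset from the first output row
        row = k % n_rows   # MOD(offset, number of matrix rows)
        col = k // n_rows  # TRUNC(offset / number of matrix rows)
        stacked.append(matrix_rows[row][col])
    return stacked


def long_table(matrix_rows, county_ids):
    """Pair each stacked cell with its county ID and drop the "0" cells."""
    n_rows = len(matrix_rows)
    pairs = []
    for k, value in enumerate(stack_columns(matrix_rows)):
        if value != "0":
            pairs.append((county_ids[k // n_rows], value))
    return pairs


# Hypothetical 3-species x 2-county matrix and made-up 4-digit county IDs.
matrix_rows = [
    ["A", "0"],
    ["0", "B"],
    ["C", "C"],
]
county_ids = ["0001", "0002"]

stacked = stack_columns(matrix_rows)
pairs = long_table(matrix_rows, county_ids)
print(stacked)
print(pairs)
```

The `long_table` helper also covers the part I did by hand (the big chunks of county codes) and the later cleanup of the "0" rows, since each chunk of `n_rows` stacked cells belongs to one county.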

I deleted all the “0” rows (QGIS doesn’t need to know where plants aren’t and the whole shebang was 149,000 rows–it went down to 36,000 after removing the null rows), exported those two columns (BY & BZ) into CSV, and off I went to QGIS to try to join it to my county vector layer. There’s a slight problem here–these two tables have a one-to-many relationship. Each county has one entry in the county layer, but appears hundreds of times in my county/species CSV (once for each plant recorded from the county). When I naïvely joined them in QGIS, only one of the rows from my CSV was matched to each county, and the rest disappeared. Unhelpful. Fortunately, I was not the first person to encounter this problem and I found a helpful tutorial explaining the process. The short version is that I had to export the county vector layer to CSV with the shape information formatted as well-known text (via “GEOMETRY=AS_WKT” in the layer field) and then use my county/species CSV as the “parent” to which the new CSV-ified county layer is joined. Apparently, QGIS is perfectly happy joining tables with a many-to-one relationship; it just doesn’t do one-to-many. Also, because all the shape information is just a column in the new joined layer, you can just save the whole thing and Bob’s your uncle–no futzing around trying to get the half-dozen separate files in the ESRI shapefile format to play nicely with each other.
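To see why flipping the direction of the join fixes things, here is a small Python sketch of the many-to-one version, using only the standard library. The column names (`WKT`, `county_id`, `species`), the IDs, and the toy square polygons are assumptions for illustration, not the contents of my real files.

```python
import csv
import io

# Stand-in for the county layer exported to CSV with geometry as WKT.
county_csv = io.StringIO(
    "WKT,county_id\n"
    '"POLYGON ((0 0,1 0,1 1,0 1,0 0))",0001\n'
    '"POLYGON ((1 0,2 0,2 1,1 1,1 0))",0002\n'
)

# Stand-in for the county/species CSV: each county appears many times.
occurrence_csv = io.StringIO(
    "county_id,species\n"
    "0001,Species A\n"
    "0001,Species B\n"
    "0002,Species A\n"
)

# One geometry per county, looked up by ID.
geometry_by_county = {
    row["county_id"]: row["WKT"] for row in csv.DictReader(county_csv)
}

# The occurrence table is the "parent": every one of its rows survives,
# and each row picks up the single matching county geometry.
joined = [
    {**row, "WKT": geometry_by_county[row["county_id"]]}
    for row in csv.DictReader(occurrence_csv)
]

for row in joined:
    print(row["county_id"], row["species"], row["WKT"][:20])
```

Going the other way (county as parent) would leave room for only one species per county row, which matches what happened with my first, naïve join.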

If you were wondering what these files look like, here you go. You may notice that I’m using LibreOffice for dealing with CSV files. For some reason the CSV files created by Excel are not readable by QGIS. First, the county/species CSV:

The CSV-ified county layer:

The joined layer:

At the end of all that, I had a file that had one row for each species by county occurrence, and that row included the shape of the county in question. Progress! Next, I dissolved the layer on the “species” field, which results in the creation of a new vector layer (as a shapefile, no more CSV) in which all the various counties for each species have been conglomerated (no, I don’t know why this is called “dissolve”, since it does essentially the opposite) into a single feature. The last challenge was getting the Atlas Generator to create a separate image for each of these features. First, I set a rule-based layer style: “$id = $atlasfeatureid” (meaning “apply the style to Atlas’s current feature”) so that the current feature will be medium grey and everything else will have no fill:
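For what it’s worth, the bookkeeping half of the dissolve step is just a group-by on the species field. The sketch below shows only that grouping on made-up rows; it does not merge any polygons (QGIS does the actual geometry union), and the field names and the `dissolve_by` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical joined rows: one per species-by-county occurrence.
joined_rows = [
    {"county_id": "0001", "species": "Species A"},
    {"county_id": "0001", "species": "Species B"},
    {"county_id": "0002", "species": "Species A"},
]


def dissolve_by(rows, field):
    """Collapse rows sharing the same value of `field` into one entry,
    keeping the list of county IDs that were merged together."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row["county_id"])
    return dict(groups)


features = dissolve_by(joined_rows, "species")
for species, county_list in features.items():
    print(species, county_list)
```

Each key ends up as one feature, which is the shape of data the Atlas Generator can iterate over: one feature per species, one image per feature.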

I had been using the current KyngChaos build of QGIS, but the version of Atlas it uses has some annoying features–most notably, it changes the map extent to center each feature. Probably you can turn this off somehow so that the whole state remains nicely centered in each image, but if so I don’t know how. Second, I couldn’t get rule-based layer styling to work properly under the KyngChaos build (probably user error, but nonetheless…). So I tried the current nightly build from Dakota Cartography and it all worked nicely:

Now I have a huge pile of maps, each showing county distribution for one of the plants of New Mexico, and helpfully with file names that are just the name of the species. They look like this one, for Thelypodiopsis vaseyi:

The whole thing seems to work, and although it’s not entirely automated I think it’ll only take me an hour or so next time to go from having a new data file in Ken’s format (or a new version of my current spreadsheet; I’ve started looking through the data and making some corrections) to having QGIS plug away creating images. I’m sure it could be made more efficient if I were storing my data in a PostgreSQL database and all that, but this is close enough!

Organ Mountains-Desert Peaks Conservation Act

Yeah, I know, I never post anything here. Well, in any case…

I have mixed feelings about the Organ Mountains-Desert Peaks Conservation Act, but on the whole I think it is a bad idea. This act would create a new National Monument in the Organ Mountains and several nearby mountain ranges in Doña Ana County. It is being presented (as indicated by the name) as a conservation measure. There seems to be a common, but generally unexamined, belief that designation of wilderness, national monuments, and various other sorts of “special” public land is inherently a good thing for conservation. However, we should ask ourselves: 1) What are the threats to this land? 2) How will those threats be reduced or eliminated by this designation? 3) What other effects will this designation have–will it create new threats to the landscape, or reduce public access to it?

1) This is answered in one word: grazing. In short, we know that grazing has had severe negative impacts on the local landscape and we do not know if any level of grazing exists that will not simply continue those negative impacts. Other threats to public lands in the area include off-road vehicles (so far as I know, these are already disallowed within the proposed monument), mining (although there are no active mining claims in the proposed monument so far as I know), herbicide use (undertaken by the BLM in an attempt to “restore” historic grasslands–not currently occurring within the proposed monument so far as I am aware, but the program is continually expanding), residential development (through occasional sale of BLM lands–again, not currently occurring within the proposed monument so far as I am aware), and hunting (and, more than hunting itself, the various land management practices that federal agencies engage in to promote hunting–e.g., wildlife watering stations, vegetation management intended to increase forage available for game species, attempts to reduce or eliminate predators).

2) The act specifically states that existing grazing will be maintained. So the biggest threat to the landscape is intended to continue. For the remaining, lesser threats: OHV use, although already disallowed, could potentially be reduced by better enforcement; I don’t know how either mining or damaging “restoration” efforts would be affected, if at all; sale of BLM land for residential development would presumably not occur within a national monument, but the areas of the proposed monument are mostly, if not entirely, places where such development is impracticable or already exceedingly unlikely to happen; hunting would continue, although it is not clear if the various adverse impacts from game management would be increased, decreased, or remain unchanged. So, in short, the major threat would be unaffected and for the others there is, at least, not much reason to expect designation as a monument to have any significant impact. There might be gains, there might not. One would hope that, if a purported conservation measure will not address the major threat to the landscape, at least such lesser threats would be clearly addressed and measures to reduce them required in the act, but this does not seem to be the case.

3) After conservation, the second main selling point of the act seems to be that it would be good for business. It would bring more visitors and more money into the area. That, unquestionably, means an increase of threats to the landscape. More development, more people, more adverse human impact. The act does not discuss public access in much detail. Existing roads will continue to be accessible, but will there be more fee stations? More gates that are locked for much of the time? National Monuments also typically have much stronger restrictions on recreational activity–limits on camping outside of established (and generally expensive, crowded, and noisy) campgrounds, limits on off-trail hiking, and, of particular interest to academic botanists like myself, limits on collection of plant specimens or other research & educational activity. To what extent will these exist in the proposed monument? I haven’t a clue. It’s possible, I suppose, that designation as a national monument would not involve any such increase in restrictions on public access, but comparison to other established national monuments (e.g., White Sands N.M.) makes this seem extremely unlikely.

So, the take-home message seems to be: Designation as a national monument will not have any significant conservation value (it will not address the present primary threat, does not appear to be likely to have a substantial impact on lesser threats, and will create new threats to the landscape) and will probably increase the restrictions on and/or commodification of public access. At best, it’s a wash. At worst, it allows the existing damaging practices to continue unimpeded while creating more tourist traffic and more development and reducing public access.

Getting plant sex wrong (3)

Continuing my tendency to be irritated by descriptions of botany in the popular literature, I’m now reading The Forest Unseen: A Year’s Watch in Nature. I’m not too impressed with the book in general, but here’s a bit that’s particularly irritating (end of the chapter “March 25th – Spring Ephemerals”):

This intricate web of dependency dates back one hundred and twenty-five million years to when the first flowers evolved. The oldest fossil flower, called Archaefructus, had no petals, but its pollen-bearing anthers had flags on their tips. The botanists who described the fossil believe that these extensions may have been used to attract pollinators. Other ancient flowers also appear to have been insect-pollinated, further supporting the idea that insects and flowers have been partners since the first flowers evolved. How this marriage came about is unknown, but it seems likely that flowering plants evolved from fernlike plants. These ancestors produced spores that attracted insects looking for an easy meal. The ancestors of the flowers turned the plague of insect predators into a blessing by producing conspicuous displays to attract these spore munchers, then producing so many spores that the insects’ bodies would be coated. The predators inadvertently carried some of this sporey dust onto the next flower, increasing the fecundity of the spore producer. Eventually the spores got wrapped in a package, the pollen grain, and the true flower was born. The bees and spring beauties in the mandala reenact the main theme of the original relationship. The bees, or their larvae, eat most of the pollen they gather, transferring only a small number of pollen grains from flower to flower.

To say “it seems likely that flowering plants evolved from fernlike plants” is somewhat misleading and at least unhelpful. Flowering plants are not particularly closely related to ferns; various of the ancestors of flowering plants back around their common ancestor with the gymnosperms might have looked vaguely ferny, but not in any way that is relevant to the evolution of pollination. But that’s not really too big a deal, it’s just a minor annoyance. The big irritation is here: “Eventually the spores got wrapped in a package, the pollen grain, and the true flower was born.” First – a pollen grain is not a package of spores. Each pollen grain begins as a single spore. Then one or more (the details depending somewhat on which lineage we are talking about) cell divisions take place inside the spore wall, and you have a group of cells inside the original spore wall, which persists more or less unchanged. If we’re doing academic botany, we call that an endosporic microgametophyte; in popular writing there are any number of less technical ways to say “several cells inside a spore” but “a package of spores” is just wrong. Second – pollen predates flowering plants, so identifying the origin of pollen with the origin of flowers is also wrong. All the gymnosperms, which do not produce flowers, do produce pollen. A few of them (Ephedra, for instance) even produce structures that look an awful lot like the stamens of flowering plants. The defining morphological feature of flowers, as compared to the cones of gymnosperms, is the carpel.

The decline of field botany

An article worth reading:

Profiling prolific plant hunters provides insight as to strategy for collecting undiscovered plant species.

The gist is: the current situation is dire.

“Plant collecting is a specific part of the three-step process of plant species discovery (collection, recognition and publication), and as the numbers of professional taxonomists who classify plants decline, there has been a massive increase in the utilization of non-professionals to aid in this work. This study suggests that as science pushes for more rapid documentation of the world’s flora, policy makers and funders must examine how best to develop the experience and skills of selected individuals to catalog undiscovered plants more efficiently.

“One way for institutions to encourage the development of these skills is in performance evaluations, rewarding effective field work on an equal footing with number of papers published and grants obtained,” notes Davidse.”

In other words, there’s no money to do field botany, institutions aren’t encouraging it, we aren’t training new field botanists, and we aren’t hiring them. And that’s why we need to do what we can quickly and on a shoe-string budget.

Pointless trivia…

A job application (for a botanical position with the state of Missouri) had a field for typing speed. Since I don’t know how quickly I type, I figured I’d take several of the various online tests. Over four of them I averaged about 85 words per minute, which I guess is respectable.