Denver, the Mile High City in Colorado, USA, will be the stage for two of the most prominent conferences in the domain of open geospatial software and data: FOSS4G and State of the Map (SOTM). I will be speaking at both conferences. The community is gearing up for these important and fun events, preparing their talks and discussing the hot topics of the conferences.
For me, the direction in which OpenStreetMap will be heading over the next few years is one of those main topics. I will approach it from the angle of contributor motivation. Specifically, I will address the extremely high churn rate that OpenStreetMap is coping with: less than one tenth of everyone who ever created an OpenStreetMap account goes on to become an active contributor. I will investigate which tactics from the gaming domain OpenStreetMap could adopt — tactics that have made Foursquare and StackOverflow so successful.
OpenStreetMap needs those flesh and blood contributors, because it is ‘Warm Geography’ at its core: real people mapping what is important to them — as opposed to the ‘Cold Geography’ of the thematic geodata churned out by the national mapping agencies and commercial street data providers; data that is governed by volumes of specifications and elaborate QA rules.
Don’t get me wrong — I am not denouncing these authoritative geodata sources. They have their mandate, and an increasing amount of authoritative, high quality geodata is now freely (beer and speech) available to the public — and I like to think that OpenStreetMap’s success played some part in this.
However, OpenStreetMap occupies its very own special niche in the domain of geodata with its Warm Geography — real people contributing their local knowledge of the world around them.
Will OpenStreetMap retain that unique position?
OpenStreetMap has grown very rapidly over the last three years, both in number of contributors and in data volume. For some countries and regions, OpenStreetMap is considered more complete and/or of higher quality than any other available data source. Whatever the truth of such claims, the fact of the matter is that the relation between OpenStreetMap and authoritative or commercial geodata sources is being reconsidered.
On the one hand, VGI (Volunteered Geographic Information, the widely accepted misnomer that covers a wide range of geospatial information resources consisting of contributions by non-professionals) techniques are being introduced into the data collection processes of traditional producers. The degree of voluntarism varies — on one end of the spectrum, we see PND giant TomTom using data collected from its users to trigger better targeted road data updates; on the other end, there are the VGI experiments conducted by the USGS to help improve the National Map. In either case, we see the intrusion of crowdsourcing techniques into the traditionally closed domains of authoritative and commercial geospatial information.
On the other hand, we see authoritative and other Cold data being integrated into OpenStreetMap, by way of imports on various scales — some local, others covering entire nations.
I have a strong ambivalence towards data imports into OpenStreetMap. I have seen how they can spark and nurture the OpenStreetMap community in the affected regions — but I can also envisage how detrimental they could be to OpenStreetMap as a whole. This touches on the very nature of OpenStreetMap, and I reiterate: OpenStreetMap is Warm Geography — real people contributing their local knowledge of the world around them. Will OpenStreetMap claim that particular space in the geospatial information landscape? Does it even want to? Or will it manifest itself as an open data repository in which Cold and Warm Geography are mixed together, resulting in something Lukewarm?
One could argue that the large scale imports OpenStreetMap has already seen have probably helped spark the local efforts to improve the data and the community. This may be true — although it has never really been researched, and one could easily make a counterargument with Germany as a case in point — but I believe we need to look at the relation between imports / authoritative data and the community, and OpenStreetMap should do so before imports are done, or even considered, on an individual basis. Let me give two examples as food for thought to wrap this up.
The TIGER/Line data import in the US is one I deem useful and productive. It’s a low-quality data source in terms of geometry, but pretty good (as far as I can tell) in terms of metadata. Also, it represents features that are easily recognizable in the field and that map to our real-world experience with ease: the streets that we drive, walk, bike, address our mail to, and enter into our satnav devices. Importing that data provided initial OpenStreetMap content with two key properties that are essential for an import to ‘work’ — in the sense that it serves the interests of the data provider, the OpenStreetMap community, and OpenStreetMap as an entity.
First, it provides initial map content that serves to make OpenStreetMap less scary for both aspiring volunteer mappers and potential professional users. An empty map is not attractive for casual mappers — a category that will become more important in the future, even if right now the majority of mapping is done by a small number of mappers in most regions (read my observations on churn rate in OpenStreetMap in a related post).
Second, it provides content with a low barrier to improving on it. The coarse, quirky geometry of the TIGER/Line segments is easily fixed, especially now that nationwide high-resolution imagery from Bing is available as a backdrop in the OpenStreetMap editors. As a novice to OpenStreetMap, you have an easy way in: just start by fixing some streets in Potlatch2 — it couldn’t be easier. The instant gratification of seeing your contributions appear on the main map almost immediately helps motivate the novice to keep contributing. This process, I believe, would be further aided if OpenStreetMap had more elements from gaming — competition, scoring, achievements, awards — but that will be the topic of my talk at SOTM ;).
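To make the gaming idea a little more concrete, here is a minimal sketch of what an achievement mechanic for mappers might look like. Everything in it — the badge names, the edit-count thresholds, the `Mapper` structure — is invented for illustration; OpenStreetMap offers no such API.

```python
# Hypothetical sketch: awarding badges to contributors based on simple
# edit-count thresholds. All names and rules here are invented.

from dataclasses import dataclass, field

# (badge name, minimum number of edits required) - invented thresholds
BADGE_RULES = [
    ("First Fix", 1),        # fixed your very first street
    ("Street Sweeper", 50),  # cleaned up 50 quirky geometries
    ("Local Hero", 500),     # a serious local contributor
]

@dataclass
class Mapper:
    name: str
    edits: int = 0
    badges: list = field(default_factory=list)

    def record_edits(self, count: int) -> list:
        """Add edits and return any newly earned badges."""
        self.edits += count
        new = [badge for badge, threshold in BADGE_RULES
               if self.edits >= threshold and badge not in self.badges]
        self.badges.extend(new)
        return new

novice = Mapper("alice")
print(novice.record_edits(1))   # earns 'First Fix'
print(novice.record_edits(60))  # crosses 50 edits, earns 'Street Sweeper'
```

The point of such a mechanic is exactly the instant gratification described above: every small fix can produce immediate, visible recognition, which is what keeps casual contributors coming back.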
In contrast, let me address my concerns with the land use import done recently in the Netherlands — see the current border of that import here to do a before/after comparison. This is authoritative data sourced from the national mapping agency via a loophole (the actual source dataset, Top10Vector, is not open data, but a derived product was deemed to have a license compatible with OpenStreetMap’s). Its addition to the OpenStreetMap data body surely makes for a map that is very appealing visually, but it does not meet both of the requirements laid out above for it to become warm, living data in OpenStreetMap. Land use data, in contrast to street data, is much more abstract and much more difficult to survey, especially for an OpenStreetMap novice. On top of that, it’s easy to break things. I can easily see how an OpenStreetMap contributor, even one who is not a novice, would be daunted by an editor view that looks like this (see inset).
Mind you, this is almost 100% imported data.
So to conclude: I am not against mixing and matching authoritative data and OpenStreetMap, but I firmly believe the distinction between mixing and matching should be heeded more carefully than it has been in some cases. Don’t mix when it’s not going to be mutually beneficial, keeping in mind the requirements I laid out in the two cases above. This is necessary for OpenStreetMap to retain its unique position in the geodata domain as a Warm Resource, and that is the only way it will survive in the longer run.
I have been invited to discuss the topic of OpenStreetMap versus Authoritative Data in a panel organized by Eric Wolf of the US Geological Survey at the upcoming State Of The Map conference in Denver. If you’re interested in this topic and would like a broader view on it, I invite you to attend. It takes place on Sunday, Sep. 11th around noon. See you in Denver!