Data Imports In OpenStreetMap – Love ‘Em Or Loathe ‘Em?


OpenStreetMap showing the Netherlands with partial land use data import.

The Netherlands is starting to look really great in OpenStreetMap. This is due only in part to the tireless efforts of the Dutch mappers – who are not to be deterred by the cold, rain, hail and winds of the Dutch fall season. Data imports have played a very important role in the state of the map in the Low Countries. As early as 2007, AND released its complete road network for the Netherlands to OpenStreetMap, and it was imported in July and August of that year.

More recently, a comprehensive high resolution buildings and land use dataset was released for public use under a license compatible with OpenStreetMap. Import of this 3DShapes data set commenced almost immediately, starting with building footprints. The land use import is ongoing – if you take a look at the map now, it is easy to distinguish the regions that have seen land use import love from those that haven’t yet.

These imports make the map look beautiful and contribute enormously to the richness and completeness of the data. In the Dutch OpenStreetMap community, there has been little if any resistance to the large scale data imports. What helps this run smoothly is that the imports are done by dedicated OpenStreetMappers who take every precaution to make sure no arduous manual edits based on ground surveying are overwritten without consent.

Apart from a visually stunning map, there are more arguments in favor of these large scale imports. A map that already contains information for your local area is less daunting to start working on than a map that is empty and void, so data imports may help lower the barrier to entry for new contributors. A map that looks complete is also more attractive for potential users, and thus may encourage professional use of OpenStreetMap data.

Still, I feel ambivalent about the large scale imports that we are seeing in the Netherlands. I strongly support all the arguments I mentioned in favor of them, but there are also some important drawbacks that I’d like to point out here.

The most important issue I see has to do with data aging and pseudo-information. As it is, the data being imported is a few years old. It comes from a source that would update it by ground surveying every few years to keep it up-to-date, but as yet it is unclear if newer versions of the dataset will also be released under a compatible license. Even if they are, a re-import will be a hard thing to do. The community will go ahead and update the imported land use data in OpenStreetMap, so that the data acquires its own community dynamic. That dynamic will be difficult if not impossible to retain in a subsequent import of a new version of the source data. I’m therefore going to assume that both the AND and the 3DShapes imports will remain one-time efforts.

Given the one-time character of the large scale data imports, it is up to the OpenStreetMap community to keep the data up-to-date after the initial import effort. In some areas, this will be done diligently by a dense network of active OpenStreetMappers. But community activity is not spread evenly over the Netherlands. In other areas, community activity is sparse and the data just sits there, turning stale slowly. This raises the question of which is better: no data at all, or one-time imported data that is slowly turning stale?

Stale data represents pseudo-information, in my opinion. As long as the data is more or less static, it is not such a big deal. Land use is more static than the road network, but it still changes over time. On average, more than 15% of the road network sees some kind of change every year according to some. Even if it’s half that, road network information requires a lot of community love to keep it from going stale fast. I realized this soon after the initial AND import, and requested a layer in the awesome Geofabrik OSM Inspektor tool that would reveal which parts of the road network in the Netherlands had actually seen any community updates since the imports. On higher zoom levels, this layer reveals that some cities have seen almost no updates in the last three years! Assuming that the data was around two years old at the time of import, this means that even in major towns and cities, most of the road data is five years old by now. For use in a background map, that may be just fine, but for more data-critical uses such as routing, five-year-old data has little value. How much would you pay for a satnav device with data that old on it?
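To put rough numbers on the five-year figure, here is a minimal back-of-the-envelope sketch. It rests on assumptions that are not in the claim itself: that the yearly change rate is constant, and that changes hit road segments independently at random, so the share that is still accurate shrinks by the same factor every year. The function name `unchanged_fraction` is just an illustrative choice.

```python
def unchanged_fraction(yearly_change_rate: float, years: float) -> float:
    """Estimate the share of the road network that still matches reality.

    Assumes a constant yearly change rate and independent, randomly
    distributed changes (a simplification, not a measured model).
    """
    if not 0.0 <= yearly_change_rate <= 1.0:
        raise ValueError("yearly_change_rate must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    return (1.0 - yearly_change_rate) ** years


# Compare the quoted 15% figure with the "half that" scenario,
# for data that is five years old and never updated.
for rate in (0.15, 0.075):
    share = unchanged_fraction(rate, 5)
    print(f"{rate:.1%} change per year: {share:.0%} still accurate after 5 years")
```

Under these assumptions, roughly 44% of untouched road data would still be accurate after five years at the 15% rate, and about 68% at half that rate. Real changes cluster around new developments, so this is only an order-of-magnitude illustration.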

Another, more fundamental issue has to do with OpenStreetMap as a community effort. Looking at the OpenStreetMap of Amsterdam, I still feel a sense of pride. I made a significant contribution to this map. I know most of the other OpenStreetMap mappers here. We get together now and then to discuss the local state of the map. On the scale of the Netherlands, looking at the map I do not feel that sense of pride anymore. The map does not feel like a community effort, it feels more like a repository for open data. The map looks great but it looks cold – the quality looks about the same across the country, but that obscures the huge differences in community activity, which in some regions is all but nonexistent. I doubt that the current situation, the map looking the way it does and the tools available for editing being what they are, will inspire people in those regions to join and start contributing.

The arguments for and against large-scale imports like the ones we are seeing in the Netherlands are both strong. OpenStreetMap largely ignited the open geodata movement, and I think it’s awesome that we’re now seeing more geodata becoming open, and OpenStreetMap becoming a showcase and repository for that open data. I do believe that the community can, in some places and to some extent, benefit from these infusions of open data, improve it and keep it up-to-date. One of the pillars of a community however is a common goal; the sense of building something together. As that common goal fades out of focus, it takes much more than a casual glance at the Dutch OpenStreetMap to see the community effort.

10 thoughts on “Data Imports In OpenStreetMap – Love ‘Em Or Loathe ‘Em?”

  1. As I argued before, I am not too fond of data imports since they offer no added value compared to commercially available geodata. You describe perfectly that aging of data is a big issue when the community is not evenly spread over the Netherlands.

    IF the community were evenly spread, we could address the issue of data turning stale. So in my opinion, it would be better to put effort into enlarging the community than into finding more (commercial) data sources that can be imported. Maybe this will evolve automatically in time, but as “the Dutch OpenStreetMap community” we are misjudging what matters and what doesn’t. Short term thinking and technical challenges have more chance of getting attention than community building, and this, in my opinion, is the biggest problem of the community in the Netherlands.

  2. As a user, I like the imports very much. They make the map usable in large(r) areas. As a mapper, I don’t. It’s more fun mapping something new, than it is to touch up or update existing data. And as a geospatial professional I’m worried about how the imports fit in existing data and how they’re going to be maintained.

    An extra issue with the 3DShapes import is that the nature of the data (areas, not trackable ways) makes it extra hard for individual mappers to collect the data necessary to keep it up-to-date.

    • @Jeroen Muris

      When the map grows, it becomes more difficult to find uncharted areas, so it is inevitable that the role of contributors shifts towards maintenance.

      Regarding the landuse import, I’m putting in effort to get a supportive base of users in the “affected” areas. It’s true that I didn’t do that in the beginning (east of the Veluwe), but this is partially also a learning process. Regarding the import of landuse data east of the Veluwe, I didn’t get any negative comments. The most “negative” one was that parts of the forested areas (mostly on the Sallandse Heuvelrug) were actually heath. This is due to the fact that the data source available to us classified both forests and heath as “nature”, and the fact that this area was heath was not evident from the original AND data either.

      Maintenance is another story. At least I think that the barrier to maintenance of landuse data can be set lower than for road data. Usually landuse doesn’t change very frequently, except for maybe turning meadows into farmland and vice versa, turning farmland/meadows into nature (at least in the Netherlands), and turning areas into residential areas. The last case is the easiest to spot and accommodate.

      Another problem with landuse is that it is generally more difficult to maintain. It is easier to draw ways, and make sure they’re connected (although this can still be difficult for novices), but for landuse entire areas need to line up well. This could be accomplished by using the topological nature of OSM more extensively, although this leads to a situation which is harder to understand for most of the users.

  3. Hi Martijn,

    Interesting post. I find it good that you raise this issue, although a solution is not that easy to find.

    About the question which data is better, no data at all or imported data turning stale: this is not the fundamental question. Imported data does not necessarily have to turn stale, although I realize that a relatively large amount of data will turn stale if the community is missing. On the other hand, user contributed data can turn stale as well. Just have a look at Quebec City.

    A more fundamental question to ask, before each import, is whether it would be justified. This is not an easy decision, and with many existing imports there are many arguments for and many arguments against it. The knife can cut both ways. Take, for example, the 3dShapes import. I admit that it is not easy to maintain this data, because we, as a community consisting of private individuals, don’t have the means (professional equipment, sufficient time, etc.) for it. On the other hand, it can also put the focus on areas where user contributions and maintenance _are_ possible. With buildings this is the assignment of house numbers to buildings. There are several nice examples over the country (although by no means complete). The landuse import, whether it is outdated or not, also gives strong hints where unpaved roads, cycleways, and footpaths are missing.

    The last example is interesting, since the landuse data could also be used as an overlay (as has been suggested for imported data in general on the mailing lists lately). However, that would leave OSM with landuse data which suffers from quality differences, and which is woefully incomplete. Yes, that might also be one of the charms of OSM 🙂, and it is not realistic to expect that everyone is a GIS expert. On the other hand, if someone wants to use more consistent and generally more accurate (although out-of-date) data, he would need to either deal with data overlap, or suppress the user-contributed data! Therefore I think that something like 3dShapes is worth importing, so that it can be used as a basis for further improvement and updating. At least the problem of updating is not as big an issue for landuse as it is for the road network.

    Regarding keeping data up-to-date: yes, this is a problem. It can only be solved by getting active contributors/maintainers who are spread over the country. As was pointed out during SotM, the location of users is a very important factor which strongly determines whether data is current or not. Keeping data current is also a problem for commercial suppliers of road data. I quite often see complaints that those companies are not updating their data after someone spotted an error. Of course this is true for OSM as well. There are plenty of errors in OpenStreetBugs and KeepRight, and again, this would lessen if the community grows.

    Furthermore, I’m not really buying the statistic that “more than 15% of the road network is updated each year”. The source you’re linking has a commercial interest in making people aware of this “fact” (which imho is partially FUD). Also, how do you measure the “road network”? Nonetheless, there are plenty of changes, and even 5% (of the length) of the road network in the Netherlands is still a large area (way!) to cover😉

    Anyways, it is evident that OSM exists by the grace of its community, despite imports or not. I’m not saying that _all_ imports are good; OSM would still be able to survive with quite a lot of imports; but as long as imports are an asset rather than a burden, I think we should take advantage of them. The real strength lies in the combined effort of user contributions and well-coordinated imports.

    If staleness of data is an issue, and I absolutely agree that it is, we should develop adequate tools to give insight into that staleness. The one you requested from Geofabrik is excellent for this purpose. Although, due to the massive 3dShapes import, this is unfortunately no longer the case, so maybe it should be refined to show highways only. Such tools are especially important for communities with less activity.

    Thanks for reading,

    Frank

  4. I am glad you share my worries about large-scale data imports, like 3dShapes. My opinion is that OpenStreetMap has to be very careful with importing of these huge data sets.

    My main reason for this is the main aim of OpenStreetMap: creating open maps of the world. With imports, we aren’t creating maps, but we are copying them; I don’t think this is useful for “the world”.

    What’s more, you write that “the imports are done by dedicated OpenStreetMappers who take every precaution to make sure no arduous manual edits based on ground surveying are overwritten without consent.” This is indeed true beyond any doubt, but in my opinion too little attention is paid to merging the new, imported data with the existing data. Sometimes the 3dShapes data doesn’t “fit” in the old data.

    Furthermore, I agree with miblon that the Dutch community focuses on the wrong things. We should be more oriented towards mappers, especially new mappers. This way, we will eventually have a better map, since we get more contributors.

    I think the best thing to do is to put the 3dShapes data as a WMS layer in JOSM and use it as a source (just like the Yahoo imagery). Of course, this is much more tedious work, but the result probably will be much better. And I will still be able to say to friends: “Look, I mapped this myself.”

    • The wiki states: “OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them.” So, this isn’t exclusively limited to creation of data manually. That would be nearly impossible for many map features which are too large to map, or which require equipment beyond the league of nearly all of us. If we limited ourselves to this, then OSM would be much less valuable. All the value which is put into integrating imported data in OSM would be lost.

      Regarding 3dShapes, yes, everyone knows that it isn’t 100% correct, but that is nearly impossible to achieve. At least in actively mapped areas it is being picked up and extended. The buildings are ideal for housenumber mapping. Areas with little or no mapping activity remain a problem, but that isn’t caused by the 3dShapes import. (OTOH, I don’t know what the Dutch community would look like without the AND import, but I’m no fortune teller.)

      And the community: everyone has his/her own motivation regarding joining OSM. Many of us have a technical background, so it isn’t surprising that the focus lies there. I think they’re doing a wonderful job in keeping OSM.nl running, and providing useful services like tile.openstreetmap.nl, the cycle map and walking map. And someone like Rullzer is very keen on thinking of new useful services based on OSM data, like POI exports, etc. Maybe not exactly your cup of tea, but undoubtedly it is useful for others.

      OSM is an open and diverse community, so everyone is welcome to join, and put in efforts in whatever way he thinks would make sense. This is necessary, especially since OSM is still growing bigger. Please be aware that you are part of the community as well, so if you think that the community isn’t putting in enough effort in outreach, and helping newbies, most, if not all people will agree with you on that.

      For example, ZMWandelaar, maybe not one of the most vocal and ancient members, is working tenaciously on the Dutch version of the map features page in the wiki to make it more applicable to the situation in the Netherlands, and to remove some inconsistencies / difficulties which have crept in over the last couple of years. That will certainly help new and casual users.

  5. Just a quick response on this discussion from an OSM map user and OSM-mapper point of view.

    I am very happy with the “3DShapes” imports, happy with the result –as a user– and happy that someone else has done the work –as a mapper– (Frank Steggink comes to mind in my own area; thanks, Frank, for the land use and LDP for the buildings).

    The 3D shapes (land use and buildings) are extremely useful if you want to use a map for walking/ hiking/ rambling. No matter how good the written instructions are, you often need to check the lie of the land. I realise that OSM started as a street plan, but I am really very happy it is turning into something much bigger than that.

    I can see no realistic way to make this happen, other than through imports like these in the Netherlands or the similar efforts in France.

    I can’t see how dead-reckoning will do it; and I don’t believe we can go wandering around farmers’ lands to do the surveying. I was at one point considering doing just that myself for OSM maps I needed for a long-distance walking route we had made, and abandoned the idea as unfeasible.

    I have tried the WMS-like underlay and trace approach myself (in Potlatch rather than JOSM) and for land use it is (a) a lot of work, (b) not at all more accurate than the imports done by the “wizards” (people rather than scripts).

    When 3DShapes came along in the area I usually work on, it didn’t take much time to fix the data which was in error: e.g. removing a large pond where we now have a new housing estate, reclassifying a golf course which was tagged as a park, and adding a missing road or two.

    Having said that, I do understand the “community” issues, but I am not convinced that imports like this hinder more than they help.

  6. I just looked at the link to the Geofabrik OSM Inspektor. It seems to me the situation is not so bad, and that most of the road network in the area you linked to has been touched by now.

  7. Pingback: OSM Community...pause...NOT! - North River Geographic Systems Inc
