oegeo

researching VGI

Scoping Out An Open Location Platform


I noted earlier today that it would be nice to have a kind of Open Location Platform in order to get the most out of the Social Location apps that have gained such interest and popularity as of late. The problem with all the existing applications and platforms is that they do not speak each other’s language, and thus cannot benefit from all the information that is being crowdsourced into their respective location repositories. All these applications have a different take on Social Location, attracting a specific, unique crowd. The challenge of an Open Location Platform would be not to lose the uniqueness of each of the existing platforms in the process of unifying the way in which places are searched for and posted. Ideally, the applications would not even have to do very much at all to benefit from an Open Location Platform.

The way I see an Open Location Platform is as a kind of API interfacing between the existing Social Location apps, a central repository of Places – in fact, I prefer the name Open Places Platform over Open Location Platform, for a Place is More Than A Location™ – and the proprietary Places Repositories that the apps maintain now.

A unique Place Identifier must be used to link the Places in the Central Repository – which I believe should be OpenStreetMap, because it has the infrastructure set up and would benefit from the crowdsourcing from all these unique Social Location platforms – to the Places in the proprietary, existing repositories.

The initial assignment of unique Place Identifiers is not really straightforward: there are bound to be many duplicates, even within the existing repositories, and removing duplicates inserted by humans is always a challenge. All unique Places should be collected in the OpenStreetMap database, and the Identifiers should be assigned there. Currently, every new node inserted is assigned an ID automatically, so that ID could easily be used.

Assuming this model is in place, we can start scoping out the basic operations: Searching for Places, Request Details for a Place, and Post a New Place.

Searching for Places

In a Social Location environment, it is safe to assume that search operations for known Places will be geographical, i.e. within a certain distance from the user’s current location, or within a certain area defined by a bounding box. A response to such a Places search would have to contain a minimal amount of information about each place found: geographical location, name, and something like a genre. That last bit might be troublesome, because it should be platform-agnostic. Either the Social Location platforms need to arrive at a shared taxonomy of ‘Place Genres’, or it should be left open, in the spirit of OpenStreetMap. I would opt for the former.
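To make the two kinds of geographical search a bit more concrete, here is a minimal sketch in Python. It is not an existing API: the `Place` record, its field names and the sample places are all made up for illustration, and a real Central Places Repo would use a spatial index rather than scanning a list.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    place_id: int  # hypothetical unique Place Identifier (e.g. an OSM node ID)
    name: str
    genre: str     # hypothetical 'Place Genre' label
    lat: float
    lon: float


def search_bbox(places, min_lat, min_lon, max_lat, max_lon):
    """Return the places that fall inside the bounding box."""
    return [p for p in places
            if min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine formula) on a spherical Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def search_radius(places, lat, lon, radius_km):
    """Return the places within radius_km of the user's location."""
    return [p for p in places
            if distance_km(lat, lon, p.lat, p.lon) <= radius_km]


# Made-up sample data: two places in Amsterdam, one in Rotterdam.
PLACES = [
    Place(1, "Cafe A", "bar", 52.370, 4.895),
    Place(2, "Museum B", "museum", 52.360, 4.885),
    Place(3, "Shop C", "shop", 51.920, 4.480),
]
```

Each result carries exactly the minimal set mentioned above (location, name, genre), so a client could draw it on a map without a second request.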

An actual search operation would thus not involve the proprietary Places Repos at all, because everything is stored in the Central Places Repo, OpenStreetMap – with those elementary nuggets of metadata on each Place to avoid unnecessary client-server round trips. Oh, as an added bonus, the location + name + Place Genre gives OpenStreetMap enough info to render it on their map. A Social Location application that uses OpenStreetMap as a base map layer would have the added benefit of up-to-date maps crowdsourced by their own users.

Request Details for a Place

The next step for a typical Social Location app would be to request more detailed information about one particular Place. Applications could still do this the way they are used to: requesting details directly from their own Places Repo using their own interface or API. This way, however, the application would not be able to reap the real benefits of the Open Places Platform: firing one Place Details request at the Open Places Platform and getting results from all registered Places Repos, instead of just their own. This might not make much sense for the current Social Location apps, but a newer generation of apps could be crafted to make use of the Place information coming from different communities. In order to be able to process all this heterogeneous Place information, some harmonization of Place metadata might be desirable.

To make a Place Details request more efficient, and maybe also to provide some indication of the expected information richness, an intermediate request might be implemented, in which the Application requests a list of Repos that have a record for the Place it intends to query. This could be a really fast request.

In a less-than-ideal-but-still-entirely-feasible scenario, one Social Location platform might not want the client of another platform benefiting from the information crowdsourced into their Repo. This could be dealt with in a security and access provision layer.

Post A New Place

The posting of a new Place would by design be a two-step process, because the Place would have to be registered in the Central Repository, OpenStreetMap, first, and subsequently inserted into the proprietary Places Repo using the new unique Place Identifier returned by the Central Repo in step 1.
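A rough sketch of that two-step flow, in Python. Both classes are in-memory stand-ins I made up for illustration; they do not correspond to the actual OpenStreetMap API or to any Social Location platform's API.

```python
import itertools


class CentralRepo:
    """Stand-in for the Central Repository; assigns an ID on every insert."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.places = {}

    def register(self, name, lat, lon, genre):
        place_id = next(self._next_id)
        self.places[place_id] = {
            "name": name, "lat": lat, "lon": lon, "genre": genre,
        }
        return place_id


class AppRepo:
    """Stand-in for one platform's proprietary Places Repo."""

    def __init__(self):
        self.details = {}

    def insert(self, place_id, details):
        self.details[place_id] = details


def post_new_place(central, app_repo, name, lat, lon, genre, details):
    # Step 1: register the Place centrally and obtain its unique identifier.
    place_id = central.register(name, lat, lon, genre)
    # Step 2: store the platform-specific details under that same identifier.
    app_repo.insert(place_id, details)
    return place_id
```

The sketch ignores failure handling: if step 2 fails after step 1 has succeeded, the Central Repository holds a Place that no proprietary repo knows about, which is harmless as long as the central record is useful on its own.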

Conclusion

I think this could work. Really, it could.

Written by Martijn van Exel

January 24, 2010 at 11:12 pm

Why We Need An Open Location Platform.


Yesterday evening I negotiated the cold and snow that is once more upon us to join the OpenStreetMap Foundation board, who are having a board meeting in a hotel at Schiphol Airport, of all places, for an evening of drinks in a bar in Amsterdam. It was very good seeing them again; I’ve worked closely with most of the current board members organizing State Of The Map 2009 and had not seen them since. Good times chatting about Jetsons-like hotel rooms, living in Kenya, and Social Location, which brings me to the topic of this post.

Male humans wielding their iPhones in a bar

Well into the evening, a group of hipsters enters the bar and occupies a table near ours. Then, a social novelty unfolds that is on the verge of being uncanny to observe. Instead of starting to socialize, they all take out their phones and spend the first five minutes of their night out staring at their tiny screens, the blueish light emitting from their devices accentuating the utter concentration on the hipsters’ faces. They are momentarily out of touch with the simple, three dimensional world we are physically sharing and have ascended into the universe of Social Location – updating their FourSquare, Gowalla, BrightKite and Flook accounts.

What gives?

Social Location. The hipsters are ‘checking in’. Letting their friends, and everyone else who might be interested, know where they are. ‘I am at IKEA’. ‘I am at Top Gear Live’ – I don’t have to look very hard in my Twitter stream to stumble upon a check-in of sorts. It is a Very Popular Thing to be doing – even Kevin Rose is onto it now – to the extent that 2009 was dubbed ‘The Year Of Social Location’ on an influential GIS blog. Why Social Location is taking off the way it is, amidst a new wave of heated discussion about privacy in social networking, is a question for another day, but the fact is that it has become all the rage, with a flurry of new platforms and apps like Google Latitude, Gowalla, Foursquare and Flook joining the ranks of already established ones like Brightkite and Dopplr.

Social Location connects with the real world: become the Mayor of this bar and drink for free.

Some of these Social Location platforms are designed to simply post your location as a geographic coordinate for other apps to do useful things with, like Yahoo! Fire Eagle and Google Latitude. Others add another layer of fun and functionality on top of that. Brightkite, Gowalla, Flook and Foursquare use your location to search a database of bars, museums, shops and other public places. Instead of posting a nondescript set of coordinates, you can share a meaningful location. I Am At IKEA! To provide some legitimacy to this concept, some platforms add a competitive element, awarding badges or medals for accomplishments (four bars in one evening earns you a ‘Bender’ badge on Foursquare, for example) and ‘mayorship’ for a venue’s most frequent checkers-inners. Foursquare even goes so far as making your Social Location status count in the real world: The Mayor drinks for free.

Another interesting element of the Social Location platforms is their crowdsourcing potential. Building on the concept of social review platforms like Yelp – who recently updated their mobile app, introducing a check-in option of their own – Social Location platforms can be employed to discover and share interesting locations with other users. Most platforms already allow simple adding of new places, and let you share comments, tips and observations for other users to enjoy.

The question of where all this information is stored arises. We see a wide diversity of platforms, all with their own user base – which is, judging by the hipsters’ time spent staring at their phone screens, probably showing a significant overlap – and their own strengths and weaknesses. Some form of consolidation would be in the interest of us, as users of these apps. Not only would we get to spend more time actually socializing, but we would also benefit from more crowdsourcing power.

I’m thinking there should be an open location platform. The actual locations should go into OpenStreetMap, which would be the central repository for Locations. A Unique Location Identifier will link them to their Yelp, FourSquare, Gowalla, and other location profiles. Comprehensive location information could then be aggregated by any application by pulling the information from any of those platforms using the shared Open Location Identifier.

Any takers?

EDIT Just posted a follow-up article in which I dig into the concept of an Open Location Platform.

Written by Martijn van Exel

January 24, 2010 at 1:16 pm

Posted in spare time

Geotagging in Lightroom made easy


Adobe Lightroom does not have many features for adding location to your photos. With a few clicks and drags, it is easy enough to get a geotagging workflow going for Lightroom, though. Let me show you how I do it on my Mac.

First, I got the “GPS-Support” Geoencoding Plugin from Jeffrey Friedl. It is distributed as donationware, so if you find it useful, do paypal him some $$$. Install it through Lightroom’s plugin manager.

The Eye-fi Pro smart SD card with geotagging built in - the digital camera accessory that will render this post obsolete.

There are many ways to geotag images using this versatile plugin. It can work with a ‘shadow’ GPS trace to semi-automatically tag a series of photos taken while recording your location using a separate GPS logger. Believe it or not, I mostly do not, so I end up manually geotagging many of my digital photos. Until I get myself one of these, that is.

Here’s how I use the geocoding plugin to quickly tag a set of images just downloaded from my camera.

  • Open Google Earth and Lightroom.
  • Download images from camera.
  • Open the latest import in Grid View (g).
  • Toggle the view until I have full screen with menu bar (f).
  • Adjust the grid view to make the thumbnails big enough to be able to determine the image location without having to switch to single image view most of the time (Cmd+ / Cmd-).
  • Get rid of all the palettes and accessory views (Tab).

My Lightroom screen looks something like this at this point:

You may have noticed I am a keyboard kind of guy – if your style is more mouse-oriented you might not find this workflow all that appealing. Lightroom is quite keyboard-friendly, but not towards plugins. Their options are hidden away in menus without any keyboard shortcuts. To access the geocoding plugin dialog, I would have to mouse through two menus:

Luckily, Mac OS X provides a mouse-less interface to any menu item. You can give any menu item in any application a keyboard shortcut in the Keyboard preference pane, like this:

If you’re too lazy to do that, there’s also the option of opening the Help menu in any application and just typing the name of the menu option you want to access – also a lifesaver in less frequently used applications with tons of menu options. Just hit [Cmd Shift /] and start typing, ‘geo’ in this case:

And hit [Enter]. The plugin window pops up. The plugin will work its geotagging magic on all currently selected images, so whenever I have a few images taken in the same place, I select them first (Shift + arrows) and then call the plugin using the keyboard shortcuts outlined above.

The plugin window looks like this:

Note how it has taken the data from my last geotagging operation, so I could quickly add the same location to another image. Right now, I want to set a new location. The plugin itself does not provide a map interface to do this, but it can take the current location from Google Earth or a permalink URL from OpenStreetMap or Google Maps. Because getting a permalink requires more clicking, I prefer to get my location from Google Earth. So I Cmd-Tab to Google Earth and use the arrow keys (hold Alt for precise control) to pan to the location of the photo.

Then I Cmd-Tab back into Lightroom and hit ‘Import from Google Earth’ and lo! it even reverse geocodes the location:

Now I just need to hit the ‘Geoencode Images’ button (with its inappropriate red label) and I’m done. Here’s a short video showing the entire process:

It cost me only two mouse clicks, those last ones inside the plugin interface, which could use some optimization in more ways than this: I think it’s quite convoluted and could do with an ‘easy mode’ of operation. For my use, I wouldn’t even need an interface at all: just hit the shortcut key and the currently selected images would be geotagged using the current location in Google Earth.

Written by Martijn van Exel

January 20, 2010 at 3:03 pm

Posted in spare time

Visualizing geospatial data quality


In the coming months, I will be working on how to measure the quality of geospatial information, and on visualizing the results of quality analysis. The actual indicators for quality are still to be defined, but will be along the lines of:

  • spatial density – how many features of a certain type does dataset A have, and how many does dataset B have?
  • temporal quality – what is the age of the data? How much time has passed since survey, publishing?
  • crowd quality – what I call the ‘5th dimension of spatial data quality’. This one is more complex (a separate post will follow).

OpenStreetMap 'cheat sheet' mug showing the most used tags.

‘Crowd Quality’ has many dimensions. It is about peer review strength: how many surveyors have ‘touched’ a feature? How many surveyors are responsible for area X? It has several consistency components as well. One is internal attribute consistency: to what extent does the data conform to a set of core attributes? Another is spatial and temporal quality consistency: considering a larger region, does the data show consistent measurements for spatial and temporal quality indicators as described above?

Quality analysis is an important issue for Volunteered Geographic Information projects like OpenStreetMap, because their data is consistently and strongly scrutinized: it’s open, so it’s easily accessible and it’s very easy to take cheap shots at extensive voids in the map. Because of its openness, professional users have strong reservations about the quality of the data. There is almost no barrier to entry into the OpenStreetMap community: provide a username and an email address and you’re good to go – and delete all the data for Amsterdam, for example.

In a community of 200,000, map vandalism of such magnitude will be swiftly detected and reverted, and as such should not even be the biggest concern of potential users of VGI data. Smaller acts of map vandalism, however, might go undetected for a longer period of time, if they are detected at all. Moreover, with OpenStreetMap picking up momentum as it is currently doing, there are many new aspiring surveyors joining every day. Even when they all subscribe and start adding data with the best intentions, ‘newbies’ are bound to get it wrong at first, inadvertently adding a stretch of freeway in their residential neighborhood, or unintentionally moving features around when all they want to do is add their local pub. Even if the community tends to react to map errors – inadvertent or not – swiftly and pro-actively, the concerns potential users have about the quality of the data are legitimate. VGI is anarchy, and where there is anarchy, there are no guarantees.

The need for quality analysis also arises from within the VGI communities themselves. As a VGI project matures, contributors are likely to shift their attention to details. This can certainly be said for OpenStreetMap, where some regions are nearing or reaching completion of the basic geospatial features. A quick glance at the current map will no longer be enough to decide how and where to direct your surveying and mapping effort. Data and quality analysis tools are needed to aid the contributors in their efforts. These can be really simple tabular comparisons; in many German cities for example, OpenStreetMap contributors have acquired complete and up-to-date street name lists from the local council, which they compare to the named streets that exist in the OpenStreetMap database. This effort (Essen, Germany here) yields a simple list of missing street names which can then be targeted for mapping efforts.
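Such a street name comparison boils down to a set difference. A minimal sketch in Python, assuming both lists are already available as plain lists of strings; the function name and the normalization rule (case and whitespace only) are my own choices, not what the German contributors actually use.

```python
def normalize(name):
    """Collapse whitespace and ignore case so trivial variations still match."""
    return " ".join(name.split()).casefold()


def missing_street_names(official_names, osm_names):
    """Return the official street names that have no match in the OSM names."""
    mapped = {normalize(n) for n in osm_names}
    return sorted(n for n in set(official_names) if normalize(n) not in mapped)
```

For example, `missing_street_names(["Hauptstrasse", "Bahnhofstrasse", "Am Markt"], ["hauptstrasse", "Am  Markt"])` leaves only `"Bahnhofstrasse"` to be mapped. Real street lists would need more forgiving matching (abbreviations, spelling variants), which this sketch does not attempt.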

More complex and versatile data quality analysis tools are being developed as well. Let me give a few examples to conclude this article and give some idea of how the results of my quality analysis research could be visualized.

OpenStreetBugs

Not an automated data analysis tool, this web site allows for simple map bug reporting. It was designed to provide a no-barrier way to report errors on the map: you do not even need to be registered as an OpenStreetMap user to use it. It provides some indication of data quality. It can be used by OpenStreetMap contributors to fix reported errors quickly; the web site provides a link to the web-based OpenStreetMap editor, Potlatch, with every reported error automatically.

Visual comparison: Map Compare and FLOSM

An often asked question pertaining to the data quality of OpenStreetMap is: how does OpenStreetMap compare to TeleAtlas or NAVTEQ, the two major commercial vendors of street data? While comparing the spatial quality is in itself not a complicated task, you need to have access to both data sets in order to actually do it. TeleAtlas and NAVTEQ data is expensive, so not many are in a position to actually do this comparison. In the course of my research, I will certainly perform a number of these analyses, as I am in the fortunate position to have easy access to commercial spatial data.

Map Compare

FLOSM

A simple but effective way to visually compare two spatial data sets is to overlay them in GIS software, or in a web mapping application. Making such overlay web applications available is generally discouraged in VGI communities, as it is thought to encourage ‘tracing’ data from proprietary sources. This is a violation of the licenses for almost all commercial spatial data, and could thus mean legal trouble for VGI projects.

Nevertheless, some visual comparison tools do exist. Map Compare presents a side-by-side view of OpenStreetMap and Google Maps, allowing for easy and intuitive exploratory comparison of the two. FLOSM takes it a step further with a full-on overlay of TeleAtlas data on top of OpenStreetMap data.

Automated analysis: KeepRight and OSM Inspector

OSM Inspector

KeepRight

The tools we’ve seen so far do not provide analysis intelligence themselves; they simply display the factual data and leave it to the user to draw conclusions. Another category of quality assurance tools takes the idea a step further and performs different spatial data quality analyses and displays the results in a map view.

German geo-IT company Geofabrik, also responsible for the Map Compare tool mentioned earlier, publishes the widely used OSM Inspector tool, that can be used to perform a range of data quality analyses on OpenStreetMap data. It can effectively visualize topology issues and common tagging errors. Input for the tool’s functionality and for extending its range of visualizations comes from the community. A recent addition requested by the Dutch community has been a visualization that shows the Dutch street data that has not been ‘touched’ since it was imported in 2007, when AND donated their street data for the Netherlands to OpenStreetMap, effectively completing the road network for the Netherlands in OpenStreetMap. This particular visualization helps Dutch OpenStreetMap contributors to establish which features have not yet been checked since they were imported. A similar tool was put in place when TIGER data from the US Census Bureau was imported into OpenStreetMap in 2008.

KeepRight takes a similar approach to OSM Inspector, analysing OpenStreetMap data for common errors and inconsistencies and displaying them in a web map application.

While these tools are extremely useful for OpenStreetMap contributors looking to improve the data and correct mistakes, they are not particularly useful for visualizing quantitative data quality research outcomes, as those outcomes will be aggregated, generalized data.

For many of the ‘Crowd Quality’ indicators, I am probably going to take a grid approach: establishing quantifiable indicators for Crowd Quality and calculating them for each cell in the grid. What that grid will look like is actually also a matter of debate – it would depend on the quality indicator measured, and on the characteristics of the real world situation referenced by that grid cell.
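As a first, tentative illustration of the grid idea, here is a Python sketch that computes one possible indicator per cell: the number of distinct contributors who touched features in that cell. The regular grid in degrees, the 0.01° cell size and the `(lat, lon, user)` input tuples are all assumptions made for the example; the actual grid and indicators are still undecided.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_size_deg):
    """Map a coordinate to the (row, column) index of a regular degree grid."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def contributors_per_cell(features, cell_size_deg=0.01):
    """Count distinct contributors per grid cell.

    features is an iterable of (lat, lon, user) tuples, one per feature.
    """
    users = defaultdict(set)
    for lat, lon, user in features:
        users[grid_cell(lat, lon, cell_size_deg)].add(user)
    return {cell: len(names) for cell, names in users.items()}
```

The resulting dictionary maps each occupied cell to a single number, which is the kind of aggregated value that could then be rendered as a colored grid overlay.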

To get an idea of what a grid visualization pertaining to quality could look like, it’s interesting to look at the visualization for the Aerial Imagery Tracing project running in the German OpenStreetMap community. A set of high resolution aerial photos was made available to OpenStreetMap, and integrated into map editing software for purposes of tracing features. Some tools were developed to assist in completing this effort; amongst those, a grid overlay visualizing the progress for each grid cell. No automated analysis is performed, rather, contributors are asked to scrutinize the grid cells themselves and rate completeness on several indicators. Although the pilot project was completed some time ago, the visualization is still online.

[Edit] This blog post goes into the technicalities of setting up a grid in PostGIS.

Written by Martijn van Exel

January 17, 2010 at 1:36 am

Priceless?


Volunteered Geographic Information

Free, Priceless Or Somewhere In Between?

This is the title that has been popping into my head since last summer. I am writing it down because it encompasses in a very general sense the themes that I want to cover in my dissertation, and thus serves me well in trying to guide me while I try to elaborate on them.

I have actually already written some paragraphs elaborating on the themes and ideas that follow, but I want to force myself to touch upon them concisely here.

Volunteered Geographic Information (VGI) is a concept that has not been around for a very long time. Geographic Information has, however: it is what maps are made out of, and what your car navigation device relies on to guide you. Traditionally, Geographic Information is collected, processed and used by professionals, but this no longer holds true: Geographic Information has undergone a process of democratization, both in the usage dimension and in the collection and processing dimension. People are now used to dealing with Geographic Information in different contexts, and have started to pool resources to collectively build repositories of Geographic Information, to facilitate the democratization of the entire ecosystem of Geographic Information.

OpenStreetMap is the most prominent of these efforts, and one in which I have been actively involved since early 2007. Since its conception in 2005, it has grown to a worldwide collaborative effort involving more than 100,000 contributors. In some regions, the maps available from OpenStreetMap are so rich and complete that they are used instead of commercially available map data.

I realize that I need to come up with some examples here, and some numbers that give an indication of how OpenStreetMap has grown, but I am on a train, blissfully disconnected from the internet, so you will just have to bear with me for now. But believe me, it’s getting big fast – at a rate that makes me worried about the validity of any quantitative research results that I might present in the context of this dissertation. But this will have to be dealt with in some future note.

Let us assume for now that OpenStreetMap – there are other VGI efforts around, and they will need to be touched upon as well – is indeed starting to occupy a significant share in the commercial market for Geographic Information. That means the OpenStreetMap data represent a commodity and as such, economic value. As OpenStreetMap data is available at no cost, this value is not quantified in the marketplace, however. This poses intriguing questions:

What is this freely available OpenStreetMap data actually worth?

How do you even begin to measure the value of something that is not subject to the usual economic market mechanisms?

When dealing with value, I believe I cannot omit the concept of quality, especially in this context. Any VGI effort relies on volunteers collecting data in their spare time. While some regions have very active communities, getting together to discuss progress and plan improvements to the map, checking and correcting each other’s contributions, other regions rely on single, isolated individuals contributing to the map – or worse: no-one contributing at all. The resulting picture is one of spotty coverage: very densely mapped regions exist side by side with sparsely covered regions. More questions arise!

Is it possible to define the quality of volunteered geographic information in any satisfactory way?

How?

More generally: how do quality and value relate when dealing with geographic information?

I think I cannot proceed from here without looking at real world situations. Economic value is defined in the marketplace where supply and demand meet, and thus cannot be studied without some understanding of how and where this demand arises.

There clearly is a demand for VGI, but where does it originate?

Why would people want to use information that comes with no guarantees of completeness or even factual correctness, and that does not have a consistent quality?

I will need to get to the bottom of this. Apparently it is ‘good enough’ for some! If I’m not careful I will be entering into the domain of psychology. I think I need to stop soon, or I will have covered all domains of modern science and will have defined ample questions to last me three dissertations. But let me just finish this train of thought, and by then I will have arrived in Berlin – one of the best covered cities in OpenStreetMap, by the way; you can even get a detailed map of the zoo!

What drives the decision on the demand side to use volunteered geographic information instead of commercial offerings that do come with a quality label?

I can think of a number of reasons. Firstly, there is a growing number of application domains that do not require extensive, nationwide coverage. Location-based services, a growing domain, are often only relevant in metropolitan areas; consider for example pedestrian and bicycle routing, social networking applications, tourist guide services or restaurant / bar recommendation applications. Even many applications in professional domains operate only within a designated metropolitan area: local police, fire brigades and other public safety professionals operate only within their metro area.

Interestingly, supply and demand sync up really nicely here: in areas where there is likely to be a great demand for high quality – whatever that may mean – geographic information, there is also likely to be a large number of contributors to volunteered geographic information repositories. (This reminds me of my master’s thesis that dealt with the quality of public transportation in rural areas. There was a similar process at play: because of the limited and geographically thinly spread demand, the costs of maintaining a reasonable quality of service had become so high that cuts in service quality had become unavoidable, lowering the demand even further. Both Dutch and German regional governments were struggling to counter this downward spiral, and I did a comparative study on the results of those efforts.)

Secondly, because there are very few restrictions and limitations in terms of how and where you can use the data. Commercial data usage licenses are more often than not restricted to a certain type of application, device or to a limited number of users or devices, and the data can only be used as-is. OpenStreetMap data can be used in almost every context imaginable, and you are free to modify and adapt the data to suit your needs.

Lastly, of course, because it’s free.

I have mixed feelings about this post. It feels unfocused, but I guess that is to be expected. More importantly, I don’t feel comfortable in the domain of economics. Sure, I did my two years of high school accounting and economics, but it did not quite take. It does not particularly interest me, but I feel I need to deal with it anyway. Intuitively, I am drawn to the question of defining and measuring quality. I want to think about how to do that, write tools to analyze OSM data – that part I am really passionate about. It seems like a good moment to talk to Henk and maybe some other people I know that could help and advise me at this juncture.

Written by Martijn van Exel

October 2, 2009 at 8:48 pm

Posted in research

So this is it!


So this is it. This is going to be my dissertation diary. I’m not going to make any commitments as to how often I will write in it; I just read that I should be spending at least 15 minutes every day on my dissertation. Every day for the next four, five, six years! Intriguing at least.

I’m at the very beginning of the process, and my thoughts are really unfocused at this point. In this first entry, I will not go into the theme itself; there will be ample opportunity for that. I would just, for a moment, like to ponder over the implications. At least four years of my life will be dedicated, at least to some degree, to researching and writing about this theme that has yet to unfold.

As I am writing this, I feel that I want to write, I like to explore my thoughts by putting them in writing, although writing in English makes it even harder for my fingers to keep up with my ever-wandering mind.

The first question that springs to mind as I embark on this diary is: should I publish it? Not the dissertation I mean, but these notes? It seems, on the one hand, pointless and vain. Who would want to read about the nitty-gritty details of my struggle towards acquiring a doctorate? Not many, probably, but there might be a reason or two to do it anyway.

Publishing my thoughts might help me overcome a feeling of awkwardness that I frequently have about this project: who am I to think I can do original, creative research? These isolated thoughts, rough outlines of a theme that I might want to pursue, seem so superficial and gratuitous! If I would just go ahead and publish my thoughts and ideas and processes – that would seem to provide some validity to them. An irrational thought maybe, but it works for me.

Publishing these notes may also invoke some sense of urgency. I know I have a tendency to keep thoughts and ideas to myself for too long, thinking they need to mature before they are ready to be shared with the world. This is an inhibition that will seriously slow me down and that I must learn to set aside. It has already happened and I have not even begun to formalize a proposal!

More than a year ago now, Henk Scholten invited me to come to the Vrije Universiteit to discuss the possibilities of him supervising my dissertation. We had a really nice and productive discussion and I felt both flattered and motivated, and told him I would write some ideas I had down for him to ingest. We would have a follow-up meeting soon.
I explored the idea for a while, discussed implications with a couple of colleagues and friends, thought about interesting themes. I think I even wrote some things down, but I did not feel any of them were good or mature enough to even put forward to Henk.

Although the thought of doing a dissertation was on my mind now and then over the months that followed, I found myself glad to be distracted by other things to occupy my mind and time. And so time passed, and here we are. I feel that I want to do this more strongly now, for reasons I will explain in a future post. So I am going to write. And explore. It will be beautiful. I can be that naive.

Written by Martijn van Exel

October 2, 2009 at 6:43 pm

Posted in research

Municipal Boundaries from OpenStreetMap

with 2 comments

OpenStreetMap is the free world map that anyone can contribute to. The geodata is freely available under a Creative Commons licence. OpenStreetMap (OSM) has long contained more than just streets and has grown into a versatile repository of freely available geodata. It is just not that easy yet to pull out exactly what you need.
OpenStreetMap’s standard export format is its own XML format. It can be processed with all kinds of open source tools, which are available through the OSM wiki at http://wiki.openstreetmap.org or the Subversion repository at http://svn.openstreetmap.org/.
This article illustrates how to extract the current Dutch municipal boundaries from the live OSM database and import them into a PostGIS database.

The database

The starting point for finding specific information in the OSM database is the Map Features wiki page: http://wiki.openstreetmap.org/wiki/Map_Features. This page lists all the ‘tags’ in use for objects in the database. Municipal boundaries fall under the Administrative Boundaries: http://wiki.openstreetmap.org/wiki/Key:boundary. Although the per-country table maintained on that page is not entirely specific about it (it speaks of ‘boundaries for cities like Amsterdam but also smaller like Volendam and Lutjebroek’), municipal boundaries fall under admin_level=8. The same page tells us that the usual way to put administrative boundaries into OSM is by using ‘relations’. (OpenStreetMap has only three kinds of objects: nodes (points), ways (lines) and relations (relationships between groups of the other two types).)
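To make the three object types and the admin_level tag a little more concrete, here is a minimal Python sketch that picks the relations carrying a given tag out of a piece of OSM XML. The sample document is hand-written for illustration (the ids, names and member roles are made up, not taken from a real extract), and relations_with_tag is a hypothetical helper, not part of any OSM tool.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout (illustrative data only).
SAMPLE_OSM = """<osm version="0.5">
  <node id="1" lat="52.37" lon="4.89"/>
  <way id="10">
    <nd ref="1"/>
  </way>
  <relation id="100">
    <member type="way" ref="10" role="outer"/>
    <tag k="boundary" v="administrative"/>
    <tag k="admin_level" v="8"/>
    <tag k="name" v="Amsterdam"/>
  </relation>
  <relation id="101">
    <member type="way" ref="10" role="outer"/>
    <tag k="boundary" v="administrative"/>
    <tag k="admin_level" v="4"/>
    <tag k="name" v="Noord-Holland"/>
  </relation>
</osm>
"""


def relations_with_tag(osm_xml, key, value):
    """Return the relations in osm_xml whose tags contain key=value."""
    root = ET.fromstring(osm_xml)
    matches = []
    for relation in root.findall("relation"):
        # Collect the <tag k="..." v="..."/> children into a plain dict.
        tags = {tag.get("k"): tag.get("v") for tag in relation.findall("tag")}
        if tags.get(key) == value:
            matches.append({"id": relation.get("id"), "tags": tags})
    return matches


if __name__ == "__main__":
    for rel in relations_with_tag(SAMPLE_OSM, "admin_level", "8"):
        print(rel["id"], rel["tags"].get("name"))
```

Note that this only reads the tags; turning a relation into an actual polygon also means resolving its member ways and their nodes, which is what the import tooling takes care of.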

Extraction

We now know that we want all ‘relations’ tagged ‘admin_level=8’. There are several ways to make such an extract from the live database. One is to download a current dump of the area you want (these are available at http://downloads.cloudmade.com/ ) and then make a selection from it with the command-line tool ‘osmosis’ (http://wiki.openstreetmap.org/wiki/Osmosis ). Another is to use the OSM Extended API (OSMXAPI, pronounced OSM-Zappy, see http://wiki.openstreetmap.org/wiki/Osmxapi ). The following URL then returns the municipal boundaries in OSM XML format: www.informationfreeway.org/api/0.5/relation[admin_level=8][bbox=3.35376,50.57484,7.22095,53.51513].
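The XAPI URL above is made up of an element type, a tag filter and a bounding box. A small Python sketch of how such a URL could be put together follows. The function name build_xapi_url is hypothetical, the http:// scheme is my assumption (the URL above is given without one), and the sketch only builds the string; it does not check whether the service is still reachable.

```python
def build_xapi_url(element, key, value, bbox,
                   base="http://www.informationfreeway.org/api/0.5"):
    """Build an XAPI-style request URL.

    element -- 'node', 'way' or 'relation'
    key, value -- the tag to filter on, e.g. 'admin_level' and '8'
    bbox -- (left, bottom, right, top) in degrees, i.e. min lon, min lat,
            max lon, max lat
    base -- assumed endpoint; only the host and path come from the article
    """
    left, bottom, right, top = bbox
    return (f"{base}/{element}[{key}={value}]"
            f"[bbox={left},{bottom},{right},{top}]")


if __name__ == "__main__":
    # Bounding box roughly covering the Netherlands, as used in the article.
    netherlands = (3.35376, 50.57484, 7.22095, 53.51513)
    print(build_xapi_url("relation", "admin_level", 8, netherlands))
```

The square brackets are part of the XAPI predicate syntax, so some HTTP clients and shells will want them quoted or percent-encoded before the request is sent.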

Import

You can import the resulting OSM XML file into a PostGIS database using OSM2PGSQL: http://wiki.openstreetmap.org/wiki/Osm2pgsql.
Assuming you already have a spatial database named ‘postgis’, it goes like this:

> osm2pgsql -H tm-sr -U postgres -W -d postgis gemeentegrenzen_081118.osm

osm2pgsql SVN version 0.55-20081118 $Rev: 10464 $

Password:
Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Mid: Ram, scale=100

Reading in file: gemeentegrenzen_081118.osm
Processing: Node(110k) Way(2k) Relation(0k)
Node stats: total(110573), max(312315964)
Way stats: total(2579), max(28446793)
Relation stats: total(690), max(51805)

Writing way(0k)

As you can see, osm2pgsql creates four tables (if they already exist they are emptied by default, so be careful!).
We will not worry about spatial indexes for now and look at the result:

Postscript

Cloudmade’s site also offers ready-made shapefiles per country. That package includes an administrative shapefile too, but it is not good.

So this somewhat longer route is still the preferred one!
Incidentally, the Dutch OSM contributors (yours truly among them) are also working on adding other official and unofficial subdivisions to the database. Think of COROP regions, districts and neighbourhoods, EGG areas, police regions, postcode areas and built-up area boundaries.


Written by Martijn van Exel

November 18, 2008 at 2:17 pm

Note To Self: The One And Only RD Projection String

with 4 comments

EPSG:28992, or the Dutch double stereographic RD (RijksDriehoekstelsel) projection, is quite often incompletely or just plain badly defined.

My version of MapServer for Windows (2.2.6 from September last year) states

+proj=stere +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs  no_defs <>

This yields the following result when a native 28992 dataset is projected onto a Microsoft Virtual Earth map (EPSG:900913, or EPSG:3785 as it is now called):

Note that the buildings layer on top of the VE aerial photos is shifted to the north, by about 100 metres.

Spatialreference.org has a slightly different take on EPSG:28992:

+proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs

which yields an almost identical result:

These projection strings are both incomplete, because they do not take into account the datum shift that is used in the RD projection and can be approximated using the ‘towgs84’ parameter in PROJ4.

The one and only right PROJ4 projection string is

+proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +towgs84=565.2369,50.0087,465.658,-0.406857330322398,0.350732676542563,-1.8703473836068,4.0812 +no_defs <>
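Since the difference between the three strings comes down to whether a towgs84 parameter is present, a quick check can be scripted. The Python sketch below is only a rough illustration: parse_proj4 and has_datum_shift are hypothetical helpers written for this note, not functions from PROJ.4 or MapServer, and the check only looks at whether towgs84 is there with three or seven values. It says nothing about whether those values are the right ones.

```python
def parse_proj4(definition):
    """Split a PROJ.4 string into a dict of parameter name -> value."""
    params = {}
    for token in definition.split():
        if not token.startswith("+"):
            continue  # skip stray tokens such as a trailing "<>"
        key, _, value = token[1:].partition("=")
        # Flags without a value (like +no_defs) are stored as True.
        params[key] = value if value else True
    return params


def has_datum_shift(definition):
    """True if the string carries a towgs84 shift with 3 or 7 values."""
    towgs84 = parse_proj4(definition).get("towgs84")
    if not isinstance(towgs84, str):
        return False
    return len(towgs84.split(",")) in (3, 7)


# The definitions quoted in this post.
MAPSERVER = ("+proj=stere +lat_0=52.15616055555555 +lon_0=5.38763888888889 "
             "+k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel +units=m "
             "+no_defs")
SPATIALREFERENCE = ("+proj=sterea +lat_0=52.15616055555555 "
                    "+lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 "
                    "+y_0=463000 +ellps=bessel +units=m +no_defs")
COMPLETE = ("+proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 "
            "+k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel +units=m "
            "+towgs84=565.2369,50.0087,465.658,-0.406857330322398,"
            "0.350732676542563,-1.8703473836068,4.0812 +no_defs <>")

if __name__ == "__main__":
    for name, definition in [("MapServer", MAPSERVER),
                             ("spatialreference.org", SPATIALREFERENCE),
                             ("with towgs84", COMPLETE)]:
        print(name, has_datum_shift(definition))
```

A check like this could be run over an epsg file to flag definitions that are likely to show the kind of offset described above.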

Links

  • Explanation of the towgs84 parameter on this page
  • Some discussion about the RD datum shift on the PROJ.4 mailing list
  • A non-technical discourse on datum shift and coordinate systems in Dutch.
  • The Dutch national survey has a website on the RD coordinate system.
  • There is also a very Web 0.5 site on the RD system and NAP (Normaal Amsterdams Peil, the Dutch standard sea water level which can be observed in the Amsterdam City Hall)

Written by Martijn van Exel

May 20, 2008 at 3:29 pm

The 5 minute guide to setting up GeoServer and GeoWebCache on Windows

with 4 comments

I came across yet another tile caching implementation, GeoWebCache, through this article on the Google Open Source blog. It integrates nicely with the GeoServer OGC server, which should make it very easy to set up on a Windows box. So let’s try that.

Written by Martijn van Exel

May 20, 2008 at 2:30 pm

Posted in Uncategorized


The End Of Flickr?


Well, certainly not today, and certainly not soon, but the introduction of georeferenced photos on Google Maps this week will certainly rock the online photo communities’ boat. Sure, there are tons of websites overlaying flickr photos on top of a web map, and most are richer than what Google Maps currently offers. Take for example loc.alize.us, a flickr/Google Maps mashup that has been around for a while. It offers tag filtering, user filtering, and a very nice and clean interface. To top it off, it offers a bookmarklet that integrates georeferencing into flickr.com very nicely. I still use it, although Yahoo Maps, the mapper of choice for Flickr’s mapping needs, has of course had adequate coverage of the Netherlands for some time now.

But still... It’s not directly ON Google Maps, which is – at least in Western Europe at this time – the ubiquitous web map. The general public will rarely discover any layer of the geographic web beyond Google Maps and Google Earth. ‘So, if I want my photos to show up on the web, I need to be on Panoramio.’ – Panoramio being the photo sharing community that has been showing off on Google Earth for as long as I can remember, and from now on on Google Maps as well. Panoramio was acquired by Google in May, 2007.

No, I don’t expect a mass flux of flickr users towards Panoramio. The latter will see a good number of new members though, and if Google remains as picky about which photos to display within Maps – I’m still confused as to where this leaves Picasa; I guess the user base is not large enough – Panoramio might become a force to be reckoned with in the online photo community universe.

Written by Martijn van Exel

May 16, 2008 at 8:19 am