Yesterday we discussed making OpenStreetMap data available as a data source for the Tripod project in which we @ Geodan S&R take part. For that to work, we need the data available as an OGC WFS service. This is a nice opportunity to explore the tool chain needed. In short:
- Getting the OpenStreetMap data
- Setting up a PostGIS database
- Getting the OpenStreetMap tools
- Importing the OpenStreetMap data into PostGIS
- Setting up deegree
That’s it. Let’s explore the different steps a little further.
Note that I used Ubuntu 7.10 (in a VMWare box on my MacBook Pro) to do all of this. It could perhaps be done on Windows and Mac OSX, but I really don’t know about the import part. The osm2pgsql tool is not ported to Windows and I have no idea whether it can be made to compile in a hostile environment. OSX should be doable but I didn’t investigate.
Getting the OpenStreetMap data
OpenStreetMap makes daily database dumps available in a simple XML format. The resulting file is called a ‘planet’ and is generated daily. The whole world is now 60GB in size, still 3.6GB with bzip2 compression. So you might want to consider getting one of the various extracts available from mirrors all over the world. For our needs, the Dutch extract is fine. This file is currently 90MB compressed, aboud 900MB uncompressed. Get the latest Dutch planet here.
We don’t need to uncompress the file, the import tool handles compressed files as well.
Setting up a PostGIS database
This goes beyond the scope of this article. There is an excellent install guide here. I had a PostgreSQL 8.2 / PostGIS 1.1.6 setup already available on a Windows box. There’s some specific steps to set up the database environment for the OpenStreetMap database. These are described well on the OpenStreetMap wiki. Some things I encountered on the way:
- PostgreSQL is set up by default to accept only connections from localhost. You need to edit two config files in order to get access (either using psql or PgAdminIII) from another machine:
- postgresql.conf –> set listen_addresses
- pg_hba.conf –> create an entry to allow connections from a subnet.
- As soon as you have that set up, all PostgreSQL client tools work form a remote machine as long as it’s in the designated subnet.
If you already have a fairly recent postgres / postgis setup on linux, you can just create a new geodatabase from the shell using
createdb -T template_postgis osm
where osm is the name of the database to be created.
Getting the OpenStreetMap tools
OpenStreetMap keeps a subversion repository with all client and server software. This contains source for editors like JOSM and Potlatch, but also the stuff we need: the export and import tools. Specifically we need osm2pgsql to import a planet file into PostGIS. This tool is in some package sources, like lenny (testing) for debian, but this is no use for us. This is an older version that can only connect to a database on localhost.
So we checkout the latest version from the repo:
mvexel@mbp-ubuntu:~/osm$ svn co http://svn.openstreetmap.org/applications/utils/export
We need to satisfy some dependencies as well. From the osm2pgsql README:
- libxml2 http://xmlsoft.org/ - geos http://geos.refractions.net/ - proj http://www.remotesensing.org/proj/ - bzip2 http://www.bzip.org/ - zlib http://www.zlib.net/ - PostgreSQL http://www.postgresql.org/ - PostGIS http://postgis.refractions.net/
So make sure you have those:
mvexel@mbp-ubuntu:~$ sudo apt-get install libxml2-dev libgeos-dev proj libbz2-dev bzip2 zlib1g-dev postgresql-8.2-postgis libpq-dev
Now change into the export/osm2pgsql directory and make the executable.
It will be in the same directory as osm2pgsql.
Importing OpenStreetMap data into PostGIS
Now this is the fun part. It is quite well described on the OpenStreetMap wiki.
Basically this is what it boils down to:
mvexel@mbp-ubuntu:~$ osm/export/osm2pgsql/osm2pgsql -H 192.168.26.83 -U osm -l -W osm/planet/planet-nl-latest.osm osm2pgsql SVN version 0.52-20080307 $Rev: 6920 $Password: Using projection SRS 4326 (Latlong) Setting up table: planet_osm_point Setting up table: planet_osm_line Setting up table: planet_osm_polygon Setting up table: planet_osm_roadsReading in file: osm/planet/planet-nl-latest.osmProcessing: Node(4433k) Way(1216k) Relation(1k) Writing rel(0k) Writing way(44k)Node stats: total(4433509), max(250236805)Way stats: total(1216105), max(23158086) Relation stats: total(1348), max(6364)
The -H option sets the remote PostgreSQL database host. This is why we need the svn repo version, it was implemented recently. Before, it just assumed you had a database ‘gis’ on localhost port 5432.
The -U option sets the database user
The -l sets the coordinate system to EPSG:4326 or Lat/Long WGS84. Other options are ‘OSM’ Mercator or proper spherical Mercator.
The -W option forces a password prompt for the database user.
The argument at the end specifies the source planet file to import. There’s no need to decompress a bz2-compressed planet, osm2pgsql will handle that on the fly.
If osm2pgsql croaks complaining about a missing ‘default.style’ file, make sure you launch osm2pgsql from its own directory and check if the file made it from the repo to your local source tree.
You need lots of RAM to do this. 512MB surely won’t be enough.
The process took a little less than an hour on my machine, ymmv. The -v option gives more verbose output, you might want to set that switch to see what’s going on.
I wanted to load the data in a GIS to check visually that the import had succeeded. Quantum GIS complained about the lack of a primary key. Strangely, osm2pgsql doesn’t add one by default and it does not create a OID column either, so we need to do some postprocessing of the osm2pgsql created tables. There are four:
For each of them, we’re going to add a serial column and define that as the primary key:
osm=# ALTER TABLE planet_osm_point ADD COLUMN pid serial;
osm=# ALTER TABLE planet_osm_point ADD CONSTRAINT pid_pkey PRIMARY KEY (pid); osm=# ALTER TABLE planet_osm_line ADD COLUMN pid serial; osm=# ALTER TABLE planet_osm_line ADD CONSTRAINT pid_pkey PRIMARY KEY (pid); osm=# ALTER TABLE planet_osm_polygon ADD COLUMN pid serial; osm=# ALTER TABLE planet_osm_polygon ADD CONSTRAINT pid_pkey PRIMARY KEY (pid); osm=# ALTER TABLE planet_osm_roads ADD COLUMN pid serial; osm=# ALTER TABLE planet_osm_roads ADD CONSTRAINT pid_pkey PRIMARY KEY (pid);
So now the data is ready for consumption. Let’s see some results!
This is the Netherlands in OSM data in uDig
It consists of
This is what is looks like when zoomed in to Amsterdam:
Setting up deegree WFS
My colleague Steven did this, may be he’ll blog about it some day. Bug him about it. It took him less than half an hour, so it can’t be difficult.