Detecting Highway Trouble in OpenStreetMap


For the impatient: do you want to get to work solving highway trouble in OpenStreetMap right away? Download the Trouble File here!

Making pretty and useful maps with freely available OpenStreetMap data has never been so easy or so much fun. The website Switch2OSM is an excellent starting point, and with great tools like MapBox’s TileMill at your disposal, experimenting with digital cartography is almost effortless. Design bureau Stamen shows us some beautiful examples of digital cartography based on OpenStreetMap data. Google starting to charge for the use of its maps API provides a compelling push factor for some to start down this road, and the likes of Foursquare and Apple lead the way.

With all eyes on OpenStreetMap as a source of pretty maps, you would almost forget that the usefulness of freely available OpenStreetMap data extends way beyond that. One of the more compelling uses of OpenStreetMap data is routing and navigation, and things have been moving there. Skobbler has succeeded in making a tangible dent in the turn-by-turn navigation market for mobile devices in some countries, offering functionality similar to TomTom’s at a much, much lower price point, using freely available OpenStreetMap data. MapQuest and CloudMade offer routing APIs based on OpenStreetMap. New open source routing software projects OSRM and MoNav show promise with very fast route calculation and a full feature set, and both are built from the ground up to work with OpenStreetMap data.

Routing puts very different, much stricter requirements on the source data than map rendering does. For a pretty map, it does not matter much if roads in the source data do not always connect, or if they lack usage or turn restriction information. For routing, this makes all the difference. Topological errors and missing usage restriction metadata make for incorrect routes. They will direct you to turn left onto a one-way street, or to get off the highway for no apparent reason, even where there is no exit. That may seem funny if you read about it in a British tabloid, but it’s annoying when you’re on a road trip, and totally unacceptable if you depend on routing software for your business. So unless the data is pretty much flawless, we won’t see major providers of routing and navigation products make the switch to OpenStreetMap that some have so eagerly made for their base maps.

It turns out the data is not flawless. A study done at the University of Heidelberg shows that even for Germany, the country with the most prolific OpenStreetMap community by a distance, the data is not on par with commercial road network data when compared on key characteristics for routing. (The study does predict that it will be within a few months.)

Turning to the US, the situation is bound to be much worse. With a much smaller community that is spread pretty thin geographically (and in some regions, almost nonexistent), and the TIGER import as a very challenging starting point, there is no way that any routing based on OpenStreetMap data in the US is going to be anywhere near perfect. Sure, the most obvious routing-related problems with the TIGER data were identified and weeded out in an early effort (led by the aforementioned CloudMade) shortly after the import, but many challenges remain.

In an effort to make OpenStreetMap data more useful for routing in the US, I started to identify some of those challenges. Routing is most severely affected by problems with the primary road network, so I decided to start from there. Using some modest PostGIS magic, I isolated a set of Highway Trouble Points. The Trouble breaks down into four main classes:

Bridge Trouble

This is the case where a road crossing over or under a highway is not tagged as a bridge, and even worse, shares vertices with the highway, as illustrated below. This tricks routing software into thinking there is a turn opportunity there when there is not. This is bad enough if there actually is an exit, like in the example, but it gets really disastrous when there is not.

These cases take some practice to repair. The fix involves either deleting or ungluing the shared nodes, splitting the road that should be a bridge, and tagging the bridge segment with bridge=yes and layer=1.
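
To make the three repair steps concrete, here is a minimal sketch in Python on a toy in-memory model. It is not how an OSM editor does it internally; the way structure (a dict with "nodes" and "tags") and the function name unglue_and_bridge are made up for illustration, and I assume the two ends of the bridge span are different nodes from the shared one.

```python
def unglue_and_bridge(way, shared_node, new_node, span_start, span_end):
    """Return the pieces that replace `way` after the repair.

    way         -- hypothetical model: {"nodes": [node ids], "tags": {...}}
    shared_node -- node id the crossing road wrongly shares with the highway
    new_node    -- fresh node id that only the crossing road will use
    span_start, span_end -- node ids where the bridge segment begins and ends
                            (assumed to appear in this order along the way)
    """
    # Step 1: unglue. The crossing road gets its own node, so it no longer
    # touches the highway at this location.
    nodes = [new_node if n == shared_node else n for n in way["nodes"]]

    # Step 2: split the road at both ends of the bridge span.
    i, j = nodes.index(span_start), nodes.index(span_end)
    tags = way["tags"]
    pieces = []
    if i > 0:
        pieces.append({"nodes": nodes[: i + 1], "tags": dict(tags)})

    # Step 3: tag only the middle piece as a bridge.
    bridge_tags = {**tags, "bridge": "yes", "layer": "1"}
    pieces.append({"nodes": nodes[i : j + 1], "tags": bridge_tags})

    if j < len(nodes) - 1:
        pieces.append({"nodes": nodes[j:], "tags": dict(tags)})
    return pieces
```

The approach roads keep their original tags, and neighboring pieces share the node at which the road was split, so the road itself stays connected.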

Imaginary Exit Trouble

Sometimes, a local road or track will be connected to a highway, tricking routing software into possibly taking a shortcut. Repairing these is simple: unglue the shared node and move the end of the local road to where it actually ends, looking at the aerial imagery.

Service Road Trouble

The separate roadways of a highway are sometimes connected to allow emergency vehicles to make a U-turn. Regular traffic is not allowed to use these connector service ways, but during the TIGER import they were usually tagged as public access roads, again potentially tricking routing software into taking a shortcut. I repair these by tagging them as highway=service and access=official, access=no, emergency=yes.

Rest Area Trouble

This is of secondary importance, as rest areas are usually not connected to the road network except for their on- and off-ramps. Finding these Trouble points was an unexpected by-product of the query I ran on the data. What we have here is rest areas that are not tagged as such, instead just existing as a group of ‘residential’ roads connecting to the highway features, without a motorway_link. While we’re at it, we can clean these up nicely by adding motorway_links at the on- and off-ramps, tagging the other road features as highway=service, adding the necessary oneway=yes and identifying a node as highway=rest_area. It’s usually obvious from the aerial image if there are toilets=yes, too.

I have done test runs of the query on OSM data for Vermont and Missouri. The query is performed on a PostGIS database with the osmosis snapshot schema, optionally with the linestring extension, and goes like this:

DROP TABLE IF EXISTS candidates;
CREATE TABLE candidates AS
    WITH agg_intersections AS
    (
        WITH intersection_nodes_wayrefs AS
        (
            WITH intersection_nodes AS
            (
                SELECT
                    a.id AS node_id,
                    b.way_id,
                    a.geom
                FROM
                    nodes a,
                    way_nodes b
                WHERE
                    a.id = b.node_id AND
                    a.id IN
                    (
                        SELECT
                            node_id
                        FROM 
                            way_nodes
                        GROUP BY 
                            node_id
                        HAVING 
                            COUNT(1) = 2
                    )
            )
            SELECT
                DISTINCT a.node_id AS node_id,
                b.id AS way_id,
                b.tags->'highway' AS osm_highway,
                a.geom AS geom,
                b.tags->'ref' AS osm_ref
            FROM
                intersection_nodes a,
                ways b
            WHERE
                a.way_id = b.id
        )
        SELECT
            node_id,
            array_agg(way_id) AS way_ids,
            array_agg(osm_highway) AS osm_highways,
            array_agg(osm_ref) AS osm_refs
        FROM 
            intersection_nodes_wayrefs
        GROUP BY 
            node_id
    )
    SELECT
        a.*,
        -- COMMENT NEXT LINE OUT IF YOU DON'T HAVE
        -- OR WANT WAY GEOMETRIES
        c.linestring AS way_geom,
        b.geom AS node_geom
    FROM 
        agg_intersections a, 
        nodes b,
        ways c
    WHERE
        (
            'motorway' = ANY(osm_highways)
            AND NOT
            (
                'motorway_link' = ANY(osm_highways)
                OR
                'service' = ANY(osm_highways)
                OR 
                'motorway' = ALL(osm_highways)
                OR 
                'construction' = ANY(osm_highways)
            )
        )    
    AND
        a.node_id = b.id
    AND
        c.id = ANY(a.way_ids);
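
If the nested SQL is hard to follow, the selection logic can be sketched in a few lines of Python on in-memory data. This is only an illustration of the approach, not a replacement for the query: the ways dictionary and the function name find_candidates are hypothetical, and dropping nodes that touch a way without a highway tag is my reading of how the NULL comparisons in the WHERE clause behave.

```python
from collections import defaultdict

# highway values that rule a node out, mirroring the NOT (...) block above
EXCLUDED = {"motorway_link", "service", "construction"}


def find_candidates(ways):
    """ways: hypothetical dict of way_id -> {"highway": str or None, "nodes": [ids]}.

    Returns a dict of node_id -> sorted list of the way ids meeting there.
    """
    # One entry per (node, way) membership, like the rows of way_nodes.
    memberships = defaultdict(list)
    for way_id, way in ways.items():
        for node_id in way["nodes"]:
            memberships[node_id].append(way_id)

    candidates = {}
    for node_id, way_ids in memberships.items():
        if len(way_ids) != 2:  # HAVING COUNT(1) = 2
            continue
        highways = [ways[w]["highway"] for w in way_ids]
        if None in highways:  # assumption: NULLs make the SQL condition fail
            continue
        if "motorway" not in highways:  # 'motorway' = ANY(osm_highways)
            continue
        if EXCLUDED.intersection(highways):  # link, service or construction
            continue
        if all(h == "motorway" for h in highways):  # 'motorway' = ALL(...)
            continue
        candidates[node_id] = sorted(way_ids)
    return candidates
```

In words: a node is a candidate when exactly two way memberships meet there, one of them is a motorway, and the other is something a motorway should not be glued to directly.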

The query took about a minute to run for Vermont and about 5 minutes for Missouri. For Vermont, it yielded 77 points and for Missouri 193 points. You can download these files here, but note that I have already done much of the cleanup work in these states since, as part of my thinking on how to improve the query. It still yields some false positives, notably points where a highway=motorway turns into a highway=trunk or highway=primary, see below.

UPDATE: This query filters out these false positives. It uses the ST_StartPoint and ST_EndPoint PostGIS functions to determine if two line features ‘meet’:

DROP TABLE IF EXISTS candidates_noendpoints;
CREATE TABLE candidates_noendpoints AS

SELECT 
    DISTINCT c.node_id,
    c.node_geom
FROM
    ways a,
    ways b,
    candidates c
WHERE
    ST_Intersects(c.node_geom, a.linestring)
AND
    ST_Intersects(c.node_geom, b.linestring)    
AND NOT
(
    ST_Intersects(c.node_geom, ST_Union(ST_StartPoint(a.linestring), ST_EndPoint(a.linestring)))
    AND
    ST_Intersects(c.node_geom, ST_Union(ST_StartPoint(b.linestring), ST_EndPoint(b.linestring)))
)
;

This query requires the availability of line geometries for the ways, obviously.
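
The same endpoint test can be sketched without any geometry when the ordered node lists of the ways are at hand. Again, this is a rough Python illustration under my own assumptions (the ways and candidates dictionaries and the function name drop_endpoint_meetings are hypothetical), not what PostGIS does internally.

```python
def drop_endpoint_meetings(candidates, ways):
    """Keep only candidates where at least one way passes through the node.

    candidates -- hypothetical dict of node_id -> list of way ids meeting there
    ways       -- hypothetical dict of way_id -> {"nodes": [ordered node ids]}
    """
    kept = {}
    for node_id, way_ids in candidates.items():
        # True when the node is the first or last node of every way involved,
        # i.e. the ways merely continue into each other at this point.
        at_end_of_all = all(
            node_id in (ways[w]["nodes"][0], ways[w]["nodes"][-1])
            for w in way_ids
        )
        if not at_end_of_all:
            kept[node_id] = way_ids
    return kept
```

A motorway that simply turns into a trunk road meets it end to end and gets dropped, while a local road glued to the middle of a motorway stays on the list.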

UPDATE 2: The query as-is made the PostgreSQL server croak because it ran out of memory, so I had to redesign the query to rely much less on in-memory tables. I will provide the updated query to anyone interested. I’m going to leave the original SQL up there: it was meant to convey the approach and it still does. The whole US trouble file is available as an OSM XML file from here.

I plan to make the Highway Trouble files available on a regular basis for all 50 states if there is interest in them. And as always I’m very interested to hear your opinion: any Trouble I am missing? Ways to improve the query? Let me know.

11 thoughts on “Detecting Highway Trouble in OpenStreetMap”

  1. This is definitely something which would be useful. You may also want to try to get in touch with the people running keepright, or the other such services to have your bug results included in their normal assessments.

    • Thanks, a great suggestion. Do you know who runs keepright?
      I am preparing the data to run this query on the entire US now. I will make the .OSM file available when PostGIS is done crunching.

  4. To Frédéric: Thanks for reminding me of osmose, that is an excellent QA tool that had slipped under my radar. I want to have a look to see how useful it could be for the US – your current geographical scope is France, right?

  5. This is exactly what I was looking for: a listing of trouble spots on the motorways! The other tools have a very broad focus.

    I have been working thru the list and continue to fix issues. Please re-run your query in a month (or so) to see how much progress has been made. The other issue that makes routing painful is motorway_links with flipped or missing one way tags. I’ve been looking at the interchanges as I go. Thanks again for the excellent tool!

    • And thanks for helping to fix the issues! I noticed you’ve been doing a lot of it. I will re-run the analysis at some point to see what progress has been made.
