Building Better Tools for Transit Data Management

By Matthew Wigginton Conway 23 Jan 2015
GTFS Data Manager showing validation results for a feed

One of the projects we’ve been working on here at Conveyal over the past few months is improving OpenTripPlanner (OTP) support for large geographic regions with multiple transit agencies. However, large OTP deployments also present a significant data management challenge.

We needed a data workflow that could integrate feeds from many sources and combine existing public feeds with feeds that are maintained internally.

We also needed to track feed statistics (e.g. number of stops, number of routes, and number of data warnings) over time. Finally, we needed to be able to group feeds together, ensure that they are valid and not close to expiration, and deploy collections of them to OpenTripPlanner in production or test environments.
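
To make the statistics above concrete, here is a minimal sketch of how stop counts, route counts, and an expiration date could be derived from a GTFS zip file. It is an illustration only, not the code used in GTFS Data Manager. The function names (`read_table`, `feed_stats`, `expires_soon`) and the 30-day warning window are assumptions made for this example; the file and column names come from the GTFS specification.

```python
import csv
import io
import zipfile
from datetime import datetime, timedelta


def read_table(feed, name):
    """Return the rows of one GTFS table as dicts, or [] if the file is absent."""
    if name not in feed.namelist():
        return []
    with feed.open(name) as raw:
        # utf-8-sig drops the byte-order mark that some exporters prepend
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def feed_stats(source):
    """Summarize a GTFS zip given as a path or a binary file-like object."""
    with zipfile.ZipFile(source) as feed:
        stops = read_table(feed, "stops.txt")
        routes = read_table(feed, "routes.txt")
        calendar = read_table(feed, "calendar.txt")
        calendar_dates = read_table(feed, "calendar_dates.txt")

    # Last day of service: the latest calendar end_date, or the latest
    # date on which service is added (exception_type 1), whichever is later.
    days = [row["end_date"] for row in calendar]
    days += [row["date"] for row in calendar_dates
             if row.get("exception_type") == "1"]
    expires = None
    if days:
        expires = max(datetime.strptime(day, "%Y%m%d").date() for day in days)

    return {"stops": len(stops), "routes": len(routes), "expires": expires}


def expires_soon(stats, today, warn_days=30):
    """True if the feed has no service dates or ends within warn_days of today."""
    if stats["expires"] is None:
        return True
    return stats["expires"] - today <= timedelta(days=warn_days)
```

Calling `feed_stats("feed.zip")` on each new version of a feed and storing the result would give a simple history of these numbers over time. Validation warnings are left out here because they depend on the validator in use.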

We ended up using a combination of two tools to accomplish this task. For managing and cataloging GTFS data, we built GTFS Data Manager. The tool displays a table of the most recent version of every GTFS feed, with basic statistics for each one: the number of routes and stops, when the feed expires, and how many validation warnings were encountered when importing it. By clicking on a feed, you can see details on what the warnings were and can deploy the feed to a test environment running OpenTripPlanner, allowing you to see how the feed will perform once deployed to the production journey planner. Feeds can be uploaded manually or fetched directly from a URL.

GTFS Editor configuring stop times

For GTFS editing, we improved the existing Conveyal GTFS Editor, which had already been used to build GTFS for the Atlanta region. We made a number of improvements, the most notable being a timetable editor that allows you to edit transit schedules in the browser. This is basically a spreadsheet application that is optimized for transit schedules. For example, you can duplicate a trip and offset it in time (allowing rapid creation of trips that have the same run time) or offset a block of times. We integrated the GTFS Editor with the GTFS Data Manager, so that you can simply click on ‘edit’ on a feed to be taken to the editor, ready to make any necessary modifications, without the need for an additional login. Once you have made your edits, you can return to GTFS Data Manager to see the results and deploy them to OpenTripPlanner so that trips can be planned.
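
The "duplicate a trip and offset it in time" operation described above can be sketched in a few lines. This is a hypothetical illustration, not the GTFS Editor's implementation: the helper names (`parse_time`, `format_time`, `duplicate_trip`) are made up for this example, and rows are assumed to be dicts keyed by the `stop_times.txt` column names. GTFS times are written as HH:MM:SS and may run past 24:00:00 for trips that continue after midnight, so the sketch works in seconds since the start of the service day rather than with clock times.

```python
def parse_time(text):
    """Convert a GTFS HH:MM:SS string to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_time(total_seconds):
    """Convert seconds back to HH:MM:SS; hours may exceed 23, as GTFS allows."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def duplicate_trip(stop_times, new_trip_id, offset_minutes):
    """Copy one trip's stop_times rows under a new trip_id, shifted in time.

    Run time between stops is preserved because every time moves by the
    same amount. The input rows are not modified.
    """
    shifted_rows = []
    for row in stop_times:
        copy = dict(row, trip_id=new_trip_id)
        for field in ("arrival_time", "departure_time"):
            if copy.get(field):  # stops without a time stay blank
                shifted = parse_time(copy[field]) + offset_minutes * 60
                if shifted < 0:
                    raise ValueError("offset moves the trip before 00:00:00")
                copy[field] = format_time(shifted)
        shifted_rows.append(copy)
    return shifted_rows
```

For example, duplicating a trip that departs its first stop at 23:51:00 with an offset of 15 minutes yields a copy departing at 24:06:00. A complete editor would also add a matching row to `trips.txt` for the new `trip_id`, which this sketch leaves out.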

This closes the loop on GTFS maintenance. If an agency finds a problem in their feed, they can edit the feed, ensure it passes validation, and test it in the trip planner, without having to wait for staff to manually deploy data to servers. Users editing GTFS can now iterate very quickly by editing the feed, seeing it in the trip planner, and returning to the editor to make additional changes. GTFS management and maintenance have traditionally been handled with multiple tools; now a single user can quickly create and test GTFS data.

As always, all of the tools we built are open source.