Having fun with PostGIS

Some time ago, a friend of mine told me he wanted to get Milan’s subway geo data in order to build an HTML5 representation using Canvas. Since the Italian government started an Open Data movement, each city council has to publish some of its data in an open format. He was able to find the location of each subway station in a nicely formatted CSV file.

I briefly searched on the Internet and instead of finding the same CSV files, I discovered the GIS data sets below:

So I thought it could be fun to use them to play with PostGIS and build a visualization using the Google Maps JavaScript API v3.

Can you guess what my first step was?

I searched Docker Hub for a PostGIS image and found mdillon/postgis!

It will be the application’s database. I’ll create another Docker image that contains my application and is linked to the PostGIS container. The application has a small shell script that uses shp2pgsql to convert the shapefiles provided by the links above into SQL and load them into two spatially enabled tables in the PostGIS container. Finally, a simple Python script extracts the data from the database so it can be used to generate the final HTML file from a Jinja2 template. PostGIS provides a really useful function for this, ST_Centroid: applied to the set of all the subway’s stations, it computes the point on which the map is centered.
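To give an idea of the centroid step, here is a minimal sketch and not the actual script. It assumes the stations were loaded into a table called stations with a geometry column called geom (both names are hypothetical), and that an SRID was set at import time so that ST_Transform can convert the result to the latitude/longitude (EPSG:4326) that Google Maps expects. ST_Collect gathers all the station points into one geometry, and ST_Centroid returns its center.

```python
# Sketch only: the table name "stations" and the column name "geom" are
# assumptions, not taken from the original application.
CENTER_QUERY = """
SELECT ST_Y(c) AS lat, ST_X(c) AS lng
FROM (
    SELECT ST_Transform(ST_Centroid(ST_Collect(geom)), 4326) AS c
    FROM stations
) AS s;
"""


def fetch_map_center(conn):
    """Return the map center as {"lat": ..., "lng": ...}.

    `conn` is any DB-API 2.0 connection to the PostGIS database,
    for example one opened with psycopg2.connect(...).
    """
    cur = conn.cursor()
    try:
        cur.execute(CENTER_QUERY)
        lat, lng = cur.fetchone()
    finally:
        cur.close()
    return {"lat": float(lat), "lng": float(lng)}
```

The resulting dictionary can be passed straight to the Jinja2 template and used as the center option of the map.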

It’s possible to compare the Official Milan Subway Map:

Official Milan Subway Map

with the one generated by the application:

Generated Milan Subway Map

Data quality

The overall quality of the data provided is quite good, even if I can’t understand why there are three segments marked with line number 0 (I don’t think such a line exists) that start from a non-existent station.

Moreover, the stations coming from the data set overlap, or are very close to, the stations that Google itself displays.

My only disappointment is that the data provided covers just the main lines, while the “official” map also includes some minor lines.

Written on April 7, 2015