York Open Data

York Open Data is a huge step forward for the transparency and open government agenda in York. It’s an online clearinghouse of big data about York covering everything from council salaries to air pollution to the number of library books being borrowed. We worked with the City of York Council Business Intelligence Hub to design a system based around the CKAN platform – the de facto standard for open data.

The Brief

The Council came to us with a very clear brief: to use CKAN to host data supplied by the Council’s own internal business systems. It also needed a more user-friendly, content-managed wrapper to carry news and events and to ease people into the huge pool of data within.

In the fullness of time, other public-sector organisations may come on board and the system could become the clearinghouse for anyone in York who wants to open their datasets to the public.

It goes without saying that the site needs to be responsive and fully accessible.

The Technology

CKAN is used by local and national governments around the world, including the UK and US national governments. It’s open source, coded in Python, and very extensible. Rather than developing our own ‘home-brew’ fork of the original application, as many local authorities have done, we created an extension to customise the interface and fit CKAN around the main front-end website.

The front end of the site is driven by the more familiar WordPress CMS. Council staff can use WordPress to manage content, including news and event information.

The open data movement is closely associated with the open source movement, so it’s fitting that the system is built around two open-source applications.

Our Approach

We run the two systems on different subdomains. When you visit www.yorkopendata.org, you are viewing the WordPress system; click on “data” and you are through to data.yorkopendata.org, which is powered by CKAN. A list of CKAN groups is displayed in WordPress thanks to a custom plugin which makes requests to the extensive CKAN API. There is no complex sharing of databases or template files, and no difficult-to-manage integration. We simply store JavaScript, CSS and image files with WordPress, and CKAN loads them externally where required.
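To give a flavour of the kind of request the plugin makes, here is a minimal sketch in Python. The real plugin runs inside WordPress, so this is an illustration rather than our production code. It relies on CKAN's standard Action API, where `group_list` returns a JSON object with `success` and `result` keys; the function names are our own inventions for this example.

```python
import json
from typing import List
from urllib.parse import urljoin
from urllib.request import urlopen


def group_list_url(base_url: str) -> str:
    """Build the URL of CKAN's `group_list` action for a given CKAN site."""
    return urljoin(base_url.rstrip("/") + "/", "api/3/action/group_list")


def parse_group_list(body: str) -> List[str]:
    """Extract the group names from a CKAN Action API JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        # CKAN reports failures in the response body under "error".
        raise ValueError(f"CKAN API error: {payload.get('error')}")
    return payload["result"]


def fetch_groups(base_url: str, timeout: float = 10.0) -> List[str]:
    """Fetch the list of group names from a live CKAN site."""
    with urlopen(group_list_url(base_url), timeout=timeout) as response:
        return parse_group_list(response.read().decode("utf-8"))
```

Calling `fetch_groups("https://data.yorkopendata.org")` would request the group list over HTTP. Keeping the parsing separate from the network call means the front end only ever depends on the shape of the JSON, not on anything inside CKAN itself.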

This means our templates are independent and updating the look and feel is simple across both systems.

The design is fully responsive, created by our in-house designer and based around a logo and photography supplied by City of York Council (CoYC).

Making it flow

City of York Council have a large amount of data in different locations, so they set up regular internal transfers to a single server in their system. From there these files are sent to our server to be processed by our bespoke import system. Each data file is sent with a corresponding XML file which gives us metadata for the new resource, such as the title, the dataset that the information belongs to, and whether it should replace existing resources or be added alongside them.

Our import system then validates each file and uploads via the API. This makes data population almost entirely automated, minimising the ongoing manpower costs of getting public data into the public domain.
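The validation step can be sketched roughly as follows. This is an illustration only: the XML element names (`title`, `dataset`, `mode`) and the helper names are assumptions made for this example, not the actual schema agreed with the Council. The two action names, `resource_create` and `resource_update`, are standard CKAN API actions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical values for the "replace or add alongside" flag.
ALLOWED_MODES = {"replace", "add"}


@dataclass
class ResourceMetadata:
    title: str    # display title of the new resource
    dataset: str  # name of the CKAN dataset it belongs to
    mode: str     # "replace" an existing resource or "add" a new one


def parse_metadata(xml_text: str) -> ResourceMetadata:
    """Parse and validate the (assumed) XML sidecar for one data file."""
    root = ET.fromstring(xml_text)
    values = {}
    for tag in ("title", "dataset", "mode"):
        text = (root.findtext(tag) or "").strip()
        if not text:
            raise ValueError(f"metadata is missing <{tag}>")
        values[tag] = text
    if values["mode"] not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {values['mode']!r}")
    return ResourceMetadata(**values)


def choose_action(
    meta: ResourceMetadata, existing_resource_id: Optional[str]
) -> Tuple[str, Dict[str, str]]:
    """Pick the CKAN action and build its basic payload.

    The data file itself would be attached to the same request as an
    upload; that part is omitted here.
    """
    if meta.mode == "replace" and existing_resource_id:
        return "resource_update", {"id": existing_resource_id, "name": meta.title}
    return "resource_create", {"package_id": meta.dataset, "name": meta.title}
```

Rejecting a bad sidecar before anything is sent to CKAN means a malformed transfer never leaves a half-created resource on the public site.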

Specialist Hosting

CKAN is written in Python, requires a PostgreSQL database, and has a caching proxy that runs on NGINX, whereas WordPress is written in PHP, requires a MySQL database, and runs smoothly on Apache.

This required a pretty unusual hosting environment. With help from local high-end hosting company Bytemark (who actually supply the server for the UK national data hub at data.gov.uk) we built it on a new dedicated server. CKAN was installed from source along with all its dependencies.

NGINX handles requests and caching. Requests not served from the cache are reverse proxied to Apache, which runs the Python code via mod_wsgi. Apache is also responsible for serving the WordPress installation, running PHP via FastCGI.
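For readers unfamiliar with mod_wsgi: at the end of that chain, Apache loads a Python file and calls a WSGI callable, which by default must be named `application`. The sketch below is a generic, minimal example of such a callable. It is not CKAN's actual entry point, only an illustration of the interface Apache talks to.

```python
def application(environ, start_response):
    """Minimal WSGI callable of the kind mod_wsgi invokes per request."""
    body = b"Hello from the Python side of the proxy chain\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable.
    start_response("200 OK", headers)
    return [body]
```

NGINX never needs to know any of this. It simply proxies to Apache, and Apache decides whether a request goes to Python via mod_wsgi or to PHP via FastCGI.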

The Results

York Open Data was launched with a presentation by City of York Council Chief Executive Kersten England at the 2015 conference of the Local Area Research and Intelligence Association. That summer, CoYC was one of only five UK local authorities awarded top marks by NESTA, an independent innovation charity, for opening data. They also commissioned some data-driven storytelling.

As of the start of 2017 the system contains about 850 datasets and has been expanded to host environmental data associated with the York City Environmental Observatory programme.

A whole new sector of the IT industry is emerging around mining big datasets. Providing up-to-date information about public services or the public realm is incredibly useful, and new apps and websites are appearing all the time.

www.yorkopendata.org
