CKAN and WordPress integration

We’ve just launched York Open Data, built on two open source platforms: CKAN and WordPress. At first glance, integrating CKAN with WordPress seemed like a complex task, but it needn’t be.

Our content management system of choice is WordPress. Armed with a portfolio of our own bespoke plugins we can achieve almost anything a client might need for managing their content. WordPress is mature, stable, well maintained and provides the features to quickly create blogs, news articles, pages, galleries and more. We used WordPress to serve our content.

CKAN was the data portal platform we were asked to use for publishing, managing and sharing data. Also open source, CKAN has become the de facto standard for publishing large amounts of data, and it has been adopted worldwide for open data platforms.

First glance

At first glance, integrating the two systems seemed complex. After all, CKAN is written in Python and requires a PostgreSQL database, whereas WordPress is written in PHP and requires a MySQL database. CKAN does have some plugins available to help integrate the two systems. However, such plugins only allow access to some of WordPress’ features, and we wanted complete control and flexibility.

CKAN recommends a loose integration between itself and content management systems, and that’s the approach we took. We needed the two systems to work seamlessly side by side and look like one single application. Since CKAN provides an amazing API, we can display any of its data directly in WordPress with a few API requests. Getting at WordPress data from CKAN could be achieved with a custom plugin, but that wasn’t required in this project. CKAN does its job very well and we had no need to pull in any data from WordPress.

The setup

We run the two systems on different subdomains. When you visit www.yorkopendata.org, you are viewing the WordPress system; click on “data” and you are taken through to data.yorkopendata.org, which is powered by CKAN. There is no complex sharing of databases or template files, and no difficult-to-manage integration. We simply store JavaScript, CSS and image files with WordPress, and CKAN loads them externally where required. This means our templates are independent, and updating the look and feel is simple across both systems.

A list of CKAN groups is displayed in WordPress with ease, thanks to a custom plugin which makes requests to the CKAN API. That’s as far as it goes with this project but with API access to datasets and resources, more elaborate needs can be easily met using the same plugin. When it comes to displaying WordPress content in CKAN, a CKAN extension can be written to directly access the WordPress database. This would give complete flexibility to access core WordPress data, along with any information stored by plugins.
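The group list described above comes down to one request against CKAN's Action API. Our plugin lives in WordPress, so what follows is not its code; it is a minimal Python sketch of the same request, useful for seeing what the API returns. The helper names (action_url, parse_result, list_groups) are made up for illustration, and the https scheme on the base URL is an assumption.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Hostname taken from the article; the https scheme is an assumption.
CKAN_BASE_URL = "https://data.yorkopendata.org/"


def action_url(base_url, action, **params):
    """Build a CKAN Action API (v3) URL, e.g. .../api/3/action/group_list."""
    url = urljoin(base_url, f"api/3/action/{action}")
    if params:
        url += "?" + urlencode(params)
    return url


def parse_result(body):
    """Return the 'result' field of a CKAN API response, or raise on failure."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API error: {payload.get('error')}")
    return payload["result"]


def list_groups(base_url=CKAN_BASE_URL, timeout=10):
    """Fetch every group with its display fields (makes a network request)."""
    url = action_url(base_url, "group_list", all_fields="true")
    with urlopen(url, timeout=timeout) as response:
        return parse_result(response.read().decode("utf-8"))
```

Calling list_groups() should return one dictionary per group, which is everything a template needs to render a list of links. Swapping "group_list" for another action, such as "package_search", reaches datasets and resources in the same way.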

York City Council have a large amount of data in different locations, so they set up regular transfers to a single server. From there, the files are sent to our server to be processed by our bespoke import system. Each data file arrives with a corresponding XML file, which gives us metadata for the new resource, such as the title, and tells us which dataset the information belongs to and whether it should replace existing resources or be added alongside them. Our import system then validates each file and uploads it via the API. This makes data population almost entirely automated.
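To make the import step more concrete, here is a rough Python sketch of the decision it has to make for each incoming file. It is not our import system: the XML element names (title, dataset, mode), the ImportJob class and the rule of matching an existing resource by its name are all assumptions made for illustration. The only CKAN specifics used are the resource_create and resource_update actions and the resource list returned by package_show.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class ImportJob:
    title: str      # display name for the new resource
    dataset: str    # name or id of the CKAN dataset it belongs to
    replace: bool   # True: overwrite a matching resource, False: add alongside


def parse_metadata(xml_text):
    """Read the sidecar XML file (hypothetical schema) and validate it."""
    root = ET.fromstring(xml_text)
    title = (root.findtext("title") or "").strip()
    dataset = (root.findtext("dataset") or "").strip()
    mode = (root.findtext("mode") or "add").strip().lower()
    if not title or not dataset:
        raise ValueError("metadata must include a title and a dataset")
    if mode not in ("add", "replace"):
        raise ValueError(f"unknown mode: {mode!r}")
    return ImportJob(title, dataset, mode == "replace")


def plan_api_call(job, existing_resources):
    """Pick the CKAN action and its fields for one incoming file.

    existing_resources is a list of dicts with 'id' and 'name', shaped like
    the 'resources' list in a package_show result.
    """
    if job.replace:
        for resource in existing_resources:
            # Assumption: a resource is "the same" if its name matches.
            if resource.get("name") == job.title:
                return "resource_update", {"id": resource["id"], "name": job.title}
    # Nothing to replace (or mode is "add"): create a new resource.
    return "resource_create", {"package_id": job.dataset, "name": job.title}
```

The chosen action and fields would then be posted to the API together with the data file and an API key. Keeping the planning separate from the upload means the replace-or-add logic can be checked without touching a live CKAN instance.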

Installation

CKAN was installed from source along with all its dependencies. nginx handles incoming requests and caching; requests not served from the cache are reverse proxied to Apache, which runs the CKAN code via mod_wsgi. Apache is also responsible for serving the WordPress installation.